February 2001 Issue: 18
Journal of Conceptual Modeling
www.inconcept.com/jcm
A New Automated Heuristics For Mapping NIAM ISDs Into Fifth Normal Form Tables
by Toufik Taibi
Abstract
This paper provides a detailed description of the notational and semantic aspects of NIAM's Information Structure Diagram (ISD). The ISD is the most natural way of modeling a Universe of Discourse (UoD), as it uses a carefully chosen subset of the language people use in their day-to-day interaction. Moreover, a tool (NIAM Mapper) has been developed that allows users to draw ISDs, performs rigorous syntactic and semantic checking on them, and maps them into fifth normal form tables based on a new mapping heuristics.
1. Introduction
The main purpose of NIAM is to create, through a conceptual structure abstracted from the real-world system, a usable Information System (IS). This is done by progressing through several successive phases, with a step-by-step increase in detail. The analysis deliverable is the Information Structure Diagram (ISD), which is detailed enough to be directly mapped into an optimal database structure in which all tables are in fifth normal form. NIAM was originally developed at the Control Data Holland company by G.M. Nijssen in 1974. It was designed with the importance of communication for actual IS development in mind.
Today, the method is widely known and used worldwide as Object Role Modeling (ORM) [1]. Because of its level of detail, the ISD helps to specify the information in an exact manner that avoids any ambiguity. NIAM is predominantly meant to be used in the analysis and development phases, as it was originally developed solely as an information analysis method. However, the method has been expanded both backward (to the Universe of Discourse analysis phase) and forward (to the design and construction phase).
Thus it is clear that NIAM, in its broadest sense, is a more or less complete systems development method which starts with a Universe Of Discourse (UoD) and ends with an implemented IS.
This paper first provides background information about NIAM ISDs and then introduces NIAM Mapper, a tool used to draw ISDs, check their syntax and semantics according to NIAM's rules, and map them into an optimal database structure based on a newly developed heuristics.
2. System Development Lifecycle
The paradigm behind NIAM is that the UoD can be seen as an abstract high level function representing a set of cooperating and communicating activities performed by people. Communication in an organization can be realized by using a carefully chosen part of the natural language that people use in their day to day interaction. NIAM tries to accomplish an improvement in systems development by using this natural language as a basis for analyzing every aspect of an IS.
NIAM mainly consists of three separate development phases. The first phase is the analysis of the UoD, which yields the decomposition of the UoD into a set of activities, which in turn will be decomposed into a hierarchy of smaller functions.
The second phase is the analysis of the functions. Every activity is depicted in a separate Information Flow Diagram (IFD), which specifies every information stream that enters or exits the lower-level functions.
In the third phase, every single information flow is analyzed further to reveal its underlying structure. This is done by using a set of concepts: fact types, non-lexical object types, lexical object types, subtypes, and constraints. The flows are the result of functions trying to communicate, and are seen by NIAM as consisting of natural language elementary sentences called fact types. They are the shortest possible natural language description of the information flow, and contain information on objects in the UoD and possible relations between them.
This phase takes care of the design of the structure of the information system (which, necessarily, will support the information flows). The end result is a systems specification that is detailed enough to be directly mapped into an optimal database structure.
3. Information Structure Diagrams
Most of the notation used here was taken from [2] with some slight changes. In an ISD, the Non-Lexical Object Types (NOLOTs) of the IS are represented by a circle, while Lexical Object Types (LOTs) are represented by a dotted circle. NOLOTs represent entity types (for example, person), while LOTs represent their attributes (for example, street).
Figure 1 Example Of Two Fact Types
When there is a relationship between a NOLOT and a LOT or another NOLOT, it has to be part of an elementary fact type, that is, one that cannot be reduced to smaller statements without losing the facts it contains. Two fact types (collections of similar facts) are shown in Figure 1, with the boxes denoting a fact type and carrying the role names. In the first example of Figure 1, the left box contains the name of the role from person to rental, while the right box contains the opposite role.
A fact type that only concerns NOLOTs (like the first example in Figure 1) is called an idea type. A fact type that concerns one NOLOT and one LOT (like the second example in Figure 1) is called a bridge type. Fact types that only connect LOTs do not exist. Concrete occurrences of a fact type, object type or attribute are called instances.
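The classification above can be sketched in a few lines of C++. This is only an illustration and not code from NIAM Mapper; the enum and function names are assumptions made for the example.

```cpp
#include <string>

// Kind of an object type in an ISD (hypothetical names).
enum class ObjectKind { NOLOT, LOT };

// Classify a binary fact type by the kinds of its two object types.
// NOLOT-NOLOT is an idea type, NOLOT-LOT is a bridge type, and
// LOT-LOT is reported as invalid because such fact types do not exist.
std::string classify_fact_type(ObjectKind left, ObjectKind right) {
    if (left == ObjectKind::NOLOT && right == ObjectKind::NOLOT) {
        return "idea type";
    }
    if (left == ObjectKind::LOT && right == ObjectKind::LOT) {
        return "invalid";
    }
    return "bridge type";
}
```

For instance, a fact type between person (NOLOT) and street (LOT) would be classified as a bridge type, whichever side the LOT is on.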
Generally, fact types are binary. Ternary fact types are to be avoided as they can easily be modeled more effectively as a collection of binary fact types.
A possible extension of the normal binary fact types is the modeling of subtypes. Subtypes are used to model more specific instances of a general object type, where the more specific subtype can be seen as a refinement of the supertype that can have additional properties and roles. Constraints are used to exclude wasteful or impossible combinations of facts by enforcing certain values or combinations.
3.1 Constraints
3.1.1. Internal Uniqueness
Internal uniqueness (denoted by placing a line above a role name) indicates that a given instance of the object type concerned can appear only once in that role. Figure 2 shows four kinds of internal uniqueness. The first example shows that customer and number are both unique. The second example shows that customer is unique, but lastname is not. The third example shows that car registration is unique, but customer is not. The last example shows that customers can have more than one firstname and firstnames can be the same for different customers. However, the combination of a customer and a firstname is unique.

Figure 2 Internal Uniqueness
3.1.2. Internal Completeness
Internal completeness is used to denote that for a certain role, the population contains at least one instance of the related object type. This is usually shown in a diagram by a black dot on the connecting line between object type and role as shown in Figure 3.
Figure 3 Internal Completeness
The cardinality constraint can be used to denote both the lower and upper bound of the number of times a given instance can enter in a certain role. The internal uniqueness and completeness constraints are actually just special notational versions of the cardinality constraint. Cardinality can be displayed graphically, by showing the upper and lower bounds above the role.
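To make the relationship between the three constraints concrete, the following sketch checks a cardinality constraint on the left role of a populated binary fact type and expresses uniqueness and completeness as special cases of it. It is a minimal illustration under stated assumptions: the type alias `Fact` and all function names are hypothetical, and a population is simply modeled as a list of (left instance, right instance) pairs.

```cpp
#include <limits>
#include <map>
#include <string>
#include <utility>
#include <vector>

// One fact of a binary fact type: (left instance, right instance).
using Fact = std::pair<std::string, std::string>;

// Check that every instance of the object type plays the left role at least
// `lower` and at most `upper` times. Instances absent from the role count as 0.
bool satisfies_cardinality(const std::vector<std::string>& instances,
                           const std::vector<Fact>& population,
                           int lower, int upper) {
    std::map<std::string, int> counts;
    for (const Fact& f : population) {
        ++counts[f.first];
    }
    for (const std::string& inst : instances) {
        const auto it = counts.find(inst);
        const int n = (it == counts.end()) ? 0 : it->second;
        if (n < lower || n > upper) {
            return false;
        }
    }
    return true;
}

// Internal uniqueness on the role: an upper bound of 1.
bool is_unique(const std::vector<std::string>& instances,
               const std::vector<Fact>& population) {
    return satisfies_cardinality(instances, population, 0, 1);
}

// Internal completeness on the role: a lower bound of 1.
bool is_complete(const std::vector<std::string>& instances,
                 const std::vector<Fact>& population) {
    return satisfies_cardinality(instances, population, 1,
                                 std::numeric_limits<int>::max());
}
```

With two customers and one lastname fact for each, both `is_unique` and `is_complete` hold; if one customer has two facts and the other none, both fail, while a cardinality of 0 to 2 is still satisfied.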
3.1.3. External Uniqueness
External uniqueness expresses that combinations of different object types can be unique in relation to another, common, object type.

Figure 4 External Uniqueness
In Figure 4, the encircled "U" means that the combination of longitude and latitude completely defines a city, although any one latitude or longitude can be in use for different cities.
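The city example can be checked mechanically. The sketch below is an assumption-laden simplification: it folds the two fact types (city has longitude, city has latitude) into one hypothetical record `CityFact` and verifies that no combination of longitude and latitude is shared by two different cities.

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical record combining the two fact types of Figure 4.
struct CityFact {
    std::string city;
    double longitude;
    double latitude;
};

// External uniqueness holds if each (longitude, latitude) combination
// refers to at most one city. A single longitude or latitude may repeat.
bool externally_unique(const std::vector<CityFact>& facts) {
    std::map<std::pair<double, double>, std::string> seen;
    for (const CityFact& f : facts) {
        const auto key = std::make_pair(f.longitude, f.latitude);
        const auto result = seen.emplace(key, f.city);
        // emplace fails if the combination was already recorded;
        // that is only a violation when it belongs to another city.
        if (!result.second && result.first->second != f.city) {
            return false;
        }
    }
    return true;
}
```

Two cities sharing a longitude but not a latitude satisfy the constraint; two cities with identical coordinates violate it.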
3.1.4. External Completeness
In Figure 5, an encircled "T" indicates that two roles together define a complete population. "T" means totality, a synonym of completeness. Employees are either working on a project, or at a department, or both. The constraint can only be used on roles of the same object type, as opposed to external uniqueness, which applies to roles of different object types.

Figure 5 External Completeness
3.1.5. Equality
The equality constraint is used to show that the instances of object types in two or more connected roles are equal. Of course, the roles must be based on the same object type; otherwise their populations could never be seen as equal. In Figure 6, the encircled "=" indicates that in a University, every student practicing a sport is part of a team.


Figure 6 Equality
3.1.6. Subsets
The subset constraint indicates that the population of a role is a subset of the population of another role. This is depicted by an arrow pointing from the subset to the superset. Both populations need to belong to the same object type. In Figure 7, the employees working on a project are a subset of the employees working in a department.

Figure 7 Subset
3.1.7. Exclusions
Exclusion can be used when a certain object type takes part in fact types with more than one other object type, and each instance of the first can only take part in one of the two roles. The constraint is depicted as an encircled "X". Figure 8 shows that an employee cannot simultaneously work on a project and be assigned to a department.

Figure 8 Exclusion
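The external completeness, equality, subset and exclusion constraints all compare the populations of roles played by the same object type, so they reduce to ordinary set operations. The following sketch shows one way to express them; the alias `Population` and the function names are assumptions made for illustration, and each role population is modeled as the set of instances playing that role.

```cpp
#include <algorithm>
#include <iterator>
#include <set>
#include <string>
#include <vector>

// The instances of one object type that play a given role.
using Population = std::set<std::string>;

// Equality: both roles are played by exactly the same instances.
bool equality_holds(const Population& a, const Population& b) {
    return a == b;
}

// Subset: every instance in `sub` also appears in `superset`.
bool subset_holds(const Population& sub, const Population& superset) {
    return std::includes(superset.begin(), superset.end(),
                         sub.begin(), sub.end());
}

// Exclusion: no instance plays both roles.
bool exclusion_holds(const Population& a, const Population& b) {
    std::vector<std::string> common;
    std::set_intersection(a.begin(), a.end(), b.begin(), b.end(),
                          std::back_inserter(common));
    return common.empty();
}

// External completeness (totality): every instance of the object type
// plays at least one of the two roles.
bool totality_holds(const Population& all,
                    const Population& a, const Population& b) {
    Population either(a);
    either.insert(b.begin(), b.end());
    return subset_holds(all, either);
}
```

In the employee examples, the subset constraint of Figure 7 corresponds to `subset_holds(on_project, in_department)` and the exclusion constraint of Figure 8 to `exclusion_holds(on_project, in_department)`, where both arguments are hypothetical role populations.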
3.1.8. Constraints Between Fact Types
Exclusion, equality and subset constraints can also be applied between fact types. These constraints will have the same definitions mentioned above except that now they are applied on pairs of instances instead of instances in a given population. In Figure 9, "?" could be "X", "=" or an arrow.

Figure 9 Constraint Between Fact Types
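The difference between a constraint between roles and a constraint between fact types is that the latter compares pairs of instances. A minimal sketch, with hypothetical names, models each fact type population as a set of (left, right) pairs over the same two object types:

```cpp
#include <algorithm>
#include <set>
#include <string>
#include <utility>

// Population of a binary fact type as a set of (left, right) pairs.
using PairPopulation = std::set<std::pair<std::string, std::string>>;

// Equality between fact types: both contain exactly the same pairs.
bool pair_equality_holds(const PairPopulation& a, const PairPopulation& b) {
    return a == b;
}

// Subset between fact types: every pair of `sub` is also in `superset`.
bool pair_subset_holds(const PairPopulation& sub,
                       const PairPopulation& superset) {
    return std::includes(superset.begin(), superset.end(),
                         sub.begin(), sub.end());
}

// Exclusion between fact types: no pair occurs in both populations.
bool pair_exclusion_holds(const PairPopulation& a, const PairPopulation& b) {
    return std::none_of(a.begin(), a.end(),
                        [&b](const PairPopulation::value_type& p) {
                            return b.count(p) > 0;
                        });
}
```

Note that exclusion between fact types can hold even when the same instance appears in both populations, as long as it is paired with different partners.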
3.1.9. Identification Schemes
Whenever a bridge type is one-to-one (for example, each student has one number and each number is assigned to only one student) and the internal completeness constraint states that each student has a number, the number can be used as a unique reference (primary key). If such a LOT is also the preferred identifier for the NOLOT concerned, it can be displayed graphically in one of two ways, as shown in Figure 10. This is the first of the identification schemes.

Figure 10 Two Ways Of Depicting The First Identification Scheme
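The conditions of the first identification scheme can be written as a small predicate. The sketch below is an illustration only: the struct `BridgeType` is hypothetical, and it assumes the role codes that the drawing module uses for constraints ('1' for the left role, '2' for the right role, '3' for both roles), with any other value meaning that the constraint does not apply to the role in question.

```cpp
// Hypothetical, simplified record of a binary fact type.
struct BridgeType {
    char left_kind;              // 'n' for NOLOT, 'l' for LOT
    char right_kind;             // 'n' for NOLOT, 'l' for LOT
    char internal_uniqueness;    // '1' left, '2' right, '3' both roles
    char internal_completeness;  // '1' left, '2' right, '3' both roles
};

// True if the LOT of this bridge type can serve as a simple identifier of
// its NOLOT: the fact type is one-to-one and complete on the NOLOT side.
bool is_simple_identifier(const BridgeType& f) {
    const bool nolot_left = f.left_kind == 'n' && f.right_kind == 'l';
    const bool nolot_right = f.left_kind == 'l' && f.right_kind == 'n';
    if (!nolot_left && !nolot_right) {
        return false;  // not a bridge type
    }
    if (f.internal_uniqueness != '3') {
        return false;  // not one-to-one
    }
    // Completeness is required on the role played by the NOLOT.
    const char needed = nolot_left ? '1' : '2';
    return f.internal_completeness == needed ||
           f.internal_completeness == '3';
}
```

For the student and number example, a record with the NOLOT on the left, uniqueness on both roles and completeness on the left role satisfies the predicate.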
In an ISD there are three more identification schemes which are shown in Figure 11.



Figure 11 The Three Remaining Identification Schemes
3.1.10. Constraints On Subtypes
Completeness and exclusion constraints can also be applied to subtypes as shown in the example given in Figure 12.

Figure 12 Constraints Between Subtypes
4. NIAM Mapper
Figure 13 shows the NIAM Mapper architecture [3]. It consists mainly of five modules. The drawing module is the main module in the tool, as it allows users to draw any ISD using the available icons for concepts and constraints. The drawing process is fully guided by a help module, which also gives detailed information on NIAM concepts. The ISD is then checked extensively for syntactic and semantic errors using a dedicated module. The user can get information on the NOLOTs, their subtypes and identifiers through the information module. Finally, a mapping module is used to map the checked ISDs into fifth normal form tables using a newly developed mapping heuristics.


Figure 13 NIAM Mapper Architecture
Table 1 summarizes the classes (written in C++) used by the drawing module to store information about the ISDs [3].
Table 1 Classes Used By The Drawing Module To Store Information About The ISDs

    class Fact_Type // This is used to store information about fact types
    {
        char *Left_Object_Type_Name;  // Name of the left object type
        char Left_Object_Type_Kind;   // This can be either 'n' for NOLOT or 'l' for LOT
        char *Right_Object_Type_Name; // Name of the right object type
        char Right_Object_Type_Kind;  // This can be either 'n' for NOLOT or 'l' for LOT
        char Internal_Completeness;   // Internal completeness constraint
        char Internal_Uniqueness;     // Internal uniqueness constraint
        int Fact_Nbr;                 // A unique number allocated to the fact type
        // Possible values of the constraints
        // '0' constraint on the whole fact type
        // '1' constraint on the left role
        // '2' constraint on the right role
        // '3' constraint on the left and right role
    public:
        // methods
    };


    class Ident_List // This is used to store information about the identifiers
                     // of a NOLOT (a maximum of two identifiers is assumed)
    {
        char *Name;           // Name of the NOLOT
        char *Identifiers[2]; // Contains its identifiers
        int Fact_Type_Nbr[2]; // Contains fact type numbers which are part of
                              // the identification scheme
    public:
        // methods
    };


    class Cst_Facts // This is used to store information about constraints
                    // between fact types
    {
        int Fact_Type1_Nbr;         // Number of the first fact type having the constraint
        int Fact_Type2_Nbr;         // Number of the second fact type having the constraint
        char Equality;              // Equality constraint
        char Subset;                // Subset constraint
        char Exclusion;             // Exclusion constraint
        char External_Uniqueness;   // External uniqueness
        char External_Completeness; // External completeness
        // Possible values of the constraints
        // '0' absence
        // '1' between left roles
        // '2' between right roles
        // '3' between left and right roles (in the case of the existence of
        //     more than one fact type between the same two object types)
        // '4' between fact types
    public:
        // methods
    };


    class Supertype // This is used to store information about supertypes and
                    // their subtypes
    {
        char *Supertype_Name;    // Name of the supertype
        char *Subtypes_Names[5]; // Names of subtypes (a maximum of five
                                 // subtypes is assumed)
    public:
        // methods
    };


    class Cst_Subtype // This is used to store information about constraints
                      // between subtypes
    {
        char *Name1;       // Name of the first subtype
        char *Name2;       // Name of the second subtype
        char Completeness; // Completeness constraint
        char Exclusion;    // Exclusion constraint
        // Possible values of the constraints: '0' absence and '1' presence
    public:
        // methods
    };

Table 2 summarizes the errors detected by the syntactic and semantic checking module [3].
Table 2 Errors Detected By The Syntactic And Semantic Checking Module
1. There should be only one fact type with a given LOT.
2. Two different object types should not have the same name.
3. There should be no fact types between LOTs.
4. There should be no internal completeness constraints on the side of a LOT.
5. Every NOLOT should have an identifier.
6. Equality, exclusion, subset, and external completeness constraints must operate on the same population.
7. External uniqueness constraint should not operate on the same population.
8. Equality, subset and exclusion constraints between fact types must operate on the same populations.
9. There should be no external uniqueness and external completeness constraints between fact types.
10. There should be no exclusion and equality constraints between the same roles at the same time.
11. There should be no exclusion and subset constraints between the same roles at the same time.
12. An equality constraint between roles at the same time with a subset constraint is redundant.
13. There should be no exclusion, equality and subset constraints at the same time between fact types and between roles.
14. A LOT cannot be a supertype or a subtype.
15. A NOLOT cannot be the supertype of itself.
16. The subtyping graph cannot contain loops.
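Two of the checks in Table 2 are sketched below to show what such a checking module might look like. This is not the NIAM Mapper implementation: `FactRecord`, `SubtypeGraph` and the function names are hypothetical, and the loop check uses a standard depth-first search, which is one common way to detect cycles in a directed graph.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical, simplified fact type record ('n' = NOLOT, 'l' = LOT).
struct FactRecord {
    std::string left_name;
    char left_kind;
    std::string right_name;
    char right_kind;
};

// Rule 3: there should be no fact types between LOTs.
bool violates_rule3(const FactRecord& f) {
    return f.left_kind == 'l' && f.right_kind == 'l';
}

// Subtyping graph: supertype name -> names of its direct subtypes.
using SubtypeGraph = std::map<std::string, std::vector<std::string>>;

// Depth-first search from `node`. `on_path` holds the nodes on the current
// search path and `done` the nodes that were already fully explored.
static bool has_loop_from(const SubtypeGraph& g, const std::string& node,
                          std::set<std::string>& on_path,
                          std::set<std::string>& done) {
    if (on_path.count(node) > 0) return true;  // node reached again: loop
    if (done.count(node) > 0) return false;    // explored earlier, no loop
    on_path.insert(node);
    const auto it = g.find(node);
    if (it != g.end()) {
        for (const std::string& sub : it->second) {
            if (has_loop_from(g, sub, on_path, done)) return true;
        }
    }
    on_path.erase(node);
    done.insert(node);
    return false;
}

// Rules 15 and 16: a NOLOT cannot be its own supertype (a loop of length
// one) and the subtyping graph cannot contain loops.
bool subtype_graph_has_loop(const SubtypeGraph& g) {
    std::set<std::string> on_path;
    std::set<std::string> done;
    for (const auto& entry : g) {
        if (has_loop_from(g, entry.first, on_path, done)) return true;
    }
    return false;
}
```

A graph in which person has the subtypes student and employee passes the check, whereas a graph in which two object types are declared subtypes of each other is rejected.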
5. The Mapping Heuristics
After syntactic and semantic checking has confirmed that an ISD conforms to NIAM's rules, the ISD is mapped into an optimal database structure. In addition to the above-mentioned classes, the mapping heuristics uses the additional classes summarized in Table 3 [3].
Table 3 Data Structures Used By The Mapping Heuristics

    class central_NOLOT // A NOLOT fulfilling certain requirements as per
                        // step 2 of the mapping heuristics
    {
        char *Name;                // Name of the central NOLOT
        char *Left_right;          // Whether the NOLOT is linked to the central
                                   // NOLOT to the right or left
        int Fact_Type_Numbers[10]; // Contains the numbers of the fact types
                                   // linked to the central NOLOT
    public:
        // methods
    };


    class Remaining_Fact_Types // This is used to store information about the
                               // remaining fact types after centralization
    {
        int remaining_Fact_Type_Nbr[10]; // Contains the remaining fact type
                                         // numbers after centralization,
                                         // excluding the fact types which are
                                         // part of an identification scheme.
    public:
        // methods
    };

Table 4 summarizes the steps of the mapping heuristics [3].
Table 4 Steps Of The Mapping Heuristics
1) All fact types of the subtypes become fact types of their supertypes.
2) The linked list of central_NOLOT objects is to be populated with any NOLOT that fulfills the following requirements:
   a) It has more than one internal completeness constraint in the fact types linked to it.
   b) None of the fact types that fulfill (a) has an internal uniqueness constraint on the fact type.
   c) None of the fact types that fulfill (a) and (b) is part of an identification scheme.
   A table made up of the identifiers of the related object types is to be created.
3) Among the remaining fact types (excluding identification schemes), any two fact types that fulfill the following requirements are to be found:
   a) There is an equality constraint between roles.
   b) None of the fact types that fulfill (a) has an internal uniqueness constraint on the fact type.
   A table made up of the identifiers of the three object types is to be created.
4) Make separate tables for the remaining fact types, made up of the identifiers of the related object types.
5) The presence of an internal uniqueness constraint on the side of an object type implies a simple key for its identifier in the corresponding relational table, and its absence implies a composite key.
6) If there is more than one identification scheme involving the same object type, then the identifier resulting from the identification scheme which yields one identifier (first scheme) will be chosen.
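Steps 1 and 5 lend themselves to a short illustration. The sketch below is one possible reading of these two steps, not the implementation from [3]. It assumes that every subtype has a single direct supertype, that the subtyping graph is loop-free (rule 16 of Table 2), and that internal uniqueness uses the codes listed for Fact_Type in Table 1. The alias `Key` and the function names are hypothetical.

```cpp
#include <map>
#include <string>
#include <vector>

// Step 1: a fact type of a subtype becomes a fact type of its supertype.
// `parent` maps each subtype name to its direct supertype; the links are
// followed until an object type without a supertype is reached.
std::string root_supertype(const std::map<std::string, std::string>& parent,
                           std::string name) {
    auto it = parent.find(name);
    while (it != parent.end()) {
        name = it->second;
        it = parent.find(name);
    }
    return name;
}

// A key is a list of column names (identifiers of object types).
using Key = std::vector<std::string>;

// Step 5: derive the candidate keys of the table built for one binary fact
// type from its internal uniqueness code:
//   '1' uniqueness on the left role   -> simple key on the left identifier
//   '2' uniqueness on the right role  -> simple key on the right identifier
//   '3' uniqueness on both roles      -> each identifier is a simple key
//   otherwise (whole fact type)       -> composite key of both identifiers
std::vector<Key> candidate_keys(const std::string& left_id,
                                const std::string& right_id,
                                char internal_uniqueness) {
    switch (internal_uniqueness) {
        case '1': return {Key{left_id}};
        case '2': return {Key{right_id}};
        case '3': return {Key{left_id}, Key{right_id}};
        default:  return {Key{left_id, right_id}};
    }
}
```

Applied to the examples of Figure 2, a customer with a unique role toward lastname yields a simple key on the customer identifier, while the customer and firstname fact type, which is unique only as a whole, yields a composite key of both columns.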
6. Examples
Figure 14 and Figure 15 show, respectively, a generic ISD drawn in NIAM Mapper and its corresponding fifth normal form tables [3].

Figure 14 An example ISD drawn using NIAM Mapper

Figure 15 The mapping result of an ISD drawn in NIAM Mapper
7. Conclusion
The role of a new IS is to improve the way activities are performed in an organization before automating or upgrading them. To do so, it is first necessary to understand the UoD and map it into a model (in this case an ISD) with as little information loss as possible.
It has been shown in the field that using NIAM's ISD to model a UoD yields the nearest representation of the real-world concepts, interactions and constraints, as the ISD uses a carefully chosen subset of the language people use in their day-to-day interaction. As a result, the subsequent phases in the development of the IS will be very efficient and will therefore fulfill the above-mentioned requirements for an IS.
In this paper, an automated mapping heuristics that maps ISDs into fifth normal form tables has been introduced. It has been successfully tested on many ISDs and has always yielded fifth normal form tables. As this cannot be proved mathematically, it has been called a heuristics and not an algorithm. As such, it remains valid as long as no example on which it fails is found.
References
[1] T.A. Halpin, Conceptual Schema and Relational Database Design, revised 2nd ed., WytLytPub, 1999.
[2] G.M. Nijssen, T.A. Halpin, Conceptual Schema and Relational Database Design, a fact oriented approach, Prentice Hall, Sydney, 1989.
[3] T. Taibi, T. Nouari, Design of a Graphical Interface for the drawing and the mapping of NIAM's Conceptual Schema, thesis submitted in partial fulfillment of the requirements of the state engineer degree in computer science, Computer Science Faculty, Oran University, Algeria, October 1993.
The author, Toufik Taibi, is a lecturer at the Faculty of Information Technology, Multimedia University, Malaysia. His research interests include formal specification of design patterns, object-oriented methods, software engineering and conceptual modeling. Toufik can be reached at toufik.taibi@mmu.edu.my.
Faculty of Information Technology
Multimedia University
Jalan Multimedia, 63100 Cyberjaya
Selangor, Malaysia
Fax: 603-83125264
© Copyright, 1998-2004 InConcept (Information Conceptual Modeling, Inc.) All Rights Reserved.
ISSN: 1533-3825