July
2002 Issue: 26
Journal of Conceptual Modeling
www.inconcept.com/jcm
Microsoft’s new
database modeling tool: Part 7
by Terry Halpin
Microsoft Corporation
Abstract: This is the seventh in a series of articles introducing the Visio-based database modeling component of Microsoft Visual Studio .NET Enterprise Architect. Part 1 showed how to create a basic ORM source model, map it to a logical database model, and generate a DDL script for the physical database schema. Part 2 discussed how to use the verbalizer, make an object type independent, objectify an association, and add some other ORM constraints to an ORM model. Part 3 showed how to add set-comparison constraints (subset, equality and exclusion) and how exclusive-or constraints combine exclusion and disjunctive mandatory constraints. Part 4 discussed the basics of modeling and mapping subtypes. Part 5 discussed mapping subtypes to separate tables, and occurrence frequency constraints. Part 6 discussed ring constraints. Part 7 discusses index constraints, constraint layers, and data types.
Introduction
This is the seventh in a series of articles introducing the database modeling solution in Microsoft Visio for Enterprise Architects, which is included in the Enterprise Architect edition of Visual Studio. NET. This article discusses how to add index constraints to an ORM model, show/hide constraints via layers, and specify data types. Familiarity with ORM and relational database modeling is assumed. For an overview of ORM, see [1]. For a thorough treatment of ORM and database modeling, see [2]. For previous articles in this series, see [3], [4], [5], [6], [7] and [8].
Consider the ORM schema shown in Figure 1(a). Here each person is identified by a social security number (ssn), has a surname, at most two given names, and was born in at most one country. Although rare, it is possible that a person has no given names, so the given name fact types are optional. In real life, each person was born in a country, but the information system makes it optional to record a person’s birth country. Each country is identified by a country code (e.g. ‘AU’), but also has a unique, identifying name (e.g. ‘Australia’).

(a)
(b)
Default mapping of a simple ORM schema
Previous articles discussed how to enter an ORM conceptual schema, map it to a logical database schema, and generate the DDL script for the physical database schema. By default, the sample ORM schema maps to the relational schema shown in Figure 1(b). The “PK” annotation indicates primary keys, “FKn” indicates foreign keys, and “Un” indicates uniqueness constraints. If you choose SQL Server 2000 as the target system, and generate the DDL script for this schema, the following DDL code is generated (I’ve removed the generated comments to save space).
create table "Person" (
"Person ssn" char(10) not null,
"Surname" char(10) not null,
"First GivenName" char(10) null,
"Born Country code" char(10) null,
"Second GivenName" char(10) null)
alter table "Person"
add constraint "Person_PK" primary key ("Person ssn")
create table "Country" (
"Country code" char(10) not null,
"CountryName" char(10) not null)
alter table "Country"
add constraint "Country_PK" primary key ("Country code")
create unique index "Country_AK1" on "Country" (“CountryName")
alter table "Country" add constraint "Country_AK1_UC1" unique ("CountryName")
alter table "Person"
add constraint "Country_Person_FK1" foreign key ("Born Country
code")
references "Country" ("Country code")
Notice that data types are all char(10) by default. We’ll see how to change this
in the next section. For now, let’s focus on constraints that have a bearing on
indexes. Just as an index to a book enables you to quickly find a topic
of interest, indexes on database columns enable the database system to quickly
access entries for those columns. While indexes speed up access (e.g. using
binary-trees in RAM), they can also slow down updates (because the index also
needs to be updated), so care is required in choosing them. Optimizing
performance via index selection is a large topic (e.g. see chapter 31 of [9]),
especially with the many kinds of indexes available nowadays, so we restrict our
discussion here to the basics.
Like most DBMSs, SQL Server automatically creates unique
indexes on primary keys, since this provides an efficient way to enforce
uniqueness constraints. Moreover, primary key columns are often involved in join
and sort operations, so require efficient access. Since the primary key
declarations in the DDL script automatically cause SQL Server to create indexes
for them, there is no need to explicitly declare indexes on PK columns.
As Figure 1(b) indicates, the CountryName column is mandatory
and unique, and hence provides an alternate key for Country. The DDL script
above includes a declaration to add a unique constraint (Country_AK1_UC1)
to this column. SQL Server automatically creates
unique indexes on columns for which unique constraints are declared, since this
provides an efficient way to enforce the constraint. So there is no need to
explicitly declare a unique index for this column. The above DDL script however,
which was generated from the SR1 release of the modeling tool, redundantly
includes a create-unique-index statement for this column.
This redundant behavior has been fixed for the next release,
but in the meantime you can fix this yourself. To do this, double-click the
Country table scheme to bring up its Database Properties window, select the
Indexes category, open the drop-down list for Index type, and select the
Unique constraint only
option (instead of
Unique index with constraint on top), as shown in
Figure 2. Now save your change, and migrate it back to the ORM source model when
prompted.
If you regenerate the
DDL, you will see that uniqueness on Country.CountryName is enforced by a unique
constraint (Country_UC1)
without an additional unique index clause.

Figure 2 Choosing to implement uniqueness with a declarative unique constraint
Strictly speaking, indexes belong to the physical level.
However, it is possible to specify indexes directly on an ORM source model. This
enables you to control the mapping process up front by annotating the pure
conceptual model with implementation detail. Let’s see how to do this.
Suppose we often want to access personal details without
having to specify the person’s social security number, but rather by entering a
surname to list all employees with that surname. To make this access more
efficient we could declare an index on the role played by Surname in the ORM
model of Figure 1(a). To do this, select the
Person has Surname
predicate, right-click it to bring up its context menu, then select “Add
Constraints…” to bring up the Add Constraint dialog.
Now choose Index from the Constraint type drop-down list, select the right-hand
role, as shown in Figure 3.

Figure 3 Adding an “index constraint” to the role played by Surname
The verbalization indicates that a non-unique index will be created over the role played by Surname. This ensures that when the model is mapped to a relational schema, a non-unique index will be created over the column(s) to which that role maps. Since many people may have the same surname, the index will be non-unique. Hit the OK button to apply the index and exit the dialog. An “index constraint” now appears as a circled “I” on the ORM model, attached to the surname role, as shown at the top of Figure 4. Although stored with the ORM model, this physical annotation is not part of the pure conceptual schema. To assist with our discussion of constraint layers in the next section, where this diagram will be referenced, I’ve added a subset constraint as well. Ignore this for now.

4 An “index constraint” appears as a circled “I” attached to the role(s) it constraints
Another candidate for an index is the role played by Country in the association Person was born in Country. If you often want to inquire about where a person was born, and you want the name of the country, not just the country-code, when you make such a query, then this entails a frequent conceptual join operation between the two roles played by Country. These roles map to separate tables, one role mapping to a foreign key and the other to its referenced primary key. At the relational level, this means we have a need for efficiently joining the tables by matching the FK and PK values. SQL Server does not automatically generate indexes for foreign keys, so if a foreign key is frequently needed for a join you should consider adding an index for it. In this case, you can add an index on the birth-country role by clicking on its fact type and using the Add Constraint dialog (as described earlier for the surname role). The resulting index constraint is shown at the bottom of Figure 4.
One of
ORM’s strengths is its richly expressive constraint notation. When specifying
detailed requirements, it often helps a great deal if we can visualize the
relevant business rules graphically. However, there may also be times when we
wish to ignore this finer level of detail. One way of doing this in ORM is to
use constraint layers to control whether various kinds of constraints are
displayed on screen and/or printed. In Microsoft Visio for Enterprise
Architects, constraint layers are implemented using Visio’s standard layer
properties mechanism.
For example, the ORM model in Figure 4 displays four kinds of
constraints: simple mandatory; internal uniqueness, subset, and index. The
subset constraint (circled “Í”)
indicates that if a person has a second given name then that person also has a
first given name. As described in [5], subset constraints can be added by
selecting their predicates, right-clicking to invoke the Add Constraints dialog,
and making relevant choices. Because of their fundamental nature, simple
mandatory and internal uniqueness constraints are always displayed. But all
other kinds of ORM constraints can have their display suppressed by controlling
the layer properties settings for the diagram. To access the Layer Properties
dialog, choose View > Layer
Properties… from the main menu. The dialog for our
current model is shown Figure 5.
5 Constraint layers allow most kinds of constraints to be hidden
Different kinds of constraint exist on different
layers. In addition to the always-shown constraints (simple mandatory and
internal uniqueness), this model includes constraints on layers 2 and 5. Layer 2
includes disjunctive mandatory (inclusive-or) and set-comparison (subset,
equality, exclusion) constraints. Layer 5 includes index markers (a better term
for index constraints). To suppress display on screen of a constraint layer,
click the check mark in the Visible column in this dialog (this removes the
check mark), then hit the Apply button. If you remove both check marks for
Visible in this example, Figure 4 will be redisplayed as Figure 1(a). The
diagram will still print as Figure 4 unless you uncheck the entries in the Print
column. The column entries are toggles, so you can restore a check mark to a
cell simply by clicking it. ORM models may include other kinds of constraint
that exist on other layers.
With the exception of simple mandatory and internal
uniqueness constraints, ORM constraints are partitioned into five layers, as
shown in Figure 6. Each layer can
be individually controlled. Layer 1 includes external uniqueness constraints.
Layer 2 includes disjunctive-mandatory (inclusive-or) and set comparison
(subset, equality, exclusion) constraints. Since exclusive-or constraints are
simply combinations of inclusive-or and exclusion constraints, these are
included on layer 2. Layer 3 holds value constraints. Layer 4 displays frequency
and ring constraints. Finally, layer 5 is used for indexes. By default, all
constraints are displayed and printed, unless their display/print setting is
unchecked.

6 Five layers of ORM constraints may be suppressed
Data types
Although data types for columns may be
specified at the relational level, it is far better to specify data types
at the ORM level. Why? Firstly, it saves a lot of work, because data types in
ORM correspond to the syntactic domains on which relational attributes
are based. Typically each ORM object type plays many roles, each of which maps
to one or more columns in a relational database. Setting the data type once for
the object type propagates that data type to every attribute mapped from its
roles. Secondly, this avoids type mismatch problems in the generated relational
database (e.g. foreign keys are automatically given the same data type as the
primary keys they reference).
Thirdly, it makes it much easier to
change data types. For example, a database might include hundreds of columns
concerning dates (birthdate, hiredate, orderdate etc.). Suppose you need to
change the data type for each of these columns from, say char(10) where dates
were stored as character strings like ‘2002-07-06’, to a datetime data type that
has built-in support for date arithmetic. At the ORM level, all you need do is
change the data type for the object type Date. At the relational level, you need
to change all the hundreds of date column data types (unless your DBMS properly
supports relational domains, and you have been disciplined in using this
support). Of course, after changing the schema for a populated database you
still have the data migration problem, but having the schema updated so easily
is a major benefit.
By default, each object type in an ORM
model is assigned a data type of char(10). Before mapping an ORM model to a
relational model, the relevant data types should be specified. The tool offers
more than one way to do this. One way is to double-click the relevant object
type on the ORM diagram to bring up its Database Properties Sheet, select the
Data Type category, and then hit Edit button to enter the new data type. For
instance, if we select the CountryName object type in Figure 4, its properties
dialog will appear as shown in Figure 7. Here I’ve set radio button to show the
physical data type for the chosen DBMS, in this case SQL Server. You can
change to a different DBMS by going to the main menu and choosing
Database > Options
> Drivers…
and then selecting from the Database Drivers dialog box.

7 The data type can be set using the Database Properties window
The Data type field in the Database Properties sheet is read-only, so you cannot edit it directly. Instead, hit the Edit button to invoke the Data Type dialog box specific to that DBMS. The Native type and Length fields display the defaults (char and 10). To store country names as variable length character strings, select the relevant item (e.g. varchar) from the Native type drop-down list (scroll down, or quickly type the initial characters until the item appears), as shown in Figure 8(a). Click the mouse to accept the change, then move the cursor to the Length field. To allow for country names of up to 40 characters, change the value from 20 to 40 (see Figure 8(b)), then hit the OK button to accept the change. The Database Properties window now shows the Data type as varchar(40).

(a)
(b)
8 Changing the physical data type to varchar(40)
If you want to edit the data
types for several object types, it’s much quicker to do this via the Object Type
pane of the Business Rules Window. If needed, open this window by selecting
Database > View > Business Rules
from the main menu. Choose the Object Types tab at the bottom to display the
Object Types pane. The data type field supports a drop-down list for selecting
the native type, and allows you to edit the length directly (see Figure 9). In
fact, the whole field is editable, so you can simply type in the desired data
type (e.g. varchar(40)) without accessing the drop-down list at all. The field
is parsed as you go, with basic Intellisense support (e.g. if you type “varc”
this is automatically expanded to “varchar”). If you enter an illegal data type,
the Data Type dialog box discussed earlier is invoked for you to complete the
entry there. Even if your typing skills are only basic, you should find that
this in-situ (in-place) editing facility lets you make data type changes much
faster. Try it now, setting the data type for GivenName and Surname to
varchar(30).
In the Object Types pane, value types
are depicted as broken ellipses, with light-blue fill. Entity types appear as
solid ellipses with dark-blue fill. If an entity type has reference mode, its
data type is the data type of the value type that references it. For example,
the reference scheme Country(code) is an abbreviation for the association
Country is
identified by CountryCode,
so when you set the data type for Country you are actually setting the data type
for CountryCode. One standard way to reference countries is by their 2-letter
ISO codes (e.g. ‘AU’ for Australia, and ‘US’ for the United States). This choice
requires a fixed length character string of 2 characters in length, so we should
edit the data type for Country to become char(2). In the US, social security
numbers are eleven characters in length, requiring a data type of char(11).
Value types often have additional constraints that are not expressible using
built-in physical data types. For example, the characters in country codes must
be letters, and social security numbers must match the pattern ddd-dd-dddd
(where “d” denotes a digit). Currently, the modeling tool does not support the
specification such pattern constraints, but they are trivial to implement in
most DBMSs.

9
Data types can be edited
quickly using the Object Types pane of the Business Rules window
If you already know the DBMS on which your model will be implemented, you will probably prefer to work with physical data types. However, if you haven’t made that choice yet, or you intend to work with many kinds of DBMS then you may prefer to use portable data types. When portable data types are mapped to a given DBMS they are replaced by a corresponding physical data type. To display portable data types on the Data Type pane of the Database Properties window, select the “Show portable data type” radio button. Use the drop-down lists to select the appropriate values. For example, Figure 10 shows portable settings that might be used in place of the physical data type varchar(40). The Text category is used for character strings, the Type is set to Variable Length, the Size is set to Single Byte characters (rather than Double Byte), and the Length is set to 40. The possible settings for portable data types are summarized in Table 1. You can access details about specific settings by using the on-line help.

10 Portable data types may be used instead of physical data types
1 Portable Data types
|
Category |
Type |
Length/Precision (= default) |
Scale (= default) |
Size |
Allow numeric op? |
|
User-defined |
typename |
|
|
|
|
|
Text |
Fixed length Variable length Large length |
Length = 10 Length = 10 -- |
-- -- -- |
Single byte char set Double byte char set |
-- -- -- |
|
Numeric |
Signed integer Unsigned integer Auto counter Floating point Decimal Money |
-- -- Precision = 10 -- Precision = 10 Precision = 10 |
-- -- -- -- = 2 = 2 |
Small, Large Small, Large Small, Large Small, Large Small, Large Small, Large |
Yes/No Yes/No Yes/No Yes/No Yes/No Yes/No |
|
Raw data |
Fixed length Variable length Large length Picture OLE object |
Length = 10 Length = 10 -- -- -- |
-- -- -- -- -- |
-- -- -- -- -- |
-- -- -- -- -- |
|
Temporal |
Auto timestamp Time Date Date & Time |
-- -- -- -- |
-- -- -- -- |
Small, Large Small, Large Small, Large Small, Large |
-- -- -- -- |
|
Logical |
True or False Yes or No |
-- -- |
-- -- |
Small, Large Small, Large |
-- -- |
|
Other |
RowID ObjectID Unknown |
-- -- -- |
-- -- -- |
-- -- -- |
-- -- -- |
Conclusion
This article provided a brief overview of index constraints, constraint layers and data types. The next article will dig a little deeper into data types, and also discuss ways to control the generation of column names in the resulting relational model. If you have any constructive feedback on this article, please e-mail me at: thalpin@attbi.com.
1. Halpin, T. A. 1998 (revised 2001), ‘Object Role Modeling: an overview’, white paper, (online at www.orm.net).
2. Halpin, T.A. 2001a, Information Modeling and relational Databases, Morgan Kaufmann Publishers, San Francisco (www.mkp.com/books_catalog/catalog.asp?ISBN=1-55860-672-6).
3. Halpin, T.A. 2001b, ‘Microsoft’s new database modeling tool: Part 1’, Journal of Conceptual Modeling, June 2001 issue (online at www.InConcept.com and www.orm.net).
4. Halpin, T.A. 2001c, ‘Microsoft’s new database modeling tool: Part 2’, Journal of Conceptual Modeling, August 2001 issue, (online at www.orm.net).
5. Halpin, T.A. 2001d, ‘Microsoft’s new database modeling tool: Part 3’, Journal of Conceptual Modeling, October 2001 issue, (online at www.orm.net).
6. Halpin, T.A. 2002a, ‘Microsoft’s new database modeling tool: Part 4’ (online at www.orm.net). This is a revised version of an earlier article of the same title in the January 2002 issue of Journal of Conceptual Modeling.
7. Halpin, T.A. 2002b, ‘Microsoft’s new database modeling tool: Part 5’ (online at www.orm.net). This is a revised version of an earlier article of the same title in the March 2002 issue of Journal of Conceptual Modeling.
8. Halpin, T.A. 2002c, ‘Microsoft’s new database modeling tool: Part 6’ (online at www.orm.net). This is a revised version of an earlier article of the same title in the May 2002 issue of Journal of Conceptual Modeling.
9. Rankins, R. et al. 2002, Microsoft SQL Server 2000 Unleashed, Sams Publishing.
Note: Revised versions of many of the above references are also accessible online from the MSDN library (http://msdn.microsoft.com/library/default.asp). From the tree browser on the MSDN Library Home Page choose the following path to find these articles: Visual Tools and Languages > Visual Studio .NET > Visual Studio .NET (General) > Technical Articles.
![]()
Dr Terry Halpin, BSc, DipEd, BA, MLitStud, PhD, is a Program Manager in Database Modeling for the Enterprise Frameworks and Tools Unit, Microsoft Corporation, USA., Seattle WA, USA. During a lengthy career as an academic in computer science, he also worked in industry on database modeling technology and as a data modeling consultant. His recent positions include head of database research at Asymetrix Corporation, and research director of InfoModelers Inc., which was acquired by Visio Corporation. For several years, his research has focused on conceptual modeling and conceptual query technology for information systems, using a business rules approach. Dr Halpin has presented papers and tutorials at many international conferences. His doctoral thesis provided the first full formalization of Object-Role Modeling (ORM/NIAM), and his publications include over ninety technical papers, as well as four books, including Information Modeling and Relational Databases (Morgan Kaufmann, 2001).
Contact Information:
Dr Terry Halpin
Program Manager, Database Modeling
Enterprise Framework & Tools Unit, Microsoft Corporation
One Microsoft Way
Redmond WA 98052-6399 (USA)
terryha@microsoft.com
(425) 705 9190
fax: (425) 936 7329
http://www.orm.net
![]()
© Copyright, 1998-2004 InConcept (Information Conceptual Modeling, Inc.) All Rights Reserved. Privacy Statement. ISSN: 1533-3825