October 2000              Issue: 16

Journal of Conceptual Modeling
www.inconcept.com/jcm

An ORM Metamodel
By Dr. Terry Halpin

 

[Editor's Note: For an ongoing discussion of the development of an ORM metamodel, see the InConcept/JCM Metamodel project.]

Abstract

This article provides a basic metamodel for Object Role Modeling (ORM). It is presented to clarify the fundamental concepts in ORM, to help those interested in mapping between ORM and other modeling approaches such as ER and UML, and to contribute towards an eventual standardization of ORM. The metamodel is restricted to ORM conceptual schemas, ignoring conceptual modeling processes (e.g. the ORM conceptual schema design procedure), mapping procedures (e.g. mapping between ORM and relational models) and ORM textual languages (e.g. ConQuer, FORML or RIDL). There are many ways to metamodel ORM, and the model discussed here is simply one suggestion. If you have any suggestions for improvement, please email me at TerryHa@Microsoft.com, express your opinions to the JCM discussion group, or consider submitting your own metamodel to the JCM.

The two main constructs in ORM are object types and relationships. One way of classifying them is shown in Figure 1.

Figure 1

Value types and entity types are sometimes called lexical and non-lexical object types respectively. Values (e.g. character strings or numbers) are constants that have a standard denotation and hence require no reference scheme. Entities (e.g. people, countries) require a reference scheme that enables them to be identified in human communication by relating them to one or more values (e.g. the Country that has CountryCode 'US'). Simple reference schemes relate an entity to a single value, and may be abbreviated by enclosing a reference mode in parentheses, e.g. Country(code). The reference mode is the mode or manner in which the value refers to the entity (e.g. a code rather than a name or number). With our country example, the relationship type is Country has CountryCode. In the metamodel we use "Relationship" to mean relationship type, which includes both the object types and a logical predicate (e.g. "… has …"). For binary infix predicates, the placeholders for object terms are normally omitted, though they may be shown explicitly using an ellipsis "…" if desired. Country is an entity type, CountryCode is a value type, has is a predicate and code is a reference mode. Since reference modes are merely a convenient abbreviation, and automatic translation between reference modes and value types is possible, they add no expressibility to ORM, and could be omitted from a core metamodel. An object type is independent if its instances can exist independently of playing any role in a fact type.
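The translation from a reference mode abbreviation to its explicit relationship type can be sketched as follows. This is only a minimal illustration: the function name expand_reference_mode is hypothetical, and the naming convention for the value type (entity type name followed by the capitalized reference mode) is an assumption generalized from the single Country(code) example above, not a rule stated in the metamodel.

```python
import re


def expand_reference_mode(abbrev):
    """Expand an abbreviation such as 'Country(code)' into the explicit
    relationship type 'Country has CountryCode'.

    Assumption: the value type name is the entity type name followed by
    the capitalized reference mode.
    """
    match = re.fullmatch(r"\s*(\w+)\s*\(\s*(\w+)\s*\)\s*", abbrev)
    if match is None:
        raise ValueError(f"not a reference mode abbreviation: {abbrev!r}")
    entity_type, mode = match.groups()
    value_type = entity_type + mode[:1].upper() + mode[1:]
    return f"{entity_type} has {value_type}"


print(expand_reference_mode("Country(code)"))
```

Because the expansion is mechanical, a core metamodel could store only the expanded relationship type and regenerate the abbreviation for display.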

The most debatable aspect of the model in Figure 1 is the treatment of nesting. ORM allows a relationship instance to be objectified as an entity instance. For example: the Person with SSN 12345 was promoted to the Position named 'Senior Modeler'; that Promotion occurred in the Year 2002 AD. Here the relationship "Person was promoted to Position" has been objectified as the entity type "Promotion". At least for design, if not for analysis, we could treat the entity type Promotion as ontologically distinct from the relationship that gave rise to it. This is the approach adopted in Figure 1, where the object type NestedEntityType is mutually exclusive with Relationship. Here the nested entity type nests a relationship within it, but is not identical to that relationship. This approach is visually consistent with ORM tools where one click selects the nested entity type for editing and a second click is required to select the relationship subshape within it. It also allows the metamodel to be directly implemented in languages such as C# and Java that do not support multiple inheritance.
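As a rough sketch of this first approach, the nested entity type can hold a reference to the relationship it objectifies instead of inheriting from it, so only single inheritance is needed. The class and attribute names below (including nests, nr and reading) are illustrative assumptions and are not taken from Figure 1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relationship:
    nr: int
    reading: str


@dataclass(frozen=True)
class ObjectType:
    name: str


@dataclass(frozen=True)
class EntityType(ObjectType):
    pass


@dataclass(frozen=True)
class NestedEntityType(EntityType):
    # The nested entity type refers to the relationship it objectifies
    # (composition) rather than also being a Relationship.
    nests: Relationship


promoted = Relationship(nr=1, reading="Person was promoted to Position")
promotion = NestedEntityType(name="Promotion", nests=promoted)

print(isinstance(promotion, EntityType))    # an entity type
print(isinstance(promotion, Relationship))  # but not itself a relationship
```

The alternative discussed next would instead make one class inherit from both EntityType and Relationship, which requires multiple inheritance.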

However if you feel this approach is unnatural, you may prefer to identify the nested entity type with the relationship that it objectifies, at least for analysis. This can be done using the alternative model fragment shown in Figure 2, where ObjectifiedRelationship replaces NestedEntityType in the previous model. Only the main differences from the previous model are sketched here, and subtype definitions are omitted. It is possible to choose values for TypeKind to make it functional instead of m:n.

Figure 2 An alternative approach to nesting using multiple inheritance

Now consider Figure 3. This indicates how object types, predicates and roles are named. Text boxes include additional constraints and derivation rules. Each object type has exactly one name, which is unique throughout the schema. In Figure 1, an object type is referenced by its name, and the association ObjectType has ObjectTypeName is simply the explicit version of the reference scheme depicted as ObjectType(name). If we choose to identify an object type by a number however, as in Figure 2, then the association ObjectType has ObjectTypeName provides an alternative reference scheme.

A role is a part played in a relationship. The arity of a relationship equals its number of roles, and hence can be derived (see derivation rule D1). Each role is identified by a number. In addition, a role is assigned a position according to its order in the first verbalization of the relationship in which it occurs. This order remains even if the first verbalization is deleted; if the arity is decreased, roles are dropped from the high order end; if the arity is increased, roles are added to the high order end. Hence role order is stable.
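The derived arity and the stable role ordering described above can be illustrated with a small sketch. It assumes that a relationship simply keeps its role numbers in an ordered list; the class and method names are hypothetical, and the actual wording of rule D1 appears only in Figure 3.

```python
class Relationship:
    """A relationship that keeps its roles in a stable order."""

    def __init__(self, nr, role_nrs):
        self.nr = nr
        self.roles = list(role_nrs)  # list index + 1 is the role position

    @property
    def arity(self):
        # Derived rather than stored: the arity is the number of roles.
        return len(self.roles)

    def position_of(self, role_nr):
        return self.roles.index(role_nr) + 1

    def add_role(self, role_nr):
        # Increasing the arity adds the role at the high order end.
        self.roles.append(role_nr)

    def drop_role(self):
        # Decreasing the arity drops the role at the high order end.
        return self.roles.pop()


rel = Relationship(nr=1, role_nrs=[10, 11])
rel.add_role(12)
print(rel.arity, rel.position_of(10), rel.position_of(12))
```

Since roles are only ever added to or removed from the high order end, the positions of the remaining roles never change.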

Each relationship is identified by a number. Each relationship also has one or more relationship readings, each of which forms an alternative identifier for the relationship. A relationship reading is derived by inserting the relevant object type names into a predicate using derivation rule D2. Each role may start a predicate (which is ordered, as in logic). The relationship itself is considered to be unordered, although its readings are ordered (as in natural language sentences). This approach allows a relationship with n roles to have at least one and at most n readings (n ≥ 1).
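One way to picture the derivation of a reading is to fill the placeholders of a predicate with object type names, in order. The sketch below assumes that predicates are stored as text with an ellipsis "…" for each placeholder; the function names are hypothetical, and the precise formulation of rule D2 is the one given in Figure 3, not this code.

```python
def derive_reading(predicate_text, object_type_names, placeholder="…"):
    """Insert object type names, in order, into the predicate's placeholders."""
    parts = predicate_text.split(placeholder)
    if len(parts) - 1 != len(object_type_names):
        raise ValueError("number of placeholders must equal number of names")
    pieces = [parts[0]]
    for name, tail in zip(object_type_names, parts[1:]):
        pieces.append(name)
        pieces.append(tail)
    return "".join(pieces).strip()


def reading_count_ok(arity, readings):
    """A relationship with n roles has at least one and at most n readings."""
    return 1 <= len(readings) <= arity


reading = derive_reading("… was promoted to …", ["Person", "Position"])
print(reading)
print(reading_count_ok(2, [reading]))
```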

Some versions of ORM are binary only (each relationship has 2 roles): this makes implementation much easier for tool vendors, but makes it harder for domain experts and modelers to formulate and validate models, since many facts which are most naturally conceived of using unaries, ternaries or higher arity relationships must be recast using binaries (often very awkwardly and artificially). Any such restriction on arity can be added as an additional constraint to the metamodel.

The approach to relationship readings shown here should work with any version of ORM, and is designed to support queries using relationship paths that can start at any role, as in ActiveQuery. In principle, an n-ary relationship may have n! (factorial n) readings, but only n are needed to allow navigation from any role, so the metamodel does not allow more than n. The metamodel also determines the role order within relationships, although it would be possible to allow the user to specify this instead (but the added complexity for such extra flexibility does not seem worth the price).

Figure 3 Naming of object types, predicates and roles

Although diagramming conventions are not part of the metamodel, the following pragmatic suggestions are made. Allow the display of both forward and inverse predicate readings on the diagram: these may be displayed together, separated by a slash "/" or written in or beside their starting role. For n-ary relationships (n > 2), display at most one end-reading on the diagram (an end-reading is typically the first reading given, and must be ordered from one end role to the other end). Other readings for an n-ary association may be displayed on a properties sheet. For unaries or binaries, each reading is an end-reading. Rule R2 requires each relationship to have at least one end-reading.

In addition to predicate names, some versions of ORM allow a rolename to be supplied to help with automatic generation of attribute names at mapping time, or to allow attribute-style rules. This is similar to association-end names in UML. For example, Figure 3 includes the rolename "player" in square brackets beside its role. It is a tool's choice whether to allow the display of rolenames on the diagram. Some ORM tools show them only on a properties sheet. Ideally users should be able to toggle on/off rolename display on the diagram. Unlike UML, ORM requires that each association must have at least one reading displayed on the diagram. To avoid clutter, the display of role names on the diagram would often be suppressed.

Within the scope of a single relationship, role names are unique. Globally however, the same rolename may appear in different relationships. Roles in the same relationship are coroles of one another. A role is a far role of an object type if and only if it has a corole that is played by that object type. To ensure that role path specifications are unambiguous, we require that for any given object type, the names of its far roles must be distinct (rule R1). For example, in Figure 4 the far roles of Employee are shaded. Each has a different role name. The role name "manager" appears for different roles, but these are not far roles of the same object type, so expressions such as Employee.manager and Project.manager are deterministic.
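A check for rule R1 might look like the following sketch. It assumes each relationship is represented as a list of (player object type, role name) pairs, with None for an unnamed role. The sample schema is a hypothetical one built for illustration and does not reproduce Figure 4.

```python
from collections import Counter


def far_roles(object_type, relationships):
    """Return the roles that have a corole played by object_type."""
    result = []
    for roles in relationships:
        for i, role in enumerate(roles):
            coroles = roles[:i] + roles[i + 1:]
            if any(player == object_type for player, _ in coroles):
                result.append(role)
    return result


def r1_violations(object_type, relationships):
    """Return the role names shared by two or more far roles of object_type."""
    counts = Counter(
        name for _, name in far_roles(object_type, relationships) if name
    )
    return sorted(name for name, n in counts.items() if n > 1)


schema = [
    [("Employee", "report"), ("Employee", "manager")],  # Employee reports to Employee
    [("Project", "project"), ("Employee", "manager")],  # Project is managed by Employee
]
print(r1_violations("Employee", schema))
print(r1_violations("Project", schema))
```

In this schema the name "manager" is used twice, but once as a far role of Employee and once as a far role of Project, so neither check reports a clash.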

As an aside, some discipline for role name choice needs to be put in place regarding singular or plural. Here singular role names are used, which is fine for mapping to pure relational schemas. However if role paths are to be used for specifying attribute-style rules, plural names may make more sense for multi-valued attributes (e.g. Employee.reports rather than Employee.report). One problem with assigning significance to singularity/plurality on rolenames is that it leads to instability (e.g. a change in a relationship's uniqueness constraint pattern may force a change to the role names). Relationship readings don't have this problem.

Figure 4 Far roles of Employee (shaded) must have distinct names

Rule R3 requires that roles starting the same PredicateText for the same Relationship must be symmetric. One could strengthen this to forbid two roles in the same relationship from starting the same predicate text. This is useful when there really are naturally different underlying roles (e.g. it's much better to use "is husband of/is wife of" than "is married to/is married to"). However in some cases it is unnatural to force such a distinction. Consider Country adjoins/adjoins Country. We could make this asymmetric by using alphabetic ordering (e.g. Country pre_adjoins/post_adjoins Country), but is this desirable at the analysis level? If we use a union operation to derive the symmetric relationship from the asymmetric relationship, we still need to cater for symmetry in discussing the derived relationship. A full discussion of this issue is postponed for a later article.

A basic metamodel of ORM constraints is shown in Figure 5. The subtype definitions and legend explain most aspects. Detailed explanations of the constraints are accessible from papers on www.orm.net. The constraint list excludes some rare, and as yet unimplemented, constraints such as extensional uniqueness, relative closure and object type cardinality. Subtype exclusion and exhaustion constraints are derivable from formal subtype definitions and other constraints, and are omitted here. Value constraints currently may apply only to value types (explicitly or by implication from simply identified entity types), but in principle this restriction could be relaxed. The following derived association is omitted from the figure: PrimaryUC identifies EntityType.

Figure 5 Metamodel of ORM constraints

The term "ArgLength" refers to the number of roles in each argument at the end of a set-comparison constraint (subset, exclusion or equality). As a simple example, consider the schema in Figure 6, where constraints are numbered C1..C4 for reference.

Figure 6

Using the constraints metamodel in Figure 5, the four ORM constraints may be stored as the object-relation shown in Table 1. The subset and exclusion constraints have their argument length recorded. The actual arguments of these two constraints may now be derived by "dividing" the role lists by this number. Thus the arguments of the subset constraint are the simple roles r4 and r2, whereas the arguments of the exclusion constraint are the role pairs (r1, r2) and (r3, r4). The constraint type may now be used to determine the appropriate semantics.

Table 1 Meta-table for storing ORM constraints

constraintNr   constraintType   roles            argLength
C1             UI               r1, r2
C2             UI               r4
C3             SS               r4, r2           1
C4             X                r1, r2, r3, r4   2
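The "division" of a role list by its argument length can be sketched as below, using the rows of Table 1 as data. The function name split_arguments and the dictionary layout are illustrative assumptions. Only the role lists and argument lengths come from the table.

```python
def split_arguments(roles, arg_length):
    """Divide a flat role list into arguments of arg_length roles each."""
    if arg_length < 1 or len(roles) % arg_length != 0:
        raise ValueError("role list length must be a multiple of arg_length")
    return [
        tuple(roles[i:i + arg_length])
        for i in range(0, len(roles), arg_length)
    ]


# Rows of Table 1: constraintNr -> (constraintType, roles, argLength)
constraints = {
    "C1": ("UI", ["r1", "r2"], None),
    "C2": ("UI", ["r4"], None),
    "C3": ("SS", ["r4", "r2"], 1),
    "C4": ("X", ["r1", "r2", "r3", "r4"], 2),
}

for nr, (kind, roles, arg_length) in constraints.items():
    if arg_length is not None:  # only set-comparison constraints have arguments
        print(nr, kind, split_arguments(roles, arg_length))
```

For C3 this yields the two simple roles r4 and r2, and for C4 the role pairs (r1, r2) and (r3, r4), matching the arguments described above.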

That's all I have time for in this article. The ORM metamodel discussion here should in no way be considered definitive, nor does it correspond exactly to the ORM metamodel currently implemented in Microsoft's ORM tools. The main aim of the article is to provide a first suggestion towards obtaining an agreed ORM metamodel for public use. Constructive comments to improve the metamodel are encouraged.

Dr Terry Halpin, BSc, DipEd, BA, MLitStud, PhD, is a Program Manager in Database Modeling for the Enterprise Frameworks and Tools Unit, Microsoft Corporation, Seattle WA, USA. During a lengthy career as an academic in computer science, he also worked in industry on database modeling technology and as a data modeling consultant. His recent positions include head of database research at Asymetrix Corporation, and research director of InfoModelers Inc., which was acquired by Visio Corporation. For several years, his research has focused on conceptual modeling and conceptual query technology for information systems, using a business rules approach. Dr Halpin has presented papers and tutorials at many international conferences. His doctoral thesis provided the first full formalization of Object-Role Modeling (ORM/NIAM), and his publications include over ninety technical papers, as well as four books, including Information Modeling and Relational Databases (Morgan Kaufmann, 2001).

Contact Information:

Dr Terry Halpin                
Program Manager, Database Modeling   
Enterprise Framework & Tools Unit, Microsoft Corporation                
One Microsoft Way
Redmond WA 98052-6399 (USA)
terryha@microsoft.com
(425) 705 9190
fax: (425) 936 7329
http://www.orm.net

© Copyright, 1998-2004 InConcept (Information Conceptual Modeling, Inc.) All Rights Reserved.
ISSN: 1533-3825