April 1998
Issue: 1
Journal of Conceptual Modeling
www.inconcept.com/jcm
Common Model Fragments: People and Organizations
by Scot A. Becker
Introduction
Welcome to the first installment of Common Model Fragments!
A good portion of data modeling can be pretty routine. Often, organizations tend to track the same things: people, organizations, hierarchies, contacts, products, invoices, etc. Now I won't presume to make the best model for all situations. (Heh actually, I won't presume to even have them totally correct!) I will try to address some generic issues, however, and provide you with a starting point.
These model fragments have been approached a number of times by various sources, often using Entity-Relationship (ER) modeling. The main goal behind this column is to look at various common model fragments in Object-Role Modeling (ORM) (since we all know ORM is vastly superior to ER, right? <s>). The other goal here is to teach by example (or, as I prefer to call it, a baptism-by-fire approach). Folks familiar with ER but not ORM should be able to follow along pretty easily, and maybe pick up some ORM concepts along the way. Likewise, since ORM notation is fairly easy to understand, novices to data modeling should learn a lot as well. Hopefully, after a few months of this column, we should have a pretty good library of model fragments going! If you have any suggestions for other common model fragments, please let me know.
Getting Started
OK, let's get started. Our first model fragment is people and organizations. A couple of notes are needed here before we begin, as there are certain common aspects that are ignored by this model. The first is hierarchies. While I will model business entities and people, I will not represent them in a who-reports-to-whom hierarchy - yet. I think we (InConcept) have a pretty good way of doing this. However, that is really an article of its own, so stay tuned. I'm also going to ignore some common attributes of this model such as international phone/address issues, etc. I'm trying to show a general structure here; adding extra columns should be easy. Plus -- I've always wanted to write this -- I want to leave that as a reader exercise!
Let's start with the most basic question: Who (or what) are we tracking? If you said "People and Organizations", you are sarcastic, but nonetheless correct. Obviously, we will be tracking information about people such as name, gender, etc. and information about business entities such as name and tax ID. However, that can be done quite effectively by a spreadsheet. Why do we need the database? We need the database because we want to do more - we want to track relationships.
Who, or What, are You?
Let's think about the basic objects we'll need such as person, organization and employee. You'll note that I included both person and employee in that list. Now, in addition to my suggestion that employees are not really people, I've also not-so-cleverly implied the use of subtypes. Let's make one major supertype, Party, and subtype from there. Now, to cheat a bit, we assume we have no legacy data concerns and say that we use the same unique identifier (Party ID) for all subtypes. Not very realistic, I know, but it suits our purposes here. (Besides, I deal with the headaches of numerous identifiers for the same thing on a daily basis; let me write this in a utopian dream world. Humor me.)
The subtypes of Party are Organization and Person. We could further subtype Organization into Internal and External Organizations, but I'll ignore that here. And, contrary to my previous Dilbert-esque statement, I'll subtype Person into Employee and Non-Employee.
Now, let's start attaching functional roles to the Party object. We'll say that Party has Party Type, which is an identifier for additional designations we want to store such as subsidiary, shareholder, department, government owned institute, etc. We'll also add in some temporal information to track past and present parties. We do this by including the ' has begin Date' and ' has end Date' roles. Note that the begin date is mandatory while the end date is not, or, in other words, that the lack of an end date implies a 'current' party. Since this sort of information comes up a lot in this model, I will use the term 'temporal information' to mean these two roles. The one last functional role we will attach is one that is absolutely mandatory for subtypes: the role we use to distinguish which subtype to use. In this case, we add the functional role " has Party Class". The Party Class object is used to designate subtypes and has a fixed range of values: "Organization", "Employee", and "Non-Employee". I'm cheating a bit here, because we really should include a definition as to what is a Person subtype. I'm implying the supertype Person by the designation of its subtype, Employee or Non-Employee.
Now, we add some functional roles to the subtypes. We say that every Organization must have an Organization Name, and that an Organization can have a Federal Tax ID. We then add the usual roles to the Person object, in this case: first name, last name, middle initial, and suffix (such as Junior, Senior, Esquire, The Great, etc.). Then, we add some functional roles to the Person subtypes in the form of " has SSN", " was born on Date", and " has Gender". We then add the functional role of "Non-employee has Prefix" (such as Mr., Mrs., etc.).
That's all of the information we'll store about parties for now. Obviously, it is not complete, but we have the base built. Please see Figure One for the ORM schema fragment. If you are unfamiliar with ORM notation, please see "A Quick Explanation of the ORM Symbols Used", below.

Figure One -- Parties
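For readers who think better in code, the Party fragment can be approximated by a minimal Python sketch. This is an illustration under stated assumptions, not a mapping produced from the schema: the class and field names are invented here, Party Type is assumed to be optional (the text does not say), and "Acme" is made-up sample data. It shows the shared Party ID, the optional end date that implies a 'current' party, and the Party Class value that decides the subtype.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

# Fixed value list for Party Class, taken from the text.
PARTY_CLASSES = {"Organization", "Employee", "Non-Employee"}


@dataclass
class Party:
    party_id: int                     # shared identifier for every subtype
    party_class: str                  # decides which subtype applies
    begin_date: date                  # mandatory
    end_date: Optional[date] = None   # optional; None means "current"
    party_type: Optional[str] = None  # e.g. "subsidiary"; assumed optional

    def __post_init__(self) -> None:
        if self.party_class not in PARTY_CLASSES:
            raise ValueError(f"unknown Party Class: {self.party_class!r}")

    def is_current(self) -> bool:
        return self.end_date is None

    def is_person(self) -> bool:
        # Person is implied by its two subtypes, as in the text.
        return self.party_class in {"Employee", "Non-Employee"}


@dataclass
class Organization(Party):
    organization_name: str = ""            # mandatory in the model
    federal_tax_id: Optional[str] = None   # optional in the model

    def __post_init__(self) -> None:
        super().__post_init__()
        if self.party_class != "Organization":
            raise ValueError("an Organization needs Party Class 'Organization'")
        if not self.organization_name:
            raise ValueError("Organization Name is mandatory")


# Made-up sample row.
acme = Organization(party_id=2, party_class="Organization",
                    begin_date=date(1998, 4, 1), organization_name="Acme")
```

The Person subtypes would follow the same pattern, each checking its own Party Class value before accepting its extra roles.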
Coping With Relationships
As I said, we want to track relationships between parties. Also, in many cases, any given two parties can have more than one relationship between them. For example, Bob may sell Amway products to Suzy as a side business (Vendor/Customer relationship), but during the day, Suzy actually works for Bob (Employee/Employer relationship) - this, by the way, should explain why Suzy is buying Amway products from Bob, but I digress. So, in order to model this, we start with the basic fact "Party has a relationship with Party of type Relationship Description". The relationship description part of this fact takes care of the instances when two people have more than one relationship that we want to track. This can be done other ways, but I took this approach to simplify the final schema. (I really jumped ahead of a beginner's ORM model. If you are interested in the transformations, Dr. Terry Halpin does a great chapter on schema transformations in his book, referenced in "A Quick Explanation of the ORM Symbols Used".) Now, we want to primarily talk about the relationship, so let's nest that predicate into an object called, appropriately enough, "Relationship". We then attach the usual temporal information to the Relationship (nested) object. Now, we add the functional roles " has Relationship Status" and " has Priority" to the Relationship object. Possible Relationship Statuses include "active", "inactive", "pursuing", "sucking up", etc., while the Priority object indicates the importance of the relationship to the enterprise (as opposed to, for instance, the importance of the Vendor relationship between Bob and Suzy to Suzy's career). You'll note that I haven't modeled anything about contacts between Parties. Don't get ahead of me; I'll do that in a bit. I've shown the ORM schema fragment for Relationships in Figure Two.

Figure Two -- Relationships
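One way to picture the nested ternary is as a table keyed on all three roles at once. The Python sketch below assumes exactly that: the key is (party, other party, relationship description), so the same two parties can appear under several descriptions. The names `RelationshipStore`, `add`, and `between` are hypothetical, the integer priority and the optional status are assumptions, and the party numbers for Bob and Suzy are made up.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, List, Optional, Tuple

# (party id, other party id, relationship description) identifies a Relationship.
RelationshipKey = Tuple[int, int, str]


@dataclass
class Relationship:
    begin_date: date
    end_date: Optional[date] = None   # None means the relationship is current
    status: Optional[str] = None      # e.g. "active", "inactive", "pursuing"
    priority: Optional[int] = None    # importance to the enterprise (assumed int)


class RelationshipStore:
    def __init__(self) -> None:
        self._rows: Dict[RelationshipKey, Relationship] = {}

    def add(self, key: RelationshipKey, rel: Relationship) -> None:
        # The key is the whole ternary fact, so exact duplicates are rejected.
        if key in self._rows:
            raise ValueError(f"duplicate relationship: {key}")
        self._rows[key] = rel

    def between(self, a: int, b: int) -> List[str]:
        # Every relationship description recorded from party a to party b.
        return sorted(desc for (x, y, desc) in self._rows if (x, y) == (a, b))


BOB, SUZY = 1, 2  # made-up Party IDs
store = RelationshipStore()
store.add((BOB, SUZY, "Vendor/Customer"), Relationship(date(1998, 1, 1), status="active"))
store.add((BOB, SUZY, "Employer/Employee"), Relationship(date(1997, 6, 1), status="active"))
```

Here `store.between(BOB, SUZY)` lists both descriptions, which is the situation the Relationship Description role exists to handle.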
Getting There From Here
While you digest the concept of a nested ternary, let's do something a bit more common (and easier) to modelers across the globe: modeling locations. So, we start with the fact "Party has Location". Location is really just a descriptor such as "home", "branch office", "life-force draining cubicle", etc. We can also use this to note addresses for specific things within an organization such as "headquarters", "billing address", "warehouse", etc. Because we seem to have fun with this, let's nest that role into the creatively named object "PartyLocation" and add our usual temporal information to the nested object. Now, let's add the functional role "PartyLocation has Address". Attached to the Address object are the familiar roles for address lines, city, state, and postal code. I then added an external uniqueness constraint across all of those fields, as we wouldn't want any duplicates. The ORM schema fragment for addresses is shown in Figure Three.

Figure Three -- Addresses
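The external uniqueness constraint on Address says that the combination of address lines, city, state, and postal code identifies an address. A rough Python sketch of that idea follows; the `AddressBook` class and its `id_for` method are hypothetical names, and the sample address is invented.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class Address:
    address_lines: Tuple[str, ...]
    city: str
    state: str
    postal_code: str


class AddressBook:
    """Hands out one address id per distinct combination of fields."""

    def __init__(self) -> None:
        self._ids: Dict[Address, int] = {}

    def id_for(self, address: Address) -> int:
        # A frozen dataclass hashes on all of its fields, so the dict key is
        # the whole combination, mirroring the external uniqueness constraint.
        if address not in self._ids:
            self._ids[address] = len(self._ids) + 1
        return self._ids[address]


book = AddressBook()
first = book.id_for(Address(("1 Main St",), "Springfield", "IL", "62701"))
again = book.id_for(Address(("1 Main St",), "Springfield", "IL", "62701"))
other = book.id_for(Address(("1 Main St",), "Springfield", "IL", "62702"))
```

Entering the same combination twice yields the same id, while changing any single field produces a new address.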
Call Me Sometime
Addresses were too easy. Let's add some more symbols to make this appear difficult. We know who people and things are and we know where they are. What's next, you may ask? How about: 'How do we pester them?' Let's begin this discussion with a more abstract object. I'll denote a supertype named "Contact Means" and attach the functional roles " has Means Description" and " has Means Type" to it. The Means Description object is for things like "home phone", "work phone", "work e-mail", "fax", "pager", "cellular", etc. The Means Type object is really used to classify Means Description into one of two possible values, "Telecom" or "E-Mail". This classification is then used to define the subtypes of Contact Means, namely, "Telecom" and "E-Mail". Then, to enforce that every Means Description has a Means Type, we make sure both roles are mandatory.
This may seem a bit strange to you, but I have a "reason for my madness". One could model this a bit differently: I could have left out the role " has Means Type" and defined the subtypes of Contact Means solely by their Means Description. After all, on the surface, classifying a Means Description into a Means Type seems a bit redundant, right? However, doing this adds a fatal flaw: it is only stable as long as the possible values of Means Description are stable. If we add a new technology such as the Shoe Phone used by Secret Agent Maxwell Smart, we would have to not only add "Shoe Phone" to our list of Means Descriptions, but we would also have to change our subtype definition to include "Shoe Phone" as a Telecom entity. Implementing it the first way, we only add "Shoe Phone" to the list of Means Descriptions (in the resulting table) and then make sure to note that its Means Type is "Telecom".
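The stability argument above can be sketched in a few lines of Python. The lookup below is an assumption about how the resulting table might look; the type assigned to each description is a reading of the examples in the text, and `subtype_of` is a hypothetical helper.

```python
# Lookup table: Means Description -> Means Type. The descriptions come from
# the text; the type assigned to each one is an interpretation.
MEANS_TYPE = {
    "home phone": "Telecom",
    "work phone": "Telecom",
    "fax": "Telecom",
    "pager": "Telecom",
    "cellular": "Telecom",
    "work e-mail": "E-Mail",
}


def subtype_of(means_description: str) -> str:
    # The subtype rule looks only at Means Type, never at the description
    # itself, so it does not change when new descriptions appear.
    return MEANS_TYPE[means_description]


# New technology arrives: one new row of data, no change to subtype_of.
MEANS_TYPE["Shoe Phone"] = "Telecom"
```

Had the subtypes been defined directly on Means Description, adding the Shoe Phone would have meant editing the rule inside `subtype_of` rather than adding a row.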
OK, now that we've gotten past that, let's add some functional roles to the subtypes. In particular, I will add the facts: "Telecom has Phone Number" and "E-Mail has E-Mail Address".
I now want to add another abstract object: "Contact Mechanism". We give Contact Mechanism its own ID, and say that a Contact Mechanism " is for Party" or " is for PartyLocation". You remember PartyLocation, don't you? It was that nested object type we used for defining addresses. Again, this helps us have more than one phone number or e-mail address for the same organization at the same location such as "technical support", "sales", etc. Such descriptions are stored in the "Contact Mechanism has Contact Mechanism Role" fact. Also note that we had an "Or situation": a Contact Mechanism is for a Party or a PartyLocation, not both. We denote this with an exclusionary constraint, which keeps us from populating both roles with the same Contact Mechanism, and a mandatory disjunction, which means at least one of the two roles has to be populated (this is signified by the same mandatory dot covering both roles). The combination of these two constraints means that exactly one of these roles must be populated for each Contact Mechanism. Then, we just have to connect the Contact Means object with the Contact Mechanism object via the " has ." role.
As a side note, you'll note I nested that last role and added our usual temporal information to it. This is a common practice when one wants to track the history of data in a database. In this case, we are being a bit silly in tracking the changes of a phone number for someone's tech support line, but it illustrates a good example.
The ORM schema fragment for Contact Methods is shown in Figure Four.

Figure Four -- Contact Methods
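The "exactly one of these roles" rule for Contact Mechanism is easy to state as a check. The sketch below is only an illustration: the field names are invented, the tuple standing in for PartyLocation is a hypothetical simplification of the nested object, and Contact Mechanism Role is assumed to be optional.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Hypothetical stand-in for the nested PartyLocation: (party id, location).
PartyLocationKey = Tuple[int, str]


@dataclass
class ContactMechanism:
    mechanism_id: int
    party_id: Optional[int] = None                      # "is for Party"
    party_location: Optional[PartyLocationKey] = None   # "is for PartyLocation"
    role: Optional[str] = None                          # e.g. "technical support"

    def __post_init__(self) -> None:
        filled = (self.party_id is not None) + (self.party_location is not None)
        if filled == 0:
            # Mandatory disjunction: at least one of the two roles is played.
            raise ValueError("a Contact Mechanism must be for a Party or a PartyLocation")
        if filled == 2:
            # Exclusion constraint: the two roles may not both be played.
            raise ValueError("a Contact Mechanism cannot be for both")


support_line = ContactMechanism(1, party_location=(7, "headquarters"),
                                role="technical support")
```

Each constraint on its own allows too much; only the two together give "exactly one".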
Have My People Contact Your People
We have one schema fragment left to model, which is the actual tracking of contacts between parties. Let's start with the fact: "Party contacts Party as Contact Relationship". This does sound redundant with our earlier definition of a relationship, doesn't it? The difference, albeit subtle, is necessary. In our definition of a relationship, we noted how a party relates to a party. A contact relationship denotes more of why a Party contacted another rather than the relationships between parties. Further, while Bob and Suzy have a vendor/customer relationship, someone else (on Bob's behalf) may contact Suzy about the particulars of that relationship. (For example, Bob's assistant could call Suzy and remind her that her annual performance review is coming up and oh, by the way, isn't she almost out of Amway brand laundry detergent?) Similar to our definition of relationships, we allow two parties to contact each other under different contact relationships as well. (For example, Suzy calls Bob (the salesman) to buy more detergent in one instance, and later calls Bob (the boss) again to get permission on a company purchase order.) Since nesting seems to be the thing to do today, we'll nest this ternary fact into an object named "Contact".
We now attach a role between the nested object type, Contact, and the specifics of the contact, conceptualized as a Contact Note (with an internal ID). Please note the uniqueness constraint over the "Contact has Contact Note" fact; there may be more than one instance of Contact Note for each Contact (e.g. more than one phone call between Suzy and Bob's assistant calling on Bob's behalf, because perhaps Suzy's check bounced). We now add some functional roles to the Contact Note object to track the status of a contact (such as "pending", "in progress", "completed", "scheduled", etc.), the date (and time, if you'd like) of the contact, specific notes about the contact (Note Text) and the Contact Type (such as "support call", "follow up", "extortion", etc.). The ORM schema fragment for Contacts is shown in Figure Five.

Figure Five -- Contacts
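As a rough Python sketch of the Contacts fragment: a Contact is keyed on the ternary fact, and each Contact can collect any number of Contact Notes. The names `ContactLog`, `record`, and `notes_for` are hypothetical, the party numbers are made up, and every Contact Note role except Note Text is assumed to be mandatory.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date
from typing import DefaultDict, List, Tuple

# (contacting party id, contacted party id, contact relationship)
ContactKey = Tuple[int, int, str]


@dataclass
class ContactNote:
    note_id: int          # the internal ID mentioned in the text
    status: str           # e.g. "pending", "in progress", "completed"
    contact_date: date
    contact_type: str     # e.g. "support call", "follow up"
    note_text: str = ""


class ContactLog:
    def __init__(self) -> None:
        self._notes: DefaultDict[ContactKey, List[ContactNote]] = defaultdict(list)

    def record(self, key: ContactKey, note: ContactNote) -> None:
        # One Contact may accumulate many Contact Notes.
        self._notes[key].append(note)

    def notes_for(self, key: ContactKey) -> List[ContactNote]:
        return list(self._notes.get(key, []))


ASSISTANT, SUZY = 3, 2  # made-up Party IDs
log = ContactLog()
call = (ASSISTANT, SUZY, "on behalf of vendor")
log.record(call, ContactNote(1, "completed", date(1998, 4, 2), "follow up"))
log.record(call, ContactNote(2, "pending", date(1998, 4, 9), "follow up",
                             "check bounced"))
```

Both notes hang off the same Contact, so the history of calls between the assistant and Suzy stays together.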
Other Issues, Disclaimers, and Excuses
It is important to note that the conceptual schema I provided above (and, for that matter, the schemas to come in later issues) is not quite ready for "production". I wanted to focus on the conceptual side of things rather than dwell on common implementation issues.
One such issue is the use of generated primary keys. Conceptual modeling ignores the use of generated primary keys (such as MS SQL Server's Identity Column) unless you remember to include some sort of Identifier in the schema. For the sake of brevity and ease of discussion, I eliminated many such identifiers, but they should not be ignored in actual use. Common places in the model where such identifiers would be included are in the use of look up tables for descriptions, types, and status codes and in places where compound uniqueness constraints determine the primary key of the actual table such as the PartyLocation nested object type.
Another common issue is the addition of more temporal information. By temporal data, I am referring to the common use of marking entries with start and end dates. I have included this in some places (Party, PartyLocation, etc.), but only as a means to conceptually reinforce the fact that some entries that may appear to be duplicates are, in fact, not. For example, a PartyLocation may change several times during its life in the system. Without start and end dates, the system would be cumbersome and appear to contain redundant and/or confusing data.
Another issue that will probably need to be addressed is how "closed" you want the system to be. (When I use the term "closure", I am pretty much stealing Dr. Halpin's definition.) In many cases, you will probably want to relax many of the mandatory roles I have indicated in the sample schema. Obviously (and unfortunately, in many cases) you don't always have all of the information you want to store. Thus, you will have to relax your closure a bit. In some cases, you will want to enforce other roles as being mandatory (for example, maybe the Federal Tax ID is mandatory information to store about all Organizations in your system).
A more nitpicky point is my occasional use of value ranges and possible value lists. In some cases, they are complete (for example, unless you plan on tracking hermaphrodites, your possible system values for gender are complete with "M" and "F"). In others, they may not be (for example, maybe you want to also include "The Great", "Prodigy", or "COBOL Warrior" among the possible suffixes for a person's name).
Now, many people may chastise me (and rightly so, I may add) for not using example data sets in my predicates. Doing this often catches many conceptual errors. Frankly, I ran out of time this issue since I am also creating the newsletter as a whole. Due to this, it is quite possible I screwed something up. For illustration's sake, I almost hope I did. I will be a good example of why you should always include sample data. Any error(s) I (or others) may find will be addressed next month before I start the next model fragment. I apologize in advance.
And lastly, you will also want to add much more information to track in your system. The model as shown here, while conceptually interesting, is not practically interesting without more useful information being stored in the system.
Conclusion
I hope this has been an interesting example for you to see. If you have any feedback, corrections, or suggestions, feel free to mention them in the forum or via e-mail. Also, if you have any ideas for future sample model fragments, let me know. Next month, I think I am going to tackle products.
A Quick Explanation of the ORM Symbols Used
For those of you not entirely familiar with ORM, I'll include a brief legend below. This is meant only as a very quick explanation of what the various symbols mean; it is not a good way to learn ORM.
Objects are shown as ellipses. The object name is denoted inside of the ellipse. Sometimes, an additional name will be included in parentheses below the object name. This is what we call the reference mode, and is used to identify an object. For example, the Party object has a reference mode of Party ID. Reference modes may only be shown on entity objects (denoted with a solid ellipse). Entity objects are conceptual things like 'Party' or 'Person'. The reference mode of an entity object is what is used to identify instances of that conceptual thing. Value objects (shown as a dashed ellipse) are merely values (instances of data) such as a string, a date, or a number.
Predicates, or the connections between objects, are shown as boxes connected to the objects with their predicate text (such as ' has ') denoted (usually) below the boxes. The dots shown on the connection between an object and a predicate mean that that role is mandatory (for example, every Party must have a start date). The double tipped arrows (or bars) above the predicate boxes show ("internal") uniqueness. In particular, if we had sample data shown in the predicate boxes, the bar would cover the unique column (or combination of columns). For example, for the role "Party has Party Class", we could include the sample data pairs for (Party ID, Party Class): (1, "Employee"), (2, "Organization"), (3, "Organization"), and (4, "Non-Employee"). You'll note that Party ID is unique while Party Class is not. Thus, we say that "every Party has exactly one Party Class". You'll note that the role "Organization has Federal Tax ID" has two bars. This means that both the Organization (Party ID) and the Federal Tax ID (a number) must be unique. We do the same for the "Employee has SSN" role. Later on, you'll see roles where one bar spans more than one column. This means that the combination of columns must be unique (for example, each Contact may have more than one Contact Note). Note that while most predicates are binary (involve two objects), some are ternary (involve three objects) or higher.
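The sample population for "Party has Party Class" can be checked mechanically. The short Python sketch below uses the four pairs from the text; `is_unique` is a hypothetical helper that reports whether a uniqueness bar could sit over a given column.

```python
from collections import Counter

# Sample population for "Party has Party Class", copied from the text.
rows = [(1, "Employee"), (2, "Organization"), (3, "Organization"), (4, "Non-Employee")]


def is_unique(rows, column):
    # True when no value repeats in the given column of the population.
    counts = Counter(row[column] for row in rows)
    return all(n == 1 for n in counts.values())


party_id_unique = is_unique(rows, 0)     # the bar belongs over this column
party_class_unique = is_unique(rows, 1)  # "Organization" repeats, so no bar here
```

A sample population can only disprove a uniqueness constraint; it shows where a bar cannot go, not that a bar is required.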
Once in a while, you will see a "P" denoted on the uniqueness bar. This is an InfoModeler notation used when mapping the tables and primary keys. I've ignored those aspects for now.
Sometimes, you'll see a border drawn around a predicate box. This is known as nesting. What we are really doing is treating the relationship between two objects as an object in itself. The name of this nested object is noted near the nested predicate. For example, we nest the relationship "Party has a relationship with Party of type Relationship Description" so that we can attach functional roles to that "Relationship". This is also a common technique when storing temporal information.
You'll occasionally see a caret shape in the top center of an object; this only means the object has appeared elsewhere in the model. For example, the Party and Date objects are used frequently throughout this model.
Values shown in curly brackets denote a range of acceptable values such as "M" or "F" for Gender.
Bold arrows pointing between objects denote supertypes and subtypes. The arrow always points to the supertype. Subtypes must have a definition based on another fact attached to the supertype. The actual definition is not shown on the diagram, but you will usually see an object with possible value ranges that match the names of the subtypes. For example, to define the Party subtypes Organization, Non-Employee, and Employee, we use the fact type "Party has Party Class" with its possible values of "Organization", "Employee", or "Non-Employee".
"External" constraints are shown as circles attached via lines to more than one predicate. For example, when modeling an address, we will attach an external uniqueness constraint (denoted by a "U") to the address line, postal code, state, and city roles. In another case, we will use a similar notation to denote an "exclusionary" constraint (denoted this time by a circled "X"). This means the object may play only one of the roles, not more than one of them.
Please bear in mind, the preceding was a very quick explanation of the ORM symbols used, but not a complete reference. Please see Dr. Terry Halpin's definitive book on the subject titled, "Conceptual Schema & Relational Database Design, Second Edition", for further information. I'll explain any new symbols as I use them later on.
Scot A. Becker is a software consultant and the founder of Orthogonal Software Corporation. He is also a certified ORM consultant and trainer, a certified Visio trainer, and former Editor of the Journal of Conceptual Modeling.
Contact Information:
Scot A. Becker
Orthogonal Software Corporation
scot@orthogonalsoftware.com
www.orthogonalsoftware.com
© Copyright, 1998-2004 InConcept (Information Conceptual Modeling, Inc.) All Rights Reserved.
ISSN: 1533-3825