June 1999                   Issue: 9

Journal of Conceptual Modeling
www.inconcept.com/jcm

Everything You Ever Wanted to Know About "city"
By Dr. John K. Sharp

Several JCM discussions have raised issues about the precision and clarity of modeling with ORM/NIAM/NLM. Most of these have centered on the need to help others understand the value of more precise modeling. This is a continuing challenge for most of us. Information modeling has been treated as a way to describe high-level or general requirements. Any approach that requires, or even allows for, a more detailed and precise definition of requirements is suspect because of the low value generally associated with design, in contrast to the high value associated with coding.

In engineering disciplines, detailed and precise requirements are required from the start. Information models, by contrast, routinely have a high-level design that is widely discussed and "reviewed." (I use this word in quotes because most of the reviews that I have seen are no more than a popularity contest for the principal model, whether or not the model contents accurately reflect the subject area being modeled.) The "details" are left for the coding of the system. In many applications these details are based on the experience or perception of the coder, or on sporadic contacts with the analyst and/or subject matter experts. Eventually the "details" are needed for the application to work, but information technologies have not been successful enough at turning requirements into parts.

One issue that has been discussed in our JCM e-mails is the ability to handle the concept of City precisely. This article discusses all of the possible issues in managing information about "City." To start the analysis, two instance-based sentences have been created.

Person identified by 231-45-4312 is assigned Springfield, MO.
Person identified by 423-11-2786 is assigned Boston, MA.

Three placeholders have been located. The placeholders are now qualified.

The elements (231-45-4312 and 423-11-2786), (Springfield and Boston), and (MO and MA) refer to members of which objects/classes?

Person for 231-45-4312 and 423-11-2786
City for Springfield and Boston
State for MO and MA

The element strings (231-45-4312 and 423-11-2786), (Springfield and Boston), and (MO and MA) go by which labels?

Social Security Number for 231-45-4312 and 423-11-2786
City Name for Springfield and Boston
State Abbreviation for MO and MA

What would you like to call the placeholders for the positions where (231-45-4312 and 423-11-2786), (Springfield and Boston), and (MO and MA) appear in the sentence?

SSN for 231-45-4312 and 423-11-2786
CityName for Springfield and Boston
StateAbb for MO and MA

The classes referred to by the three instance sets are now checked to see that they are properly identified. Identification fact types (IDFT) for each class/object are established.

Person Class:

Person with social security number 231-45-4312 exists.

Q1 Matrix

Person with social security number <SSN> exists.

231-45-4312
----------------    Allowed?
another             Yes

Q2

Does the social security number 231-45-4312 at any moment in time exactly identify a person?

Yes

These two "Yes" answers establish:

 IDFT 1: Social security number <SSN> identifies a person.

City Class:

City with city name Springfield exists.

Q1 Matrix

City with city name <CityName> exists.

Springfield
--------------    Allowed?
another           Yes

Q2

Does the city name Springfield at any moment in time exactly identify a city?

No

Q3

Does an identifier of City contain the placeholder under investigation and at least one other placeholder?

Yes

What is the object of this new placeholder?

State

City with city name Springfield exists in state with the state abbreviation MO.

City with city name <CityName> exists in state with the state abbreviation <StateAbb>.

Springfield      MO
-------------    ---------    Allowed?
another          MO           Yes
Springfield      another      Yes

Q2

Does the city name Springfield and state abbreviation MO at any moment in time exactly identify a city?

Yes

IDFT 2: City with city name <CityName> in state with the state abbreviation <StateAbb> identifies a city.

There is also a need to assign domain constraints (e.g., text, small integer, date) to each placeholder. For the City Name placeholder, a self-referencing identification fact type should be created automatically when IDFT 2 is created.

IDFT 3: The city name <CityName> identifies a city name.

The sentence is self-referencing because the object and the label are identical.
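The Q2 question for City can be illustrated with a small check over a sample population. This is only a sketch under stated assumptions: the `city_id` values are hypothetical stand-ins for the real-world cities, the Springfield, IL row is added purely for illustration, and `identifies` is a name invented here. A sample can refute a proposed identifier, but it cannot prove that the identifier holds "at any moment in time"; that answer still has to come from the subject matter expert.

```python
from collections import defaultdict

# Hypothetical sample population; city_id stands in for the real-world city.
CITIES = [
    {"city_id": 1, "CityName": "Springfield", "StateAbb": "MO"},
    {"city_id": 2, "CityName": "Springfield", "StateAbb": "IL"},  # illustrative row
    {"city_id": 3, "CityName": "Boston", "StateAbb": "MA"},
]


def identifies(rows, placeholders):
    """Return True if every combination of placeholder values maps to one city."""
    seen = defaultdict(set)
    for row in rows:
        key = tuple(row[p] for p in placeholders)
        seen[key].add(row["city_id"])
    return all(len(ids) == 1 for ids in seen.values())


# City name alone is refuted by the sample; city name plus state is not.
print(identifies(CITIES, ["CityName"]))
print(identifies(CITIES, ["CityName", "StateAbb"]))
```

In this sample the first check fails and the second passes, which mirrors the "No" and "Yes" answers that lead to IDFT 2.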

State Class:

State with state abbreviation MO exists.

Q1 Matrix

State with state abbreviation <StateAbb> exists.

MO
---------    Allowed?
another      Yes

Q2

Does the state abbreviation MO at any moment in time exactly identify a state?

Yes

IDFT 4: State abbreviation <StateAbb> identifies a state.

These four identification fact types can participate in managing knowledge about the original sentence. All three objects in the original sentence have been identified. The complete sentence is now analyzed.

Q1 Matrix

Person identified by <SSN> is assigned <CityName>, <StateAbb>.

231-45-4312        Springfield       MO
---------------    --------------    ---------    Allowed?
another            Springfield       MO           ?
231-45-4312        another           MO           ?
231-45-4312        Springfield       another      ?

EFT 1: Person identified by <SSN> is assigned <CityName>, <StateAbb>.
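The rows of a Q1 Matrix follow a mechanical pattern: each row keeps the example values and replaces exactly one of them with "another." A minimal sketch of that construction follows; the function name `q1_matrix` and the dictionary layout are assumptions made for illustration, not part of the NLM procedure itself.

```python
def q1_matrix(placeholders, example):
    """Build one row per placeholder, varying only that placeholder's value."""
    rows = []
    for varied in placeholders:
        row = {
            p: ("another" if p == varied else example[p])
            for p in placeholders
        }
        rows.append(row)
    return rows


placeholders = ["SSN", "CityName", "StateAbb"]
example = {"SSN": "231-45-4312", "CityName": "Springfield", "StateAbb": "MO"}

# Each printed row is a question for the subject matter expert: is it allowed?
for row in q1_matrix(placeholders, example):
    print("  ".join(f"{row[p]:<12}" for p in placeholders), "Allowed?")
```

The code only generates the questions. The "Yes" or "No" in the Allowed? column remains an answer that the analyst obtains from the subject area.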

If the concept of "City" is nominalized, then there are four potential outcomes for the answer vector in the Q1 Matrix. Figure 1 presents the answer vector and the associated relational table. In the following figures the answers to the Q1 Matrix are presented to the left of the corresponding relational model.


Figure 1: Relational tables for the Q1 answer vector when "City" is nominalized.

Mathematically there are four more combinations of "Yes" and "No" answers when three placeholders are analyzed. Two more elementary fact types are established when these combinations are evaluated:

EFT 2: Person identified by <SSN> is assigned to <StateAbb>.

EFT 3: Person identified by <SSN> is assigned to the <CityName> city name.
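The count of answer patterns can be checked by enumeration. With two placeholders (the nominalized case) there are 2^2 = 4 answer vectors, and with three placeholders there are 2^3 = 8, which accounts for the "four more." The sketch below only lists the patterns; it makes no claim about which relational design corresponds to which pattern, and the name `answer_vectors` is hypothetical.

```python
from itertools import product

PLACEHOLDERS = ["SSN", "CityName", "StateAbb"]


def answer_vectors(n):
    """Return every Yes/No answer pattern for a Q1 Matrix with n rows."""
    return list(product(["Yes", "No"], repeat=n))


nominalized = answer_vectors(2)              # Person and City only
expanded = answer_vectors(len(PLACEHOLDERS))  # SSN, CityName, StateAbb

print(len(nominalized), len(expanded), len(expanded) - len(nominalized))
for vector in expanded:
    print(dict(zip(PLACEHOLDERS, vector)))
```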

The state object in EFT 2 can be populated and managed independently of the city object in EFT 1. This allows independent processes to be defined for populating EFT 2 and EFT 1. One process would assign a person to a state, and the other would assign a person to a city in the previously assigned state. If the "City" object had been nominalized initially, this option for potential process improvement would not be available. Figure 2 presents the relational model that corresponds to the given Q1 Matrix answer vector.


Figure 2: Restrictions on state assignments that limit the assignments of cities.
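The two-step process described above (assign a state first, then a city within that state) can be sketched as a small in-memory store. This is an illustration under assumptions, not the relational model in Figure 2: the class name `Assignments`, the rule that each person has at most one assigned state, and the choice to raise `ValueError` are all hypothetical. The sketch also does not handle a state being changed after a city has been assigned.

```python
class Assignments:
    """Toy store for EFT 2 (person to state) and EFT 1 (person to city)."""

    def __init__(self):
        self.person_state = {}  # EFT 2: SSN -> StateAbb
        self.person_city = {}   # EFT 1: SSN -> (CityName, StateAbb)

    def assign_state(self, ssn, state_abb):
        self.person_state[ssn] = state_abb

    def assign_city(self, ssn, city_name, state_abb):
        # Assumed rule: the city must lie in the previously assigned state.
        assigned = self.person_state.get(ssn)
        if assigned is None:
            raise ValueError(f"No state assigned yet for {ssn}")
        if assigned != state_abb:
            raise ValueError(f"{ssn} is assigned to {assigned}, not {state_abb}")
        self.person_city[ssn] = (city_name, state_abb)


store = Assignments()
store.assign_state("231-45-4312", "MO")
store.assign_city("231-45-4312", "Springfield", "MO")   # accepted
try:
    store.assign_city("231-45-4312", "Boston", "MA")    # rejected: wrong state
except ValueError as err:
    print("Rejected:", err)
```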

In addition to restrictions on the assignment of states, another restriction could be proposed that limits the assignment of cities based on restrictions on the assignment of city names to a person. This is more of a leap from reality for modeling experts, but the interesting issue is the completeness of the logic that allows all "Yes" and "No" answer patterns to have real-world meaning. The pattern is important for analysts to understand, even if all cases are not commonly used for the three objects in the original sentence. Figure 3 presents the relational model that corresponds to the given Q1 Matrix answer vector.


Figure 3: Restrictions on city name assignments that limit the assignments of cities.

Rules for the population of the three EFTs may already exist in the subject area, or the analyst can point out that a new set of rules is available for populating these fact types.

EFT 1: Person identified by <SSN> is assigned <CityName>, <StateAbb>.

EFT 2: Person identified by <SSN> is assigned to <StateAbb>.

EFT 3: Person identified by <SSN> is assigned to the <CityName> city name.

Depending on the subject area rules, either EFT 2 or EFT 3 could be populated first and then EFT 1 could be populated. If EFT 1 is populated first, then the subject area rules could require that either EFT 2 or EFT 3 be populated at the same time. These three fact types also define the knowledge that can be reported from the database. Even if there are no population rules that deal with EFT 2 and EFT 3, this knowledge can be reported separately from EFT 1. It is very important that the reader of the report fully understand that EFT 3 only talks about a city name, not a particular city. The state information in EFT 2 may routinely be reported from the database without any problem because the state placeholder refers to a real-world object.
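The reporting caution about EFT 3 can be seen in a small count. The sketch below is illustrative only: the third row, including its SSN and the Springfield, IL assignment, is a hypothetical addition to the two example sentences. Grouping by city name alone merges two different cities, while grouping by city name and state keeps them apart.

```python
from collections import Counter

# Population of EFT 1 as (SSN, CityName, StateAbb) rows.
EFT1 = [
    ("231-45-4312", "Springfield", "MO"),
    ("423-11-2786", "Boston", "MA"),
    ("555-00-0001", "Springfield", "IL"),  # hypothetical row for illustration
]

by_city = Counter((city, state) for _, city, state in EFT1)  # real cities
by_city_name = Counter(city for _, city, _ in EFT1)          # EFT 3 view
by_state = Counter(state for _, _, state in EFT1)            # EFT 2 view

print("Per city:     ", dict(by_city))
print("Per city name:", dict(by_city_name))
print("Per state:    ", dict(by_state))
```

A report built on `by_city_name` shows two people for "Springfield" even though they are assigned to different cities, which is why the reader must know that the figure describes a name rather than a city.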

Summary

An experienced analyst should be able to understand the issues involved in all aspects of storing knowledge and effectively reporting that knowledge to others. This simple example shows how we can use our precise modeling skills to assist subject matter experts in developing more effective processes and in making sure that only meaningful reports are developed for users.

If you are interested in a more detailed understanding of the Natural Language Modeling procedure used in this example, you can download an NLM Example Problem from my web site: www.sharp-informatics.com. This example shows how the NLM procedure can be used to validate an existing model.

Happy Modeling!

Dr. John Sharp is the founder and principal consultant for Sharp Informatics. Before starting Sharp Informatics in 1997 he was employed by Sandia National Laboratories in Albuquerque, NM for 18 years. While at Sandia he held staff and management positions in all areas of information technology, including analysis, design, implementation, maintenance, information architecture, data administration, and information technology research. He has worked closely with Prof. Shir Nijssen of The Netherlands to improve the NIAM analysis methodology. Dr. Sharp is the creator of the first information analysis procedure known to be mathematically precise. This procedure reformulates the usual (imprecise and inaccurate) statements and examples from a subject area into verified fact types. The output of this productivity enhancing process (a set of information requirements) is compatible with all the latest and most productive database application creation tools. John is the editor of the international standard for conceptual schemas. He has co-chaired two international conferences on natural language modeling and he has presented numerous papers and seminars at professional conferences.

Contact information:

Dr. John Sharp
Sharp Informatics
1604 Vassar SE
Albuquerque, NM 87106
sharp@sharp-informatics.com
505-243-1498
fax 505-248-0345
http://www.sharp-informatics.com

© Copyright, 1998-2004 InConcept (Information Conceptual Modeling, Inc.) All Rights Reserved.
ISSN: 1533-3825