December 2001 Issue: 23
Journal of Conceptual Modeling
www.inconcept.com/jcm
Sharp Informatics Example Problem
by Dr. John K. Sharp
The following simplified example shows how Natural Language Modeling can be used to express three distinct kinds of knowledge that result in 1) database requirements, 2) derivation rules for knowledge presentation to humans or automated rule enforcement, and 3) instructions for humans to follow. Natural Language Modeling always starts with example(s) of the problem that is to be modeled. In this case the examples are taken from an instruction manual for security guards and a legacy security application. Subject matter expert(s) are asked to create a true sentence based on the example information. Results from the analysis are presented here. The NLM procedure will turn any true declarative sentence into instance(s) of valid fact type(s). For brevity the example is not wholly defined. This analysis technique scales linearly: knowledge is defined only once, so as the scope of a project expands, previously defined knowledge does not need to be reanalyzed.
This example problem is of a physical security area around a building. The security perimeter is monitored by motion- and thermal-detection sensors. The sensors function according to a fixed set of rules. The sensor knowledge is presented to guards according to another fixed set of rules. Guards respond to sensor activity according to a third set of rules. The problem is expressed below in a set of diagrams (drawings and screens) and excerpts from an existing application and a training manual. A detailed analysis of one fact type is presented, and only the summary information is presented for the other fact types.
Sensor Application Code (Lines 978 - 989)
//** A sensor light is red when movement is detected and a thermal signature
between 95 and 105 is detected at that sensor. A sensor light is yellow when
**//
Guard Training Manual (Page 21)
Guard instruction for responding to the security sensor code alerts:
For a sensor code color of green a guard is to review all sensors once every
thirty minutes via video monitors.
For a sensor code color of yellow a guard is to review action at the sensor via video
monitors until the reason the sensor turned on is established or the sensor code color
returns to green.
For a sensor code color of red a guard is to alert all guards on duty and review action at
the sensor via video monitors for at least 15 minutes or until the reason is established.
Guard Training Manual (Page 35)
The following lists provide the ordered appropriate action responses once a
reason for the sensor code alert has been established:
single person
1 determine if the perimeter fence has been violated
2 scan the intruder to detect the presence of weapons
3
.
animal
1 determine if the perimeter fence has been violated
2
.
The presentation of the analysis results here assists in the understanding of the precision available with the Natural Language Modeling procedure. The presentation is not structured to be an encompassing tutorial that explains all possible analysis results. Nor is it structured to show how valid fact types can be generated from any initial sentence.
Step 1: Verbalization and Highlighting
The analyst asks the subject matter expert to transform a portion of the previous data into sentences that the expert would use in explaining this information to a colleague. This first sentence comes from the Current Sensor State form. The number of data elements that appear in a particular sentence is not important, but there is a requirement that every data element appears in a sentence. This requirement is achieved by highlighting the individual data elements as they appear in sentence(s). This is the initial step used in NLM to establish and/or validate the knowledge rules behind any form, report, graphical information model or legacy application data structure.
The expert answers:
"On 7-16-97 at 16:55, active sensor 3867 detected movement and recorded a
thermal signature of 99 degrees F."
The analyst requests another sentence that provides a different data instance for each of the portions of the original sentence that can vary. The original sentence is a true statement about the subject area under investigation. This second sentence does not need to be a true statement within the subject area, because as yet unspecified business rules may restrict some or all of the instances in the original sentence from varying at the same time (e.g. the original sentence is not normalized), or it may simply be that no other true statement currently exists in which every element varies. The only portion of the sentence that is restricted from varying in this analysis is the verb.
The expert answers:
"On 7-18-97 at 7:55, inactive sensor 3473 detected no movement and
recorded a thermal signature of 62 degrees F."
Step 2: Placeholder Assignment
The analyst then restates the two sentences and requests that the expert validate the portions of the sentence that are to be assigned placeholders during analysis.
"On 7-16-97 at 16:55, active
sensor 3867 detected movement and recorded a thermal signature of 99
degrees F."
"On 7-18-97 at 7:55, inactive sensor 3473 detected no
movement and recorded a thermal signature of 62 degrees F."
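The comparison of the two sentences can be sketched in code. The following is only an illustration, not part of the NLM procedure: it assumes a plain word-by-word comparison using Python's standard difflib module, and the function names `tokens` and `varying_portions` are hypothetical. It proposes candidate portions for placeholders, which the expert would still have to validate.

```python
from difflib import SequenceMatcher


def tokens(sentence):
    """Split on whitespace and drop surrounding commas and periods."""
    return [word.strip(",.") for word in sentence.split()]


def varying_portions(first, second):
    """Return (first_value, second_value) pairs where the sentences differ."""
    a, b = tokens(first), tokens(second)
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    pairs = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue
        if tag == "replace" and (i2 - i1) == (j2 - j1):
            # Same number of words on both sides: pair them one to one.
            pairs.extend(zip(a[i1:i2], b[j1:j2]))
        else:
            # Unequal lengths: keep each side as a single phrase.
            pairs.append((" ".join(a[i1:i2]), " ".join(b[j1:j2])))
    return pairs


first = ("On 7-16-97 at 16:55, active sensor 3867 detected movement and "
         "recorded a thermal signature of 99 degrees F.")
second = ("On 7-18-97 at 7:55, inactive sensor 3473 detected no movement and "
          "recorded a thermal signature of 62 degrees F.")

for left, right in varying_portions(first, second):
    print(repr(left), "->", repr(right))
```

A word-level comparison cannot know that "no movement" is a single value, which is one reason the expert confirms the portions before placeholders are assigned.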
Step 3: Qualification
The analyst requests names for the class, label, and placeholder location for each varying element.
"Of which class are the elements referred to by 3867
and 3473 a member?"
The expert answers:
"Sensor."
The analyst asks:
"What is the name used (label) for an individual element of the
population, 3867, of the class sensor?"
The expert answers:
"Sensor Identifier."
The analyst asks:
"What would you like to name the placeholder for the position where 3867
and 3473 appear in this sentence?"
The expert answers:
"SensorId."
These three questions are repeated for each placeholder. As instances reappear in sentences, only the placeholder name for the corresponding position in the sentence needs to be requested. The analyst concludes this step by creating a candidate fact type for evaluation:
1: On <Date> at <Time>, <Status>
sensor <SensorId> detected <MovementReading> and recorded a thermal signature
of <Temperature> degrees F.
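A candidate fact type can be treated as a sentence template. The sketch below is a minimal illustration under that assumption; the helper names `placeholders` and `verbalize` are hypothetical and not part of NLM. It lists the placeholders in sentence order and rebuilds the expert's original sentence from one instance.

```python
import re

PLACEHOLDER = re.compile(r"<(\w+)>")

CANDIDATE = ("On <Date> at <Time>, <Status> sensor <SensorId> detected "
             "<MovementReading> and recorded a thermal signature of "
             "<Temperature> degrees F.")


def placeholders(fact_type):
    """List placeholder names in the order they appear in the sentence."""
    return PLACEHOLDER.findall(fact_type)


def verbalize(fact_type, values):
    """Substitute one instance value for every placeholder."""
    return PLACEHOLDER.sub(lambda match: str(values[match.group(1)]), fact_type)


instance = {
    "Date": "7-16-97", "Time": "16:55", "Status": "active",
    "SensorId": 3867, "MovementReading": "movement", "Temperature": 99,
}

print(placeholders(CANDIDATE))
print(verbalize(CANDIDATE, instance))
```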
Step 4: Fact Type Validation
The first step in the validation of a candidate fact type is to ask if it is ever allowed for a new sentence that differs by only a single variable placeholder to coexist with the original true sentence. Redundancy between the two sentences is ignored because the sentences are not normalized. The results from this step are presented in matrix form as:
1: On <Date> at <Time>, <Status>
sensor <SensorId> detected <MovementReading> and recorded a thermal signature
of <Temperature> degrees F.
7-16-97      | 16:55        | active       | 3867         | movement     | 99           |
------------ | ------------ | ------------ | ------------ | ------------ | ------------ | Allowed?
another      | 16:55        | active       | 3867         | movement     | 99           | Yes
7-16-97      | another      | active       | 3867         | movement     | 99           | Yes
7-16-97      | 16:55        | another      | 3867         | movement     | 99           | No
7-16-97      | 16:55        | active       | another      | movement     | 99           | Yes
7-16-97      | 16:55        | active       | 3867         | another      | 99           | No
7-16-97      | 16:55        | active       | 3867         | movement     | another      | No
The answers for individual rows in the matrix are generated using the following question. "Given that "On 7-16-97 at 16:55, active sensor 3867 detected movement and recorded a thermal signature of 99 degrees F." is true, is it possible for another valid Date [for example "7-17-97"] to exist such that the fact instance "On 7-17-97 at 16:55, active sensor 3867 detected movement and recorded a thermal signature of 99 degrees F." is true?"
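The matrix rows and the question asked for each row follow mechanically from the candidate fact type and the true instance. A minimal sketch, assuming the fact type is held as a template string; `fill`, `matrix_rows` and `question` are hypothetical helper names, and the example replacement value is supplied by the caller:

```python
import re

CANDIDATE = ("On <Date> at <Time>, <Status> sensor <SensorId> detected "
             "<MovementReading> and recorded a thermal signature of "
             "<Temperature> degrees F.")

TRUE_INSTANCE = {
    "Date": "7-16-97", "Time": "16:55", "Status": "active",
    "SensorId": "3867", "MovementReading": "movement", "Temperature": "99",
}


def fill(fact_type, values):
    """Replace every placeholder with its value."""
    return re.sub(r"<(\w+)>", lambda match: values[match.group(1)], fact_type)


def matrix_rows(fact_type, instance):
    """One row per placeholder, with only that position set to 'another'."""
    names = re.findall(r"<(\w+)>", fact_type)
    return [
        (varied, ["another" if name == varied else instance[name] for name in names])
        for varied in names
    ]


def question(fact_type, instance, varied, example):
    """Word the question put to the expert for one row of the matrix."""
    changed = dict(instance, **{varied: example})
    return (f'Given that "{fill(fact_type, instance)}" is true, is it possible '
            f'for another valid {varied} [for example "{example}"] to exist '
            f'such that the fact instance "{fill(fact_type, changed)}" is true?')


for varied, row in matrix_rows(CANDIDATE, TRUE_INSTANCE):
    print(varied, row)
print(question(CANDIDATE, TRUE_INSTANCE, "Date", "7-17-97"))
```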
The NLM analysis procedure uses the Yes/No answer vector to either generate new candidate fact type(s) or to validate that the current fact type is an elementary fact type (normalized fact type). This analysis procedure depends only on the Yes or No answers provided by the subject matter expert. Thus, the subject matter expert is fully accountable for the resulting information model. This example problem will not be fully analyzed using the NLM analysis procedure here. This shortcut is taken to focus on the results of the analysis for this problem. One of the elementary fact types contained in sentence 1 will be presented in the remaining analysis steps in order to provide an example for each step.
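The article does not spell out how the NLM procedure processes the answer vector, so the sketch below does not attempt to reproduce it. It only illustrates what a "No" answer means for data: if another value is not allowed in one position, then no two recorded sentences may agree everywhere else and differ in that position. The function name `conflicts` and the sample rows are assumptions made for illustration.

```python
from collections import defaultdict

COLUMNS = ["Date", "Time", "Status", "SensorId", "MovementReading", "Temperature"]

# The expert's answers for sentence 1: True means "another value is allowed".
ALLOWED = {
    "Date": True, "Time": True, "Status": False,
    "SensorId": True, "MovementReading": False, "Temperature": False,
}


def conflicts(rows, columns, allowed):
    """Find values that differ in a 'No' column while all other columns agree."""
    found = []
    for index, name in enumerate(columns):
        if allowed[name]:
            continue
        groups = defaultdict(set)
        for row in rows:
            rest = row[:index] + row[index + 1:]
            groups[rest].add(row[index])
        for values in groups.values():
            if len(values) > 1:
                found.append((name, sorted(values)))
    return found


rows = [
    ("7-16-97", "16:55", "active", "3867", "movement", "99"),
    ("7-16-97", "16:55", "active", "3413", "movement", "99"),    # other sensor: allowed
    ("7-16-97", "16:55", "inactive", "3867", "movement", "99"),  # second status: not allowed
]

print(conflicts(rows, COLUMNS, ALLOWED))
```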
Step 5: Pattern Recognition
One of the elementary fact types generated from sentence 1 has the pattern:
FT-1: On <Date> at <Time> sensor <SensorId> is
<Status>.
Step 6: Diagramming
The knowledge generated about this elementary fact type can be presented graphically as:
FT-1 diagram
Step 7: Additional Population Restrictions
The NLM analysis procedure assigns the referential integrity (foreign keys) required among the elementary fact types, but other population restrictions need to be investigated once an elementary fact type is established. The type of additional population restriction evaluated here is the derivation rule. This population restriction is specified by the subject matter expert through the expert's response to the following question: "Is a status instance always or sometimes derived in the "On <Date> at <Time> sensor <SensorId> is <Status>." fact type?"
For this question the expert would respond No. If the response to this question is Yes, then the expert would be requested to describe the derivation rule using other fact types defined in the analysis. (An illustrative case of this will be found in the discussion of FT-4 below.)
Step 8: Summary
The results from the previous steps can be presented in a condensed form as:
FT-1: On <Date> at <Time> sensor <SensorId> is
<Status>.
On 7-16-97 at 16:55 sensor 3867 is active.
"7-16-97" is a Date of a Day.
"16:55" is a Time of a Day.
"3867" is a Sensor Identifier of a Sensor.
"active" is a Status of a Sensor State.
On <Date> at <Time> sensor <SensorId> is <Status>.
7-16-97       | 16:55         | 3867          | active        |
------------- | ------------- | ------------- | ------------- | Allowed?
another       | 16:55         | 3867          | active        | Yes
7-16-97       | another       | 3867          | active        | Yes
7-16-97       | 16:55         | another       | active        | Yes
7-16-97       | 16:55         | 3867          | another       | No
Are instances of this fact type derived? No
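The single "No" in the FT-1 matrix says that a date, time and sensor together admit only one status. One way this could be carried into a relational table is sketched below with Python's built-in sqlite3 module. The table and column names are assumptions, and this is not the mapping produced by any particular CASE tool.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE sensor_status (
        reading_date TEXT    NOT NULL,
        reading_time TEXT    NOT NULL,
        sensor_id    INTEGER NOT NULL,
        status       TEXT    NOT NULL,
        PRIMARY KEY (reading_date, reading_time, sensor_id)
    )
""")


def record_status(row):
    """Insert one FT-1 instance; return False if the key already has a status."""
    try:
        conn.execute("INSERT INTO sensor_status VALUES (?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


# The true sentence from the summary above.
print(record_status(("7-16-97", "16:55", 3867, "active")))
# Another status for the same date, time and sensor is rejected.
print(record_status(("7-16-97", "16:55", 3867, "inactive")))
```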
Remaining Fact Type Summaries
Other significant elementary fact types will be provided in summary form so that the results of the NLM analysis can be discussed.
FT-2: On <Date> at <Time> sensor <SensorId> detected
<MovementReading>.
On 7-16-97 at 16:55 sensor 3867 detected movement.
"7-16-97" is a Date of a Day.
"16:55" is a Time of a Day.
"3867" is a Sensor Identifier of a Sensor.
"movement" is a Movement Reading of a Movement Sensor.
On <Date> at <Time> sensor <SensorId> detected
<MovementReading>.
7-16-97       | 16:55         | 3867          | movement      |
------------- | ------------- | ------------- | ------------- | Allowed?
another       | 16:55         | 3867          | movement      | Yes
7-16-97       | another       | 3867          | movement      | Yes
7-16-97       | 16:55         | another       | movement      | Yes
7-16-97       | 16:55         | 3867          | another       | No
Are instances of this fact type derived? No
FT-3: On <Date> at <Time> sensor <SensorId> has a thermal
signature of <Temperature> degrees.
On 7-16-97 at 16:55 sensor 3867 has a thermal signature of 99 degrees.
"7-16-97" is a Date of a Day.
"16:55" is a Time of a Day.
"3867" is a Sensor Identifier of a Sensor.
"99" is a Temperature of a Thermal Sensor.
On <Date> at <Time> sensor <SensorId> has a thermal signature
of <Temperature> degrees.
7-16-97       | 16:55         | 3867          | 99            |
------------- | ------------- | ------------- | ------------- | Allowed?
another       | 16:55         | 3867          | 99            | Yes
7-16-97       | another       | 3867          | 99            | Yes
7-16-97       | 16:55         | another       | 99            | Yes
7-16-97       | 16:55         | 3867          | another       | No
Are instances of this fact type derived? No
FT-4: On <Date> at <Time> sensor <SensorId> violates rule
<RuleNo>.
On 7-16-97 at 16:55 sensor 3867 violates rule 1.
"7-16-97" is a Date of a Day.
"16:55" is a Time of a Day.
"3867" is a Sensor Identifier of a Sensor.
"1" is a Rule Number of a Rule.
On <Date> at <Time> sensor <SensorId> violates rule
<RuleNo>.
7-16-97       | 16:55         | 3867          | 1             |
------------- | ------------- | ------------- | ------------- | Allowed?
another       | 16:55         | 3867          | 1             | Yes
7-16-97       | another       | 3867          | 1             | Yes
7-16-97       | 16:55         | another       | 1             | Yes
7-16-97       | 16:55         | 3867          | another       | Yes
Are instances of this fact type derived? Yes
The instance values and the parameter names for each point in time in FT-2 and FT-3 are
passed through the rule checker FT-5 to determine if the sensor is in violation of any
rule. This derived fact may or may not be permanently stored in the resulting application.
FT-5: Rule <RuleNo> requires <ParameterName> to be
<OperatorName> <ParameterValue>.
Rule 1 requires temperature to be greater than 96.
"1" is a Rule Number of a Rule.
"temperature" is a Parameter Name of a Parameter.
"greater than" is an Operator Name of an Operator.
"96" is a Parameter Value of a Parameter.
Rule <RuleNo> requires <ParameterName> to be <OperatorName>
<ParameterValue>.
1             | temperature   | greater than  | 96            |
------------- | ------------- | ------------- | ------------- | Allowed?
another       | temperature   | greater than  | 96            | Yes
1             | another       | greater than  | 96            | Yes
1             | temperature   | another       | 96            | Yes
1             | temperature   | greater than  | another       | No
Are instances of this fact type derived? No
Note: the other parts of Rule 1 are:
Rule 1 requires temperature to be less than 105.
Rule 1 requires movement reading to be equal to "movement."
This implementation of rules in FT-5 mixes meta-data (e.g. temperature) and data
(e.g. 96). Fact types 2 and 3 could be rewritten as: On <Date> at <Time>
parameter <ParameterName> has parameter value of <ParameterValue>. This was
not done in this case because for most subject areas the business rules change more often
than the types of data that are used in the rules.
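Because FT-5 stores each rule part as data, a rule checker can be a small generic loop. The sketch below is one possible reading, not the legacy application's code. It assumes that a sensor "violates" a rule when every part of that rule holds for the readings taken at one point in time, which matches the red-light condition quoted from the application code. The dictionary keys and the function name `violated_rules` are hypothetical.

```python
import operator

# Operator Name -> comparison function.
OPERATORS = {
    "greater than": operator.gt,
    "less than": operator.lt,
    "equal to": operator.eq,
}

# FT-5 instances: (rule number, parameter name, operator name, parameter value).
RULE_PARTS = [
    (1, "temperature", "greater than", 96),
    (1, "temperature", "less than", 105),
    (1, "movement reading", "equal to", "movement"),
]


def violated_rules(readings, rule_parts=RULE_PARTS):
    """Return the rule numbers for which every part holds for the readings."""
    outcome = {}
    for rule_no, parameter, operator_name, value in rule_parts:
        holds = OPERATORS[operator_name](readings[parameter], value)
        outcome[rule_no] = outcome.get(rule_no, True) and holds
    return sorted(rule_no for rule_no, all_hold in outcome.items() if all_hold)


# Readings for sensor 3867 on 7-16-97 at 16:55 (from FT-2 and FT-3).
print(violated_rules({"temperature": 99, "movement reading": "movement"}))
# Readings taken from the expert's second sentence.
print(violated_rules({"temperature": 62, "movement reading": "no movement"}))
```

Adding or changing a rule then means editing `RULE_PARTS`, not the checking code.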
FT-6: A rule <RuleNo> violation requires a <LightColor> sensor
light.
A rule 1 violation requires a red sensor light.
"1" is a Rule Number of a Rule.
"red" is a Light Color of a Sensor Light.
A rule <RuleNo> violation requires a <LightColor> sensor light.
1        | red      |
-------- | -------- | Allowed?
another  | red      | Yes
1        | another  | No
Are instances of this fact type derived? No
FT-7: A <HigherLightColor> sensor light has priority over a
<LowerLightColor> sensor light.
A red sensor light has priority over a yellow sensor light.
"red" is a Light Color of a Sensor Light.
"yellow" is a Light Color of a Sensor Light.
A <HigherLightColor> sensor light has priority over a
<LowerLightColor> sensor light.
red      | yellow   |
-------- | -------- | Allowed?
another  | yellow   | Yes
red      | another  | Yes
Does red, yellow at any moment in time identify exactly one light color having
priority over another light color? Yes
Are instances of this fact type derived? No
FT-8: On <Date> at <Time> sensor <SensorId> has a sensor
light color of <LightColor>.
On 7-16-97 at 16:55 sensor 3867 has a sensor light color of red.
"7-16-97" is a Date of a Day.
"16:55" is a Time of a Day.
"3867" is a Sensor Identifier of a Sensor.
"red" is a Light Color of a Sensor Light.
On <Date> at <Time> sensor <SensorId> has a sensor light color
of <LightColor>.
7-16-97       | 16:55         | 3867          | red           |
------------- | ------------- | ------------- | ------------- | Allowed?
another       | 16:55         | 3867          | red           | Yes
7-16-97       | another       | 3867          | red           | Yes
7-16-97       | 16:55         | another       | red           | Yes
7-16-97       | 16:55         | 3867          | another       | No
Are instances of this fact type derived? Yes
The results from FT-4 derivation rule are used along with FT-6 and FT-7 to determine the
sensor light color.
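The derivation of FT-8 can be sketched as a lookup followed by a priority comparison. The example below rests on several assumptions that are not in the source: rule 2 and its yellow light are hypothetical (only rule 1 is defined in the text), a sensor with no violated rule is assumed to show green, and the function name `light_color` is invented for illustration.

```python
# FT-6 instances: rule number -> required light color.
# Rule 2 is a hypothetical placeholder; the text only defines rule 1.
RULE_COLOR = {1: "red", 2: "yellow"}

# FT-7 instances: (higher color, lower color).
PRIORITY = {("red", "yellow")}

# Assumption: a sensor that violates no rule shows a green light.
DEFAULT_COLOR = "green"


def light_color(violated_rule_numbers):
    """Pick the required color that no other required color outranks."""
    candidates = {RULE_COLOR[number] for number in violated_rule_numbers}
    if not candidates:
        return DEFAULT_COLOR
    for color in sorted(candidates):
        if not any((other, color) in PRIORITY for other in candidates):
            return color
    raise ValueError("priority facts contain a cycle")


print(light_color([1]))
print(light_color([1, 2]))
print(light_color([]))
```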
FT-9: On <Date> at <Time> sensor <SensorId> has an intrusion
reason of <ReasonText>.
On 7-16-97 at 16:55 sensor 3867 has an intrusion reason of single person.
"7-16-97" is a Date of a Day.
"16:55" is a Time of a Day.
"3867" is a Sensor Identifier of a Sensor.
"single person" is a Reason Text of an Intrusion Reason.
On <Date> at <Time> sensor <SensorId> has an intrusion reason of
<ReasonText>.
7-16-97       | 16:55         | 3867          | single person |
------------- | ------------- | ------------- | ------------- | Allowed?
another       | 16:55         | 3867          | single person | Yes
7-16-97       | another       | 3867          | single person | Yes
7-16-97       | 16:55         | another       | single person | Yes
7-16-97       | 16:55         | 3867          | another       | Yes
Are instances of this fact type derived? No
FT-10: A <ReasonText> intrusion reason has step <StepNumber> that
states: <InstructionText>.
A single person intrusion reason has step 1 that states: determine if
perimeter fence has been violated.
"single person" is a Reason Text of an Intrusion Reason.
"1" is a Step Number of an Intrusion Step Number.
"determine if perimeter fence has been violated" is Instruction Text of an
Instruction.
A <ReasonText> intrusion reason has step <StepNumber> that states:
<InstructionText>.
single person  | 1              | determine .    |
-------------- | -------------- | -------------- | Allowed?
another        | 1              | determine .    | Yes
single person  | another        | determine .    | No
single person  | 1              | another        | No
Are instances of this fact type derived? No
Note: FT-10 has two identifiers. One is Intrusion Reason and Step Number. The other
is Intrusion Reason and Instruction.
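The two identifiers noted for FT-10 correspond to two uniqueness constraints on one table. A minimal sketch using Python's built-in sqlite3 module follows; the table name, column names and the helper `add_step` are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE intrusion_step (
        reason_text      TEXT    NOT NULL,
        step_number      INTEGER NOT NULL,
        instruction_text TEXT    NOT NULL,
        PRIMARY KEY (reason_text, step_number),    -- first identifier
        UNIQUE (reason_text, instruction_text)     -- second identifier
    )
""")


def add_step(reason, step, instruction):
    """Insert one FT-10 instance; return False if either identifier is violated."""
    try:
        conn.execute("INSERT INTO intrusion_step VALUES (?, ?, ?)",
                     (reason, step, instruction))
        return True
    except sqlite3.IntegrityError:
        return False


fence = "determine if the perimeter fence has been violated"
print(add_step("single person", 1, fence))
print(add_step("single person", 2, "scan the intruder to detect the presence of weapons"))
print(add_step("animal", 1, fence))          # another reason is allowed
print(add_step("single person", 3, fence))   # same instruction twice for one reason
```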
FT-11: A <LightColor> sensor light requires a guard to take the action
of <ActionText>.
A green sensor light requires a guard to take the action of review all sensors
every thirty minutes via video monitors.
"green" is a Light Color of a Sensor Light.
"review all" is Action Text of an Action.
A <LightColor> sensor light requires a guard to take the action of
<ActionText>.
green      | review all |
---------- | ---------- | Allowed?
another    | review all | Yes
green      | another    | Yes
Does green, review all at any moment in time identify exactly one light
color requiring a guard to take the action? Yes
Are instances of this fact type derived? No
FT-12: A <LightColor> sensor light exists.
A green sensor light exists.
"green" is a Light Color of a Sensor Light.
A <LightColor> sensor light exists.
green    |
-------- | Allowed?
another  | Yes
Does green at any moment in time identify exactly one light color? Yes
Are instances of this fact type derived? No
All of the previous fact types are displayed in the following relational diagram.
FT-1 to FT-12 diagram
This simple example shows how both data rules (tables, columns, keys, etc.) and business rules (when to turn the sensor light red, what the response steps are, etc.) are modeled using NLM. The benefit of modeling the entire set of knowledge is that the business rules can be specified as data to the application and can be maintained by the expert without coding changes being required to implement a new rule or to modify or delete an existing rule. This structure is valuable because in most businesses the business rules (e.g. A director has to approve all purchases over $10,000.00.) change more often than the bulk of the data maintained (e.g. Knowledge specified on a purchase order.). Questions arise from this analysis that should be addressed before the model is finalized, including whether prioritizing Action Text and Reason Text is needed. This complete documentation of knowledge enables the automatic generation of manuals and training programs that use the precise rules that are implemented in the application.
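Generating manual text from stored facts can be sketched by verbalizing fact instances through their fact type sentences. The example below is an illustration under stated assumptions: the instance data is transcribed from the training manual excerpts earlier in the article, and the function name `manual_lines` is hypothetical.

```python
# FT-11 instances: light color -> action text (wording from manual page 21).
ACTIONS = {
    "green": "review all sensors once every thirty minutes via video monitors",
    "yellow": ("review action at the sensor via video monitors until the reason "
               "the sensor turned on is established or the sensor code color "
               "returns to green"),
}

# FT-10 instances: (reason text, step number, instruction text), from page 35.
STEPS = [
    ("single person", 2, "scan the intruder to detect the presence of weapons"),
    ("single person", 1, "determine if the perimeter fence has been violated"),
]


def manual_lines(actions, steps):
    """Verbalize every stored fact using its fact type sentence."""
    lines = []
    for color, action in actions.items():
        lines.append(f"A {color} sensor light requires a guard to take "
                     f"the action of {action}.")
    for reason, number, text in sorted(steps):
        lines.append(f"A {reason} intrusion reason has step {number} "
                     f"that states: {text}.")
    return lines


for line in manual_lines(ACTIONS, STEPS):
    print(line)
```

Because the sentences are produced from the same facts the application would use, a change to a stored rule would appear in the generated text without separate editing.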
The subject matter expert(s) can answer each of these questions and then the answer may be validated by any qualified expert. This up-front precision in information requirements greatly improves communication of requirements among the subject matter experts, analysts, and implementers. The improved communication of requirements virtually eliminates all of the rework and unfinished projects that are due to misunderstandings of the requirements among various participants. The knowledge captured using the NLM procedure can be documented in an ORM or NIAM CASE tool or mapped directly into relational tables.
Dr. John Sharp is the founder and principal consultant for Sharp Informatics. Before starting Sharp Informatics in 1997 he was employed by Sandia National Laboratories in Albuquerque, NM for 18 years. While at Sandia he held staff and management positions in all areas of information technology, including analysis, design, implementation, maintenance, information architecture, data administration, and information technology research. He has worked closely with Prof. Shir Nijssen of The Netherlands to improve the NIAM analysis methodology. Dr. Sharp is the creator of the first information analysis procedure known to be mathematically precise. This procedure reformulates the usual (imprecise and inaccurate) statements and examples from a subject area into verified fact types. The output of this productivity enhancing process (a set of information requirements) is compatible with all the latest and most productive database application creation tools. John is the editor of the international standard for conceptual schemas. He has co-chaired two international conferences on natural language modeling and he has presented numerous papers and seminars at professional conferences.
Contact information:
Dr. John Sharp
Sharp Informatics
1604 Vassar SE
Albuquerque, NM 87106
sharp@sharp-informatics.com
505-243-1498
fax 505-248-0345
http://www.sharp-informatics.com
© Copyright, 1998-2004 InConcept (Information Conceptual Modeling, Inc.) All Rights Reserved.
ISSN: 1533-3825