September 2004        Issue: 33

Journal of Conceptual Modeling
www.inconcept.com/jcm

 

Systems Analysis and RAD:
Interpreting Grammar for Software Development

By Brian S. Smith

This article is the second in a three-part series on how system requirements can be used to accelerate the software development process.

This article takes a closer look at the importance of system requirements for project success, contrasts a requirement-centric approach with Extreme Programming (XP), and explores the potential of natural language processing for systems analysis. The challenges of interpreting English grammar for software development are discussed, along with how rapid application development (RAD) can be achieved from the written word using Deterministic Phraseology (DP).

What vs. How

Every software development project faces two fundamental questions: what must the system do, and how should it be done? The answers to these questions are formulated during different phases of the systems development lifecycle (SDLC): the “what” during the analysis phase, and the “how” during the design and implementation phases. It is not unusual, however, for the distinction between “what” and “how” to become blurred, particularly with respect to who is responsible for supplying those answers. This wrangling over preliminary design authority can become one of the biggest sources of contention among systems and software engineers on a medium-to-large scale project.

If the “what” could be clearly specified in words, not in pseudo code or structured English, but in straightforward “shall” statements that directly translate into a framework for developers to quickly access and incorporate in their design, much of this conflict could be avoided. With such a means of direct translation available, the interpretive role of the analyst could be significantly reduced, and words would dictate the model. But are formal system requirements really necessary? 

Extreme Programming and Spiral Development Strategies 

Methodologies involving joint application development (JAD) and evolving prototypes have gained popularity over the last decade, most notably with the rise of Extreme Programming (XP). Proponents of XP tout the advantages of “growing” a system through rigorous testing and customer participation. Without question, the benefits of close customer involvement and testing are enormous, regardless of methodology. Yet while XP has proven it can work without several of the cumbersome and labor-intensive artifacts of the traditional SDLC, there are substantial risks inherent in any spiral development methodology. When formal analysis and design are bypassed in favor of a hyper-iterative system evolution strategy, what might appear effective or ideal during one iteration may prove suboptimal, or a real showstopper, in the final release. This is the “locally optimal, globally suboptimal” situation that can often manifest itself during later stages of a development project.

 Figure 1 – XP’s Spiral Development Methodology

XP’s emphasis on “user stories” and iteration is not very different from earlier approaches involving use cases 1, and the insistence on high customer involvement is a fundamental concept of JAD. Both use cases and JAD have been proven effective for obvious reasons: use cases help identify user interfaces and expected system behavior, while customer involvement ensures the project is on track and delivering the desired functionality. Considering XP’s reliance on these earlier techniques and methodologies, including pair programming (where “buddies” share in the development and inspection of code), XP’s real contribution to RAD lies in its “test-first” approach to development. Unlike the traditional approach, with its rigid sequence of design, develop, integrate and test, XP encourages writing both unit and integration test code before the actual system code.

In the absence of a full set of quality system requirements, spiral development methodologies, such as XP, offer distinct advantages in that they foster discovery of both required and desired functionality as the system evolves. Moreover, there is always a working system, and with every new release both customer and development organization alike gain a sense of security, promise, and progress. But while “its team-based approach works well for smaller projects, it scales up poorly to larger projects…and places too little emphasis on analysis and architectural design.” 2 

There really is no substitute for a complete set of quality requirements, and studies have shown that projects with an adequate set of requirements have the greatest chance of success 3. In fact, the “waterfall approach” to system development, which has fallen into some disfavor with the rise of object-oriented methodologies, can work well—and may actually be the best approach—when a project is fortunate enough to have received, or developed, a clear and concise body of system specifications. 

Turning Requirements into Software 

RAD is the objective, after all, because development simply takes too long. Methodologies like XP are gaining favor because they can deliver on schedule without working engineers into the ground. And yet quality requirements can go even further toward ensuring project success; the problem is how to quickly turn them into products useful to software developers.

As with any methodology, a computer-aided systems engineering (CASE) tool is essential; more important still is a methodology that can convert English into suitable models for software development. This is the challenge facing natural language processors that seek to accelerate the analysis process: how to put enough constraints on phrasing to ensure proper composition, while allowing the analyst to easily express structure, relationships, behavior and state. The ideal, of course, is absolute freedom of expression. But with every degree of freedom granted, the complexity of the grammatical algorithms also increases, sometimes exponentially, and can lead to algorithmic ambiguity (where two rules of logic actually contradict one another). So some level of constraint must be imposed, yet as little as possible; otherwise the tool cannot aptly be deemed a “natural” language processor.

Modeling Based on English 

Those CASE tools that assist in the crafting of requirements are undeniably great assets to the systems engineers who use them. And any CASE tool that can transform system requirements into logical models, automatically, could certainly go a long way toward shortening a project’s systems analysis phase. Such a tool, however, faces one enormous obstacle: language. Enforcing quality requirement composition while also transforming those requirements into accurate object and data models requires a split focus. The secondary focus must be on ease of composition and syntactical validation, while the primary focus must be on the translation of those requirements into components of a software modeling methodology, such as the Unified Modeling Language (UML) from the Object Management Group (OMG).

Consider the relational model 4 and UML. These methodologies are near-universal in that they can be followed by Bolivians, Pakistanis, Italians, Germans, Koreans, Americans, etc., regardless of written language. A CASE tool, however, that interprets natural language for conversion into UML models is—by its very definition—bound to the trappings of a specific language. Theoretically, such a CASE tool could support inter-language translation, but the complexities would be formidable, and the application would presumably support only a small subset of the human languages in use today. 

Language, therefore, is the true challenge. Model generation for the purpose of RAD is the objective, but pursuing it should not encumber composition. As stated earlier, with every constraint on expression, the usefulness of such a tool diminishes because a human being is an integral part of the equation. Language should be natural, but should conform to proper grammar. Assuming English is the supported language, the methodology behind the tool (the interpretive engine that makes conversion into logical models possible) must, in practice, rely on some input controls.

Deterministic Phraseology 

DP is the engine behind a CASE tool from Leap Systems (http://www.leapse.com) that automatically converts “shall” statements, composed in English, into object and data models of the subject system. Inputs are loosely constrained by over 20 predefined templates that may be used in full or in part, only requiring that users “type-and-tab” their requirements into sequential fields. A Requirement Builder is also provided to create ad hoc templates that fully exercise the potential of DP.  

DP is a methodology that operates within the confines of established UML principles. It interprets the arrangement and combination of grammatical “clauses” that comprise a sentence and generates classes, complete with support for enumerated types, whole-part relationships, inheritance, association, and behavior.  

The following set of figures highlights some of the principal DP clauses and provides insight into how DP works. A sample requirement (A) is presented, followed by how the requirement appears in a template used for DP processing (B).

A. Example Requirement

The system shall provide the capability to reestablish G/G communications when the error rate of the received signal has changed from within limits to outside limits if the status of the G/G communications is degraded or is failed.

 B. Leap SE Template #10: Event with Condition 

Notice that the first “what attribute?” field is not utilized in the template. Only the red-colored text fields are required. Also notice the first yellow-colored field appears blank. This is the “role” field that becomes available when a role-type preamble is chosen from the System drop-down list. Hovering over any field displays a tool tip indicating whether a noun, verb, adjective, preposition, etc., is required. 

 Figure 2 - Leap SE’s Template #10: Event with Condition 

C. Example Requirement with Bars Separating Preamble and Clauses 

In DP, a requirement is decomposed into a preamble and a string of inter-related clauses, shown below with separating bars. 

The system shall provide the capability | to reestablish G/G communications | when the error rate of the received signal has changed | from within limits | to outside limits | if the status of the G/G communications | is degraded | or is failed.

D. “Clause” View of Requirement 

Preamble | “do” clause | “when” clause | “state” clause | “state” clause | “if” clause | “state” clause | “state” clause
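
The decomposition shown in C and D can be mimicked in a few lines of Python. This is only an illustrative sketch, not the DP engine: the function names and the labeling rule (the first clause after the preamble is the “do” clause, a clause starting with “when” or “if” takes that label, and everything else is a “state” clause) are assumptions chosen to reproduce the clause view above for this one requirement.

```python
# Illustrative sketch only; this is not the actual DP algorithm.
BARRED_REQUIREMENT = (
    "The system shall provide the capability | to reestablish G/G communications | "
    "when the error rate of the received signal has changed | from within limits | "
    "to outside limits | if the status of the G/G communications | is degraded | "
    "or is failed."
)


def label_clause(clause, position):
    """Assumed rule: position 0 is the "do" clause; otherwise look at the first word."""
    if position == 0:
        return "do"
    first_word = clause.split()[0].lower()
    if first_word in ("when", "if"):
        return first_word
    return "state"


def decompose(barred_text):
    """Split a bar-separated requirement into its preamble and labeled clauses."""
    parts = [part.strip().rstrip(".") for part in barred_text.split("|")]
    preamble, clauses = parts[0], parts[1:]
    labeled = [(label_clause(c, i), c) for i, c in enumerate(clauses)]
    return preamble, labeled


preamble, labeled = decompose(BARRED_REQUIREMENT)
print(preamble)
for label, text in labeled:
    print(f"{label:>5}: {text}")
```

A purely keyword-based rule would mislabel “to reestablish G/G communications” and “to outside limits” as the same kind of clause, which is why this sketch also uses position; a real grammar would need richer context than this.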

 E. Component View of Requirement 

Beyond the “clause” view, the requirement is further broken down into individual components that are ultimately processed by the DP engine. Here is how the clauses are perceived at this level.

do
   attribute
   type
   entity
when
   attribute
   type
   entity
action
from
   value
to
   value
if
   attribute
   type
   entity
is
   value
or
is
   value

DP validates the components comprising each clause, as well as the combination of clauses put together in a particular sequence. Before saving, the requirement can be previewed for integrity. Once saved, the requirement is fully integrated into the CASE tool’s object model database from which all models are drawn. 
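
A check on clause combinations might look like the following sketch. The rules encoded here are hypothetical, since the article does not list DP's actual grammar: a requirement is assumed to begin with a “do” clause, and every “when” or “if” clause is assumed to need at least one “state” clause after it. The table name `ALLOWED_NEXT` and the function `validate_sequence` are likewise invented for illustration.

```python
# Hypothetical sequencing rules; DP's real rules are not given in the article.
ALLOWED_NEXT = {
    "do": {"when", "if", None},              # None marks the end of the requirement
    "when": {"state"},
    "if": {"state"},
    "state": {"state", "when", "if", None},
}


def validate_sequence(labels):
    """Return a list of problems found in a list of clause labels."""
    problems = []
    if not labels or labels[0] != "do":
        problems.append('requirement must begin with a "do" clause')
        return problems
    # Pair each clause with the one that follows it (None after the last clause).
    for current, following in zip(labels, labels[1:] + [None]):
        if current not in ALLOWED_NEXT:
            problems.append(f"unknown clause type: {current!r}")
        elif following not in ALLOWED_NEXT[current]:
            what = "end of requirement" if following is None else f'"{following}" clause'
            problems.append(f'"{current}" clause cannot be followed by {what}')
    return problems


sample = ["do", "when", "state", "state", "if", "state", "state"]
print(validate_sequence(sample))           # the sample requirement passes
print(validate_sequence(["do", "when"]))   # a dangling "when" clause is reported
```

Keeping the rules in a lookup table rather than in branching code makes it easy to see, and to change, which clause combinations are accepted.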

Generating an object model with the CASE tool produces a directory of well-defined C++ header files, while generating a data model produces a Structured Query Language (SQL) file for running in a relational database management system (RDBMS), which in turn creates tables and relationships complete with referential integrity.
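
To make the two kinds of output concrete, the sketch below renders a tiny hand-written model into C++ header text and a SQL script, then loads the script into an in-memory SQLite database. Everything in it is assumed for illustration: the dictionary layout, the names `Communications`, `ReceivedSignal`, `Status`, and `ErrorRate`, the link between the two entities, the column types, and the shape of the generated text. The article does not show Leap SE's actual output.

```python
import sqlite3

# Hand-built model (assumed); in DP this would be drawn from saved requirements.
ENUMS = {
    "ErrorRate": ["withinLimits", "outsideLimits"],
    "Status": ["degraded", "failed"],
}
MODEL = {
    "ReceivedSignal": {"attributes": {"errorRate": "ErrorRate"}, "references": []},
    "Communications": {"attributes": {"status": "Status"}, "references": ["ReceivedSignal"]},
}


def cpp_header(name, spec, enums):
    """Render one entity as the text of a minimal C++ header."""
    lines = ["#pragma once", ""]
    for ref in spec["references"]:
        lines.append(f'#include "{ref}.h"')
    for attr_type in spec["attributes"].values():
        if attr_type in enums:
            values = ", ".join(enums[attr_type])
            lines.append(f"enum class {attr_type} {{ {values} }};")
    lines.append(f"class {name} {{")
    lines.append("public:")
    for attr, attr_type in spec["attributes"].items():
        lines.append(f"    {attr_type} {attr};")
    for ref in spec["references"]:
        lines.append(f"    {ref}* {ref[0].lower() + ref[1:]};")
    lines.append("};")
    return "\n".join(lines)


def sql_table(name, spec, enums):
    """Render one entity as a CREATE TABLE statement with foreign keys."""
    columns = ["id INTEGER PRIMARY KEY"]
    for attr, attr_type in spec["attributes"].items():
        if attr_type in enums:
            allowed = ", ".join(f"'{value}'" for value in enums[attr_type])
            columns.append(f"{attr} VARCHAR(32) CHECK ({attr} IN ({allowed}))")
        else:
            columns.append(f"{attr} {attr_type}")
    for ref in spec["references"]:
        columns.append(f"{ref}_id INTEGER REFERENCES {ref}(id)")
    body = ",\n    ".join(columns)
    return f"CREATE TABLE {name} (\n    {body}\n);"


print(cpp_header("Communications", MODEL["Communications"], ENUMS))

script = "\n".join(sql_table(name, spec, ENUMS) for name, spec in MODEL.items())
print(script)

# Run the generated SQL to confirm that it is accepted by a real RDBMS.
conn = sqlite3.connect(":memory:")
conn.executescript(script)
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name")]
print(tables)
```

Both generators read the same model, which mirrors the article's point that object and data models are drawn from a single repository.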

Problem Space vs. Solution Space Modeling 

One argument that often arises when discussing the translation of system-level requirements into logical models is the suitability of the resulting models for software development. The problem space, as defined by the relationships, characteristics, and behavior of domain entities, is not the solution space. Simply put, the logical model is not the physical model. 

The logical model represents the interaction and definition of entities in “the real world”, in the functional domain of the system to be built, replaced, or upgraded, all of which is derived from source material such as a System Requirements Specification (SRS), Concept of Operations, use cases, and similar technical documents. From these materials, the analyst develops logical models to characterize these domain entities, their inter-dependence, and their interactions. 

But unlike the logical model, the physical model is concerned with implementation, and its view of the domain is of quite another abstraction. And yet the data associated with the system’s logical entities, and the fundamental relationships between those entities, often carry through to the system’s physical models. This is not to say there aren’t numerous supporting objects in the solution space that have little or nothing to do with the logical view, such as interface and abstract classes, along with webs of inheritance and composition (ideally reflecting a normalized view of the data), but the fundamental domain entities invariably survive. Moreover, these propagated entities are typically found at the heart of the application code. 

So the question becomes: “Can a declarative language, such as English, be used to specify all the software constructs necessary to build a system?” That is the frontier, and the challenge. And yet as logical modeling pushes down into the physical, and as implementation becomes the focus, natural language gives way to “techno-speak”, English phrasing of terms that have virtually no meaning beyond the level of software implementation. 

The DP CASE tool (Leap SE) makes no distinction between real words and industry-specific, technological jargon in its template and Builder fields, so the transition to a lower level of specification is readily accommodated. In addition, its various drop-down lists, which provide access to prepositions, conjunctions, and relational terms, keep phrasing grammatically correct for generating both object and data models. So while the primary objective of DP is to expedite logical model development, the CASE tool and methodology lend themselves to extension into the actual solution space. 

 Figure 3 - Leap SE’s Requirement Builder 

Object-orientation, UML and DP 

The object-oriented revolution of the 1980s and 1990s asserted that an object view of the problem space is superior to that of a functional view because we, as human beings, perceive our world in terms of objects and not functions; that the “behavior of something” is secondary to the essence of the thing itself. In a nutshell, that data is more important than behavior. Immediately, a connection was made between this view and the relational model. The English language, in particular, with its emphasis on subjects and objects, linked by verbs, seemed to support this view of software as “chatter among interacting objects” that pass in and out of existence as needed. So the creators of DP embraced UML and developed a CASE tool to turn those English sentences, or “shall” statements, into UML models. 

Using DP to develop a system requirements repository offers distinct advantages in the areas of object, attribute, and method management. By being able to inventory an object’s attributes and methods, new requirements can be drafted that re-use identifiers already in use—identifiers derived from requirements previously created with the DP CASE tool. In this way, different words or synonyms for the same object, attribute, or method can be avoided, eliminating both ambiguity and duplication. This natural enforcement of consistency promotes the development of a cohesive set of system requirements that accelerates analysis for its ultimate purpose—software development. Drawing from the entire requirements database, class header files and SQL output can be generated at any time—which is ideal for iterative development—and the resulting UML models can be traced to the requirements that contributed to their definition. 
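
The inventory and traceability idea can be pictured as a small index from identifiers to the requirements that introduced them. The sketch below is an illustration under stated assumptions, not Leap SE's repository design: the class name `RequirementsRepository`, the requirement IDs, and the case-insensitive matching rule are all invented here, and true synonym detection is out of scope.

```python
from collections import defaultdict


class RequirementsRepository:
    """Tracks which requirements use each (entity, attribute) identifier."""

    def __init__(self):
        self._trace = defaultdict(set)   # (entity, attribute) -> requirement IDs
        self._canonical = {}             # lowercased key -> first spelling seen

    def register(self, req_id, entity, attribute):
        """Record an identifier, reusing the existing spelling when one exists."""
        key = (entity.lower(), attribute.lower())
        identifier = self._canonical.setdefault(key, (entity, attribute))
        self._trace[identifier].add(req_id)
        return identifier

    def inventory(self, entity):
        """List the attributes already defined for an entity."""
        return sorted(a for (e, a) in self._trace if e.lower() == entity.lower())

    def trace(self, entity, attribute):
        """Return the requirement IDs that contributed to an identifier."""
        key = (entity.lower(), attribute.lower())
        identifier = self._canonical.get(key)
        return sorted(self._trace[identifier]) if identifier else []


repo = RequirementsRepository()
repo.register("REQ-001", "Communications", "status")
# A later requirement with different capitalization maps to the same identifier.
print(repo.register("REQ-002", "communications", "Status"))
print(repo.inventory("Communications"))
print(repo.trace("Communications", "status"))
```

Returning the canonical spelling from `register` is the design choice that nudges new requirements toward identifiers already in use.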

The next article in this series, “RAD from System Requirements: A Study of Deterministic Phraseology,” will focus on the challenges of interpreting English for software development and take a closer look at DP.

References: 
1. Use Cases combined with Booch/OMT/UML: Process and Products, Putnam P. Texel, Charles B. Williams, Prentice Hall, 1997.
2. Systems Analysis and Design in a Changing World, Third Edition, John W. Satzinger, Robert B. Jackson, Stephen D. Burd, Course Technology, Thomson Learning, 2004.
3. The Standish Group’s CHAOS Report, The Standish Group International, 1995.
4. “A Relational Model of Data for Large Shared Data Banks,” Dr. Edgar Frank Codd, Communications of the ACM, Vol. 13, No. 6, pp. 377-387, June 1970.

 Brian S. Smith
Leap Systems
http://www.leapse.com
techsupport@leapse.com

© Copyright, 1998-2004 InConcept (Information Conceptual Modeling, Inc.) All Rights Reserved. ISSN: 1533-3825