Click Here to Install Silverlight*
IndiaChange|All Microsoft Sites
MSDN
   
   
Conceptual Modeling – a technique for robust database design    
   

  Author: Poornachandra Sarang


Practically every software application uses and maintains some database. The proper design of this database plays a vital role in the eventual success of the application software. There are many techniques available for database design. One of the most popular and age-old proven technique is the E-R modeling. In E-R modeling, one defines the entities and the relationships between them while modeling the real-life cases. Though E-R modeling techniques are quite popular, it has several shortcomings. Some of the shortcomings are discussed later in the article. Also, there is not much standardization in E-R diagram notations and one usually comes across several standards such as Oracle ER, IDEF1X, etc.
 
Another popular notation for modeling systems is UML (Unified Modeling Language). UML is widely used. However, it starts in the Logical Design phase of the software development cycle. The UML diagrams are not intuitive and difficult to understand for a non-IT professional. Thus, when a system analyst and designer designs an application software, s/he may find it difficult to explain the design properly and completely to an end-user who typically is a non-IT professional. This usually results in software changes during and after the deployment phase where the end-user starts using the software. Such changes usually become costly at this stage of software development cycle.
 
What we need is modeling the requirements at the conceptual level without bothering about the details of the classes, objects and their interactions. These things do not make much sense to the end-user. They are useful only to designers and developers of the application software. At the conceptual level, the end-user will be able to get a better grasp of the analyst’s understanding of the domain problem. Once the design at the conceptual level is approved, a designer can proceed with the logical design phase where we talk about top-level classes, use-cases, etc.
 
Object Role Modeling (ORM) techniques for conceptual modeling come to our rescue in this case. Though ORM is less known, it is very useful in precisely capturing the requirements, modeling the requirements and ascertaining the correctness of the model based on data use cases.
 
Object Role Modeling
 
ORM was originally developed in Europe in 1970s. Dr. Terry Halpin (Microsoft Corp.) did considerable work on ORM. The conceptual modeling using ORM is available as a part of Visual Studio Enterprise Architect edition.
 
ORM uses a fact-oriented method for information analysis and modeling. In real-life, when we capture user requirements during system analysis, we express several facts in our requirements gatherings. For example, one of the use-cases in our requirement gathering may state “Sameer drives a Car”. This is a fact that is expressed in natural language. What do we deduce from this fact? This fact identifies two objects, Sameer and a Car. The fact that Sameer drives a Car indicates that object called Sameer plays a “role” of driving a “Car”. In practice, every object has a certain role to play over another object. We capture such facts from the data existing in a corporation to do analysis and modeling. Based on these facts, we use ORM techniques for modeling the system.
 
Using ORM techniques, once a model is created, the correctness of the model can be verified by verbalizing this model in natural language. The end-user (non-IT professional) can easily understand the model description expressed in natural language. Both E-R and UML techniques do not allow the verbalization of the model in natural language, thereby making it difficult for non-IT professionals to understand it as usually such professionals do not fully understand the notations used in those drawings.
 
While modeling the system, ORM does not require you to define classes. Thus, the entire design is at a conceptual level where the discovered objects are not mapped to concrete classes at this stage of the software cycle. The created model may be represented using intuitive diagrams or expressed in natural language.
 
The ORM techniques also offer another advantage over ER and UML techniques that the created models may be verified against the datasheets ensuring its correctness. This is explained further with an example.
 
I shall now briefly describe the ORM process. To do a conceptual design using ORM techniques, we start with data use cases.
 
Modeling Data Use Cases
 
A typical data use case is shown in the table below. This data table essentially lists the company drivers along with the cars they are allowed to drive. The first and the last name of each driver along with the Employee ID are listed. The registration number of the car and its color are also listed.
 
Employee ID Employee First Name Employee Last Name Car Registration Number Car Color
001 Sameer Desai MH04N2002 Blue
002 Ashok Deshpande MH02A1556 Red
003 Nilesh Pande MH02D2557 Green
004 Ramesh Patki MH04N1589 Silver
001 Sameer Desai MH02D2557 Green
 
What facts can you deduce from the above table?
 
One can deduce several facts from the above data table; we will discuss these one by one. The first thing that we notice is that there are two primary entities, an Employee uniquely identified by its ID and a Car uniquely identified by its registration number. There is an association between an Employee object and a Car object. We may say that each employee drives a certain car. Thus, the first fact that we deduce from the above table is that “Employee drives a Car”. In ORM notation, this fact is expressed using following diagram:
 
 
The entities in ORM notation are shown with ellipses drawn in solid pen. The Employee and the Car are two entities in the above diagram. The “drives/is driven by” is a predicate. The relationship between these two entities is binary. An employee drives a car and a car is driven by some employee. The binary relationship is indicated by two solid rectangles.
 
Sometimes, the role that an object plays may be unary. For example, the fact “An Employee named John smokes” can be represented using following notation.
 
 
Incidentally, UML notation does not provide any notation for representing unary facts. In UML, you need at least two objects and the relationship between those objects.
 
ORM can represent n-ary relationships between objects. For example, a ternary relationship is shown as follows:
 
 
The above diagram represents a relationship between three objects, ClassRoom, Course and Time. This may be used for representing a data use case such as “ClassRoom 101 at 4 pm is used by Physics Course”. Here ClassRoom, Time and Course are entities. The predicates are “at” and “is used for”. The 101object of entity ClassRoom type at a time instance of 4 pm of type Time is used for teaching a Physics course – an instance of type Course.
 
Likewise, ORM notation allows you to represent the association between several objects and the roles they play on each other.
 
We will now revisit our earlier datasheet to deduce more facts.
 
Deducing more facts
 
From the above datasheet, one can observe that each employee is identified by a unique ID and similarly each Car is identified by a unique registration number. Each Car has some Color and each Employee has some Last Name and a First Name. An Employee drives a Car and a Car is driven by an Employee.
 
The complete ORM diagram representing the various facts is shown here. (The discussion on the complete ORM notation is beyond the scope of this article.)
 
 
The above diagram represents the following facts:
 
  • Employee(EmpNr) is an entity object type.
     
    - Every Employee is identified by one distinct EmpNr.
  • Car(RegNr) is an entity object type.
      -tr>
    - Every Car is identified by one distinct RegNr.
  • Employee drives Car / Car is driven by Employee
     
    - Each Employee drives some Car.
    - Each Car is driven by some Employee.
    - Each Employee drives at most one Car.
    - Each Car is driven by at most one Employee.
  • Employee has LastName
     
    - Each Employee has some LastName.
    - Each Employee has at most one LastName.
  • Employee has FirstName
     
    - Each Employee has some FirstName.
    - Each Employee has at most one FirstName.
  • Each Car is of some Color.
     
    - It is possible that some Car is of more than one Color and that more than one Car is of some Color.
  • Color is a value object type.
  • LastName is a value object type.
  • FirstName is a value object type.
     
    Constraints on relations
     
    While creating the above model, I have also added few constraints on each relation. These are listed below:
     
  • Between an Employee object and a Car object:
     
    - Each Employee drives exactly one Car
    - Each Car is driven by exactly one Employee
  • Between an Employee and the Last Name
     
    - Each Employee has exactly one LastName.
    - A given LastName may be of zero or more Employees.
  • Between an Employee and the First Name
     
    - Each Employee has exactly one FirstName.
    - A given FirstName may be of zero or more Employees.
  • Between a Car and its Color
     
    - Each Car may be of One or More colors.
    - A specified Color may be of Zero or More Cars.
     
    The constraints on the roles are shown by drawing the arrow-headed lines on the relationship blocks and a solid circle at the end of the relationship. The full explanation of how to draw the constraints is beyond the scope of this article; the reader is referred to the references listed at the bottom of the article for additional information.
     
    Model Validations
     
    Once a model is drawn, its correctness can be verified by inputting the real-life data. For example, we have put the constraint on the role played between an Employee object and a Car object such that each Employee drives exactly one Car and each Car is driven by exactly one Employee. We will now input some data and verify the validity of this constraint. For this, you will need Visual Studio for Enterprise Architects (VEA). VEA provides a Fact Editor for entering the facts that you want to add to the model. A screen shot of the Fact Editor is shown below:
     
     
    The Examples TAB on the top takes you to a screen where you can add data examples to verify the model.
     
    Once the model is drawn, it may be verified for its correctness by validating it against the real-life data. For example, the customer may have following data:
     
    Sameer drives Car # MH04N2002
    Ashok drives Car # MH02A1556
    Nilesh drives Car # MH02D2557
    Ramesh drives Car # MH04N1589
     
    You may create the above model in Microsoft Visio for Enterprise Architects, input the above data and validate the model. This is shown in the screen shot below:
     
     
    In the Examples editor, you add several data use cases obtained from your company’s data sheet. Once the real-life data examples are added, you can test the validity of the model by clicking on the Analyze button. This opens another screen as shown below:
     
     
    The above screen indicates that there is no error found in the tested relation. We will now add more data examples to the above table and perform the analysis one more time. We add the following fact.
     
    Sameer drives Car # MH02D2557.
     
    The result of the analysis check is shown in the screen shot below:
     
     
    Note that the result shows an error indicating that Sameer driving two different cars MH04N2002 and MH02D2557 violates the constraint that we have put on this relation. (Each Employee drives exactly one Car.) If the real-life situation is such that each Employee is allowed to drive one or more Cars, our constraint assumption is wrong and we have not modeled the user requirements correctly. One can see from this example that the model may be validated against the real-life data very easily.
     
    Verbalizing the model in natural language
     
    Microsoft VEA provides a facility to express the model in natural language. This is called verbalization. Once a model is drawn, select the role that you wish to verbalize. The screen shot below shows the role between Employee and Car objects verbalized in the windowpane below.
     
     
    This verbalization makes it easy for non-IT professional to capture errors in the model. As such errors are captured at the very early stages of the software design process, it eventually helps in creating robust software.
     
    The next important task is to map the model to a database.
     
    Mapping to Database
     
    Once a model is verified for its correctness, VEA provides an easy way for mapping the model to a database. The Studio allows you to create a database or a database schema for various types of database engines. This feature of VEA is very easy to use. Without walking you through the various steps of database creation, I will show you the final output created by VEA for the above model.
     
    The DDL script created by VEA for the above model for SQL Server is shown below:
     
    /* This SQL DDL script was generated by Microsoft Visual Studio */

    /* Driver Used : Microsoft Visual Studio - Microsoft SQL Server Driver. */

    /* Create DriverDatabase database. */
    use master

    go

    create database "DriverDatabase"

    go

    use "DriverDatabase"

    go

    /* Create new table "Car". */
    /* "Car" : Table of Car */
    /* "RegNr" : Car1 is of Car */
    /*      Role two (Car1) of fact: Car has {Car1}. */
    /*      Role two (RegNr) of fact: Car1 is identified by {RegNr}. */
    /* "Color" : Color is of Car */
    create table "Car" (
    "RegNr" char(10) not null,
    "Color" char(10) not null)

    go

    alter table "Car"
           add constraint "Car_PK" primary key ("RegNr")

    go

    /* Create new table "Employee". */
    /* "Employee" : Table of Employee */
    /* "EmpNr" : EmpNr identifies Employee */
    /* "LastName" : Employee has LastName */
    /* "RegNr" : Car1 is driven by Employee */
    /* "FirstName" : Employee has FirstName */

    create table "Employee" (
            "EmpNr" char(10) not null,
            "LastName" char(10) not null,
            "RegNr" char(10) not null,
            "FirstName" char(10) not null)

    go

    alter table "Employee"
            add constraint "Employee_PK" primary key ("EmpNr")

    go

    /* Add the remaining keys, constraints and indexes for the table "Employee". */
    create unique index "Employee_AK1" on "Employee" (
    "RegNr")


    go

    alter table "Employee" add constraint "Employee_AK1_UC1" unique (
    "RegNr")

    go

    /* Add foreign key constraints to table "Car". */
    alter table "Car"
    add constraint "Employee_Car_FK1" foreign key (
    "RegNr")
    references "Employee" (
    "RegNr")

    go


    /* This is the end of the Microsoft Visual Studio generated SQL DDL script. */

     
    Note that the two entities in our model are mapped to the two database tables, Employee and Car. The schema definition adds the appropriate primary keys on two tables, sets up the indexes and foreign key linkages between the two tables.
     
    Conclusions
     
    The ORM techniques facilitate in conceptual modeling the application software. The conceptual model deals at a more abstract level of design than a logical model and does not require you to create design classes during the very early stages of software development cycle. Microsoft provides for creation of ORM diagrams in Visual Studio for Enterprise Architect. The VEA provides facilities for verbalizing the model in natural language and also validating the model against the real-life data. This helps in refining the model at the very early stages of software development cycle. Once the model is fully tested and satisfied, it is easily mapped to any supported database. Thus, the conceptual modeling techniques along with VEA help in modeling a business requirement, testing the model against the valid data, mapping the model to a database and finally helping in producing more robust application software.
     
    References:
     
    1. Object Role Modeling – an overview by Dr. Terry Halpin,
    www.orm.net/pdf/ormwhitepaper.pdf
    2. Halpin, T.A. 2001a, Information Modeling and relational Databases, Morgan Kaufmann Publishers, San Francisco (www.mkp.com/books_catalog/catalog.asp?ISBN=1-55860-672-6).
    3. Halpin, T.A. 2001b, ‘Microsoft’s new database modeling tool: Part 1’, Journal of Conceptual Modeling, June 2001 issue
    (online at www.InConcept.com and www.orm.net).
    4. Halpin, T.A. 2001c, ‘Microsoft’s new database modeling tool: Part 2’, Journal of Conceptual Modeling, August 2001 issue
    (online at www.InConcept.com and www.orm.net).
    5. Halpin, T.A. 2001d, ‘Microsoft’s new database modeling tool: Part 3’, Journal of Conceptual Modeling, August 2001 issue
    (online at www.InConcept.com and www.orm.net).
    6. Bloesch, A.C. & Halpin, T.A. 1996, ‘ConQuer: a conceptual query language’, Proc. ER’96: 15th Int. Conf. on conceptual modeling, Springer LNCS, no. 1157, pp. 121-33 (online at www.orm.net).
    7. Bloesch, A.C. & Halpin, T.A. 1997, ‘Conceptual queries using ConQuer-II’, Proc. ER’97: 16th Int. Conf. on conceptual modeling, Springer LNCS, no. 1331, pp. 113-26 (online at www.orm.net).
    8. Halpin, T.A. 1998, ‘Conceptual Queries’, Database Newsletter, vol. 26, no. 2, ed. R.G. Ross, Database Research Group, Inc., Boston MA (March/April 1998) (online at www.orm.net).
    9. http://www.ormcentral.com/ObjectRoleModelingLinks.htm
    10. Article: Modeling, Data Semantics, and Natural Language By Ken North
    (http://www.webtechniques.com/archives/1999/07/data/)
     
     
    Poornachandra Sarang, Ph.D.
    With more than 20 years of IT experience, Dr. cializes in architecting and designing solutions based on various technologies and platforms that include Microsoft, Sun, CORBA and Open Source. An Ex-professor of University of Notre Dame, USA, Dr. Sarang conducts many training programs on state-of-the-art technologies in countries like USA, UK, Switzerland, Singapore and India. He has authored several books for WROX on .NET, Java, J2EE, Open Source and E-Commerce. Besides training, Dr. Sarang is a skilled programmer who has developed several successful products and has completed several projects. During his free time, you will find him speaking at international conferences and writing articles for reputed international journals and magazines. He may be reached at sarang@abcom.com.
     
         

    ©2008 Microsoft Corporation. All rights reserved. Contact Us |Terms of Use |Trademarks |Privacy Statement
    Microsoft