Five meaningful Steps to Quality Data Modelling

Five meaningful Steps to Quality Data Modelling

Many business people consider data modelling to be a black art practiced by the enterprise IT department that brings no tangible business benefits and is solely designed to make insignificant mortals feel confused and inferior.

Sadly, many IT departments truly foster this view and, for those that do, the data modelling that they carry out fails to deliver any real benefits – in spite of of the mumbo jumbo they chant about it. It does not have to be this way.

When done properly, data modelling can deliver enormous business benefits to any enterprise, that include:

  • Higher quality information for all business activities
  • Easier access to that information
  • strong information systems
  • Better identification of products, profit and cost centres.
  • Elimination of redundant and unneeded information
  • Reduced costs and increased revenues

How do you do data examination and modelling “properly”? Where do you start?

The Following two simple (though fundamental) rules will guide you on your way.

Rule 1: Use exactly the same supplies from which you extracted your Business roles from to extract all information for data modelling.

Rule 2: Only form data that is needed to directly sustain the Business roles of the enterprise.

Starting with Rule 1 will ensure that you conform to Rule 2.

The Integrated Modelling Method provides a foolproof technique for extracting candidate entities, attributes and relationships from the supplies from which the Business roles were extracted. This technique can be used by novice and experienced analysts alike.

These supplies include:

  • Transcripts of taped examination interviews with senior business managers.
  • Typed up notes of supplementary information from these interviews.
  • Function titles and descriptions developed during function modelling.
  • Information flow diagrams produced in examination workshops.


The basic technique:

Step 1 -Work by your data supplies (best to have these in electronic format) looking for and underlining all “noun structures”, as these are “candidate” entities.

Step 2 – Extract all of these candidate entities, and the associations between them, into a separate document.

Step 3 – transform these candidate entities and associations to actual entities, attributes and relationships.

Step 4 – Build an Entity Relationship Diagram (ERD).

Step 5 – Design any required relational databases from the ERD.


The first steps in the technique are best demonstrated by way of an example.

The following is part of a transcript of an interview with a business manager, with all of the nouns underlined.

“We receive orders for our products from our customers the day before they require delivery. We check the quantity of the raw materials required to bake the products and, if necessary, we order more from our suppliers. We bake our products fresh each morning. We make deliveries to our customers several times each day. At the end of each week we invoice each customer for the deliveries made to them during the week. We accept payment or remittance from customers by cash and cheque only”.

All of the noun structures have been underlined.

The first sentence is:

“We receive orders for our products from our customers the day* before they require delivery.

Working by the sentence one underlined noun at a time we get the following list of candidate entities and associations:

order [means of ordering] product

product [ordered by means of] order

order [received from] customer

customer [the source of] order

product [delivered by means of] delivery

delivery [means of delivering] product

customer [recipient of] delivery

delivery [made to] customer

observe: day* is an attribute of order, probably “date”.

Because every association is two-way, when we document one association we closest create its reverse.

Working by the whole of the above transcript gives us the flowing complete list (sorted alphabetically):

baking [to produce] product

customer [billed by means of] invoice

customer [recipient of] delivery

customer [source of] payment

customer [the source of] order

delivery [made to] customer

delivery [means of delivering] product

delivery [of products billed on] invoice tbv

invoice [a billing for] product

invoice [a means of billing] customer

invoice [billing for goods delivered by] delivery tbv

invoicing period

order [means of ordering] product

order [means of replenishing] raw material

order [placed with] supplier

order [received from] customer

payment [accepted from] customer

payment [made by] payment method

payment method [valid means of making] payment

product [billed for on] invoice

product [delivered by means of] delivery

product [ordered by means of] order

product [produced by] baking

product [requirement for] raw material

raw material [quantified by means of] stock check

raw material [replenished by means of] order

raw material [required to bake] product

raw material [sourced from] supplier

stock check [to establish quantity of] raw material

supplier [recipient of] order

supplier [the source of] raw material

This short extract has given us eleven rare candidate entities and thirty (15 x 2) candidate associations.

Rationalising Entities

The list of candidate entities needs a little more work to remove false or spurious entities. A typical example of a candidate item that is not a proper entity is “invoice”. An invoice is probably the most shared business item to be incorrectly modelled as an entity. The invoice itself is a piece of paper that represents a business entity or a collection of entities such as a sale (of one or more products) or a billing (for one or more sales). These are the actual data entities that should be modelled – not the pieces of paper representing them.

Converting Associations to Relationships

The associations that we identified must now also be rationalised and converted to “Relationships”. Associations simply tell us that two entities are associated and give us a suggested name for that association. A relationship tells us all of the basic information that we need to know about the association. This includes

  • the precise name of the relationship
  • whether it is mandatory or optional
  • its “degree”, that is, if the relationship is one-to-one, one-to-many or many-to-many


Relationships must be capable of being read in the following way:

Each Order must be received from one and only one Customer

Each Customer may be the the source of one or more Orders
Relationships are always two way so there must always be two entries to specify what they are in both directions.

The items in Bold above are the entity names.

The underlined items show the optionality. Mandatory relationships are written as must be, optional ones as may be.

The items in italics are the relationship names. These must be named so that they can be preceded by “Must be” or “May be”.

The terms “one and only one” and “one or more” define the degree of the relationship.

The Entity Relationship Diagram

All of the preceding information is basic to know but is almost impossible to visualise and of limited use without constructing an Entity Relationship Diagram (ERD). This is the single most powerful form for use in the understanding of the data structure of any enterprise and is an basic component in the design of quality databases.

Effective Layout

On an ERD in the Integrated Modelling Method the “many” ends of a relationship are indicated by a symbol, which resembles, and is called a “crows foot”. If this symbol is turned upside down we get a “dead crow”. This gives rise to one of the most powerful, however simple, rules for achieving a really effective layout for any ERD, which is “Dead Crows Fly East”.

The net consequence of this layout is that all high quantity and volatile entities will appear to the top and to the left of the ERD and all low quantity and more continued entities will appear to the bottom and to the right.

leave your comment