If you are an ERwin Data Modeler user, you are familiar with CHAR(18). This is the default datatype set by the software. As a novice ERwin modeler, I felt this was a strange value to set as a default. I saw defaults as being the most prevalent value and the value assigned when no value was entered. As my use of ERwin matured, I realized that CHAR(18) serves a different purpose in the data modeling world.
CHAR(18) is a red flag. It signals that the data modeler has not assigned a datatype to an attribute or column. If the default value were CHAR(1), CHAR(10) or CHAR(50), it would not be the red flag. These values are much more common. I cannot recall a single attribute with a length of eighteen characters in my lengthy IT career. Pretty darn clever of those ERwin developers years ago, isn’t it?
This week I have been thinking about the completeness of the data model. Most modelers have standards and procedures to assure that the data model contains the information needed to support the functionality of the target application. The completeness I have been pondering is how a modeler knows that their model is good enough for prime time. In other words, what are their red flags?
I go through the normal data modeling steps of making sure my model is understandable and as simplistic as possible; is implementable and flexible; and integrates with existing data structures and future data strategies. I do have my show stopper red flags that assure my model is ready for the masses.
- CHAR(18) datatypes – Catch this error early before they are manifested in the database where a variety of data and application issues surface. Datatype discussions yield a good bit of insight into the entry, display, storage and transformation of data.
- Empty descriptions – Missing ones cause confusion in the data dictionary and breeds misunderstanding. Data model descriptions are more visible in today’s metadata, reporting and development tools.
- Missing relationship phrases – The data model needs to tell the story of data. How can this happen if the business rules are missing on the diagram? Part of good data modeling practices is teaching team members how to read a data model.
- Non-standard abbreviations – ERwin and most data modeling tools can automate the abbreviation of names. The process is not foolproof. Make sure the abbreviated names match the style and standards of the target database and DBMS.
- Mismatched logical and physical datatypes – Despite thinking I mastered ERwin after 18 years, I inevitably have a handful of mismatched datatypes when I model. I make sure the physical model matches the logical datatype or the agreed upon physically implemented datatype.
These five show stopper red flags are not the only things I look for when evaluating the completeness of my data models. They are the important things that I need to resolve before I prime time my model. They are five things that people seem to tune into if you do not verify them. By addressing them before you unveil your model, you eliminate distracting discussions on these overlooked design missteps and focus on the discussion of the subject and content of the model.
I am presenting at Data Modeling Zone 2014 in Portland, Oregon. I hope that you will join me in my sessions: Is your data model a work of art? and Relationship versatility and the data modeler #DMzone