I am a proponent of a full lifecycle of data modeling. Data modeling takes on many forms based on the value an enterprise puts on the data model. A pattern I have witnessed over the years is creating only physical data models to document the physical database implementation. Logical modeling may be non-existent for a variety of reasons. Here is my list of the common arguments I hear when talking about physical data models, along with my responses to those comments.
1. You just need to reverse engineer the database to get the physical data model.
Reverse engineering is a handy feature of most data modeling tools. It presents a visual picture of the database structures. It is a good starting point for data modelers when restructuring a database. It captures the reality of the physical implementation of data.
Some IT shops solely use data modeling tools to reverse engineer databases. This methodology misses the finer points of physical data modeling. The transformation from logical to physical data model links the business-oriented model to the model built for a technical audience. It allows the enterprise to apply and enforce corporate standards in the modeling tool. Indexes, referential integrity, constraints, and DBMS-specific parameters are optimally captured when driven by the data model. Generation of the DDL is a time saver for the DBA.
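To make the limitation concrete, here is a minimal sketch of what reverse engineering actually reads: the physical catalog. It uses Python's built-in sqlite3 module; the table and column names are hypothetical. Notice that everything recovered is technical structure, with no business meaning attached.

```python
import sqlite3

# Hypothetical physical implementation: a small schema to introspect.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE cust (
        cust_id INTEGER PRIMARY KEY,
        cust_nm TEXT NOT NULL,
        rgn_cd  TEXT
    )
""")

# Reverse engineering at its core: read the catalog, not the business intent.
# PRAGMA table_info returns (cid, name, type, notnull, default, pk) per column.
for cid, name, col_type, notnull, default, pk in conn.execute(
        "PRAGMA table_info(cust)"):
    print(f"{name:10} {col_type:8} "
          f"{'NOT NULL' if notnull else 'NULL':8} "
          f"{'PK' if pk else ''}")
```

The output is an accurate picture of the tables, columns, and constraints, and nothing more; the definitions and business rules the model should carry are simply not there to be read.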
2. You just need to reverse engineer the physical data model to get the logical data model.
Data modeling tools have become more sophisticated over the years. I can attest to the fact that ERwin possesses features that can transform techie codified database names into presentable business-oriented names. It's a nice start but misses the mark of being a true logical model.
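The name transformation itself is the easy part, which is why it can be automated at all. The sketch below shows the general idea with a glossary-driven lookup; it is purely illustrative and not how ERwin implements its naming standards, and the abbreviation entries are hypothetical.

```python
# Illustrative only: expand codified column names into business-friendly
# names using an abbreviation glossary (all entries here are hypothetical).
GLOSSARY = {"cust": "Customer", "nm": "Name", "id": "Identifier",
            "rgn": "Region", "cd": "Code"}

def business_name(column_name: str) -> str:
    """Translate a codified name, e.g. 'cust_nm' -> 'Customer Name'."""
    return " ".join(GLOSSARY.get(part, part.title())
                    for part in column_name.lower().split("_"))

print(business_name("cust_nm"))  # -> Customer Name
print(business_name("rgn_cd"))   # -> Region Code
```

A renamed table is still a table: the dialogue with the business that produces entities, relationships, and definitions cannot be generated from a glossary lookup.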
A logical model is more than an Entity Relationship Diagram (ERD). Sure, the diagram is the most visible and recognizable deliverable data modelers create. But reverse engineering shortcuts the primary benefits of logical data modeling. Business data requirements are transformed into entities through dialogues between the business and IT. Business rules become relationships in this process. Attributes are identified to meet the business needs. This collaborative effort creates a logical data model that tells the story of the business data in a graphical manner.
3. It’s a picture and not the real database.
Distrust of the physical data model is common in the IT world. As data modelers, we are most certainly the sole culprits who create this mistrust. The physical data model is an excellent tool for developers, testers, and power business users to understand how data is defined and interrelated. I have yet to meet a database user who would not like an accurate snapshot of their data.
Accuracy is where distrust rears its ugly head. A physical data model must accurately reflect the physical implementation of the data. This involves a close working relationship between the physical data modeler and the DBA. A full lifecycle approach needs to be embraced to ensure this accuracy. Recognizing that physical data modeling is not a one-time, create-a-picture task but an ongoing process is key. Data modeling practices should include a model-to-database synchronization process to close the inaccuracy gap and keep the physical data model accurate.
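At its simplest, that synchronization process is a comparison of what the model says against what the database actually contains. The sketch below shows the idea with Python's built-in sqlite3; the model contents and column names are hypothetical, and a real modeling tool's complete-compare feature does far more (types, indexes, constraints).

```python
import sqlite3

# The "model": columns the physical data model says CUST should have
# (hypothetical model content for illustration).
model_columns = {"cust_id", "cust_nm", "rgn_cd"}

# The "database": what was actually implemented (note the drift).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cust (cust_id INTEGER, cust_nm TEXT, sts_cd TEXT)")

# Read the actual columns from the catalog.
db_columns = {row[1] for row in conn.execute("PRAGMA table_info(cust)")}

# Synchronization report: where the model and database disagree.
print("In model, missing from DB:", sorted(model_columns - db_columns))
print("In DB, missing from model:", sorted(db_columns - model_columns))
```

Running a check like this on a schedule, and resolving each difference in either the model or the database, is what closes the inaccuracy gap over time.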
4. It’s like putting the chicken (model) before the egg (code).
Where does data modeling fit into the system development lifecycle? This has been hotly debated over the years. The growth of agile development makes this question even more volatile. Many software developers see the data model as an organic object that results from developer prototyping and user experience design. Data professionals tend to take a more strategic approach of identifying the business needs up front and modeling them to meet the enterprise's needs.
Data modeling is a collaborative effort. Agile methods are based heavily on this team approach to development. I would argue that data modelers today, in just about any development environment, need to embrace this highly collaborative approach to database design.
In my methodology, the data model gets a slight head start before the developers begin to design and code. Once the project is in full gear, data modelers model and developers design in tandem. Models are turned around faster to fit into shorter sprints. The chicken-and-egg argument disappears when the team works jointly to create artifacts and deliverables that add value to the project.
5. It’s a map and you don’t need definitions.
As I stated earlier, a data model is more than an Entity Relationship Diagram (ERD). Reverse engineering creates that picture but is very seldom able to recover the underlying metadata. A data model without definitions does not do the data modeling process justice.
Definitions are the core pieces of data model metadata. They remove the ambiguity of a data model object. Metadata puts the object into the context of how the business sees and uses it. In today's world, data modelers need to be conscious of where data model metadata shows up beyond the ERD. Data dictionaries, business glossaries, metadata repositories, ETL, and reporting tools are becoming more tightly integrated with data modeling tools. With this integration comes the sharing and visibility of the data model metadata.
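One simple way that sharing happens is a data dictionary export that downstream glossaries and repositories can load. The sketch below writes one as CSV using Python's standard library; the entity, attribute, and definition content is hypothetical and the real export format depends on the tools being integrated.

```python
import csv
import io

# Hypothetical model metadata: each attribute carries a business definition.
metadata = [
    {"entity": "Customer", "attribute": "Customer Name",
     "definition": "The legal name under which the customer does business."},
    {"entity": "Customer", "attribute": "Region Code",
     "definition": "The sales region the customer is assigned to."},
]

# Export a simple data dictionary that a glossary or repository could load.
buffer = io.StringIO()
writer = csv.DictWriter(buffer,
                        fieldnames=["entity", "attribute", "definition"])
writer.writeheader()
writer.writerows(metadata)
print(buffer.getvalue())
```

Whatever the format, the point is the same: once definitions live in the model, every integrated tool downstream can surface them, and an ERD without them has nothing to share.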