Each of us has parts of the job that we like more than others. Metadata happens to top my list of likes. I see myself as a pretty thorough documenter of model objects. Definitions are the heart of the data architect’s metadata world. I outlined the anatomy of a definition in a blog post about a year ago.
Most data shops have a pretty thorough set of data standards and guidelines that encompasses the anatomy of a definition. They are the black-and-white rules by which a definition can be deemed acceptable or not. Just as important is another set of softer guidelines that make for an excellent definition.
Target new and unexpected users.
Business subject matter experts write good definitions and are the source of my definitions whenever possible. They live with the data every day and write from the context of how it is used. This is an excellent perspective but needs to be tempered with language that a more universal audience can comprehend.
Today’s technology exposes data to a wider audience, many of whom are unfamiliar with the term. Accurate analytics relies on understanding data and using it in the correct context. It is increasingly common to share data outside of the enterprise, where the term can be understood differently. The definition author needs to craft a definition that these diverse audiences can understand.
Recognizing consistency or inconsistency.
I have never lived in the utopian data world where an element carried the same name and definition throughout the enterprise. Applications were inevitably built over years in silos, often by outside vendors. The result is a collection of seemingly similar terms with differing meanings and usage.
Today, users easily connect to most data assets and add them to reports and applications. Data architects need to facilitate the standardization of names and definitions, noting exceptions to the standard. Ideally, different objects should not carry the same name. Definitions should spell out the context and any formulas or calculations. We bear the responsibility for maintaining consistency and noting where inconsistency exists.
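One way to surface these inconsistencies is to group definitions by element name and flag any name that means different things in different systems. The following is only a minimal sketch: the `catalog` entries, the system names and the `find_conflicting_names` helper are all hypothetical, and a real shop would read this from its own metadata repository.

```python
from collections import defaultdict

# Hypothetical catalog entries: (system, element name, definition).
catalog = [
    ("billing", "customer_id", "Unique identifier assigned to a paying customer."),
    ("crm", "customer_id", "Unique identifier assigned to a prospect or customer."),
    ("billing", "invoice_total", "Sum of line items plus tax, in USD."),
    ("warehouse", "invoice_total", "Sum of line items plus tax, in USD."),
]


def find_conflicting_names(entries):
    """Return element names whose definitions differ between systems."""
    definitions_by_name = defaultdict(dict)
    for system, name, definition in entries:
        definitions_by_name[name][system] = definition.strip()
    # Keep only names that carry more than one distinct definition.
    return {
        name: by_system
        for name, by_system in definitions_by_name.items()
        if len(set(by_system.values())) > 1
    }


conflicts = find_conflicting_names(catalog)
for name, by_system in conflicts.items():
    print(name)
    for system, definition in sorted(by_system.items()):
        print(f"  {system}: {definition}")
```

A flagged name is not automatically an error. It is a prompt to either standardize the definition or record the exception.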
Rightsizing the definition.
It is not permissible to have a data asset with no definition. I will repeat that. It is not permissible to have a data asset with no definition. I have seen many excuses offered for missing-in-action definitions and for exceptions to the standard. Make the crafting of definitions as important as crafting a readable data model. It must be a key part of your governance. No definition translates to no understanding, or to misunderstanding.
Thoroughness is a necessity, as is brevity. Technology is delivering new consumers to your data. It is critical that they read the definitions, and they are more likely to read a briefer one. Craft a definition that conveys the meaning of the data asset in as few words as possible without compromising your standards.
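Both rules, no missing definitions and no bloated ones, lend themselves to a simple automated check. The sketch below is illustrative only: the `assets` dictionary, the `audit_definitions` helper and the 12-word limit are assumptions made for the example, not a recommended standard.

```python
# Hypothetical data assets mapped to their definitions.
assets = {
    "order_date": "Calendar date on which the customer placed the order.",
    "ship_code": "",
    "region": None,
    "net_revenue": (
        "Revenue recognized after discounts, returns, allowances, and rebates "
        "are subtracted from gross sales for the fiscal period in question."
    ),
}


def audit_definitions(definitions, max_words):
    """Return (missing, too_long) lists of asset names, each sorted."""
    missing, too_long = [], []
    for name, text in definitions.items():
        if text is None or not text.strip():
            missing.append(name)  # no definition at all
        elif len(text.split()) > max_words:
            too_long.append(name)  # a candidate for tightening
    return sorted(missing), sorted(too_long)


# The word limit is an assumed value; set it from your own standards.
missing, too_long = audit_definitions(assets, max_words=12)
print("Missing definitions:", missing)
print("Over the word limit:", too_long)
```

A word count is a crude proxy for readability, so treat the second list as a review queue rather than a list of violations.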
Maintain the integrity of the definition.
The path of a data element through the enterprise is a complex animal to follow. Data movement, replication, transformation and reporting tools add more variability to data lineage. As the data comes to rest at each point, there is generally the ability to add metadata including a definition.
As data moves farther out of the data group’s control, it is more likely to escape governance at these resting points. Tool administrators can often define data objects, as can many end users. Data architects need to educate these users and assert the importance of maintaining the integrity of metadata as the data moves. It is not permissible to redefine data without the involvement of data governance.
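Redefinitions at a resting point can be caught by comparing each local definition against the governed one. This is a rough sketch under assumed inputs: the `governed` and `resting_points` dictionaries and the `find_redefinitions` helper are hypothetical stand-ins for whatever your tools actually export.

```python
# Hypothetical governed definitions, approved by data governance.
governed = {
    "customer_id": "Unique identifier assigned to a customer at account creation.",
}

# Hypothetical definitions captured at each point where the data comes to rest.
resting_points = {
    "warehouse": {
        "customer_id": "Unique identifier assigned to a customer at account creation.",
    },
    "bi_tool": {
        "customer_id": "Customer number.",
    },
}


def find_redefinitions(governed, resting_points):
    """Return sorted (resting point, element) pairs that differ from the governed text."""
    drift = []
    for point, definitions in resting_points.items():
        for name, text in definitions.items():
            approved = governed.get(name)
            # Elements with no governed definition are out of scope for this check.
            if approved is not None and text.strip() != approved.strip():
                drift.append((point, name))
    return sorted(drift)


drift = find_redefinitions(governed, resting_points)
for point, name in drift:
    print(f"{name} was redefined in {point}")
```

An exact string comparison is deliberately strict. Any difference, even a harmless rewording, is surfaced so that governance makes the call.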