Business Intelligence Best Practices

Basics of Governance and Data Integration, Part II of III: Governance and Data Governance

by Dan E. Linstedt
Data governance encompasses the people, corporate processes and procedures that ensure data value, quality improvement, single shared definitions and availability at the right time to the right people.

In the first article of this series, I introduced governance, country governance and IT governance and began to address how to succeed with IT governance. In this article, I will explore data governance along the same lines.

Data Governance
Unfortunately, data governance is not as clear-cut as IT governance. No definition appears to exist that follows the SEI/CMM paradigm of monitoring, measuring and controlling the data, let alone its access points, its information (meaning) and its value (treating data as an asset). A Web search for data governance turns up many companies claiming to have, or to have implemented, data governance, but only a few describe what data governance means to them. They are:

“The Armonk, N.Y.-based technology giant describes data governance as the sum of information security, privacy and compliance, and to make real-world sense of it the company has formed the Data Governance Council, said Steve Adler, the council's chairman and a program director for IBM Data Governance Solutions.” (Source: IBM Forms council for Data Governance, Information Week,

 “Data Governance is the process by which companies govern appropriate access to their critical data, by measuring and mitigating operational and security risks associated with access to data.” (Source: IBM Security and Report: Data Governance Council,

“Data governance is the practice of making enterprise-wide decisions regarding an organization’s information holdings.” (Source: Data Management Study, Federal Enterprise Architecture Program Management Office,

“Therefore, data governance is really about building compliance requirements and associated exposures into the continuity of business operations. Organizations also should recognize that building an infrastructure that ensures compliance and business continuity is a long-term process and not an overnight occurrence.” (Source: Hitachi Data Systems White Paper, Data Governance: Regulatory Compliance and Business Continuity by Dennis Wenk and Christophe Bertrand,

Data Governance Defined
If I were to construct a definition for data governance based on what I’ve read and learned over the years, I would define it as follows:

Data governance is the people involved in corporate processes and procedures that ensure data value (alignment), quality improvement (information), single shared (accepted) definitions, and availability at the right time to the right people.

Please don’t confuse data governance with data management. Data management is the management of data, of the access points to that data and of its metadata (or definitional meaning). Data management is part of the role of data governance, but the process of data governance is to exercise control over the data within a corporate alignment. Data, in this context, is any information captured within a computerized system, which can be represented in graphical, text or speech form.

Data governance runs horizontally across the entire enterprise. Data is everywhere, and access to data is generally neither monitored nor measured. Consistent definitions of the data and how to use it have not been formulated (except under master data management and data stewardship efforts). Master data management is not data governance; it is data management. However, establishing individuals to oversee the administration of data processes and integration into the enterprise is data governance.

What is Subject to Governance?
It is too easy to become esoteric and discuss what “data” means. For instance, is a business process data or not? This is what makes differentiating between data governance and data management so difficult. Let’s take a lesson from SEI/CMM: all documents will be versioned and a history of changes will be kept; all processes will be formalized and documented; all new efforts will be quantified with risk assessments and “ability to execute” scores; all changes will be requested in writing; and so on. The person who administers document check-in/check-out and versioning is exercising data governance.
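To make the check-in/check-out example concrete, here is a minimal sketch in Python of a document whose every change is versioned, attributed and tied to a written change request. The class and field names (GovernedDocument, Revision, change_request) are hypothetical and chosen only for illustration; they do not come from SEI/CMM or from any particular tool.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional


@dataclass
class Revision:
    version: int
    author: str
    content: str
    change_request: str  # the written request that justified this change
    recorded_at: datetime


@dataclass
class GovernedDocument:
    """Hypothetical document under check-in/check-out control."""
    name: str
    revisions: List[Revision] = field(default_factory=list)
    checked_out_by: Optional[str] = None

    def check_out(self, user: str) -> None:
        # Only one person may hold the document at a time.
        if self.checked_out_by is not None:
            raise PermissionError(
                f"{self.name} is already checked out by {self.checked_out_by}"
            )
        self.checked_out_by = user

    def check_in(self, user: str, content: str, change_request: str) -> Revision:
        # A change is accepted only from the holder and only with a written request.
        if self.checked_out_by != user:
            raise PermissionError(f"{user} has not checked out {self.name}")
        if not change_request.strip():
            raise ValueError("a written change request is required")
        revision = Revision(
            version=len(self.revisions) + 1,
            author=user,
            content=content,
            change_request=change_request,
            recorded_at=datetime.now(timezone.utc),
        )
        self.revisions.append(revision)  # history is appended, never overwritten
        self.checked_out_by = None
        return revision


doc = GovernedDocument("data-dictionary")
doc.check_out("alice")
doc.check_in("alice", "Customer: a party that has placed an order.", "CR-1: initial draft")
for rev in doc.revisions:
    print(rev.version, rev.author, rev.change_request)
```

The design point is that the history list only grows: earlier revisions stay available for audit, and the rule "all changes will be requested in writing" is enforced by the code path rather than by convention.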

Roles and Responsibilities
Data stewardship should be a part of every enterprise project. The data steward is responsible for setting up, designing and controlling metadata, data definitions, data interpretations, quality of data and the usage of the information. The data steward is responsible for data governance as an enforceable and measurable component.

Data management plays a large role in the execution of both data governance and data stewardship. Understanding the data, working with the nature of the information and creating an environment that lends itself to audits and compliance are all part of managing the data. Data governance is also about metadata. Establishing metadata governance practices should be a part of any good project plan.

Data Governance is a Process
Data governance is not clearly defined by the industry; however, it is utilized across many different companies and even advertised by vendors as a “feature.” Data governance is not a feature. It is a process – a process by which we control the access and security of the information we own and manage. Data governance must include metadata, unstructured data, registries and ontologies and is a big part of repeatable and compliant success.
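As one illustration of governance as a process that controls access, the following Python sketch routes every read request through a single gate that both enforces a policy and records the outcome, so that access can be monitored and measured. The AccessGate class, the role names and the dataset names are all assumptions made for this example, not part of any vendor product or standard.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass
class AccessGate:
    """Hypothetical gate: decides on each request and logs the decision."""
    policy: Dict[str, Set[str]]  # dataset name -> roles allowed to read it
    log: List[Tuple[str, str, str, bool]] = field(default_factory=list)

    def request(self, user: str, role: str, dataset: str) -> bool:
        # Datasets missing from the policy are denied by default.
        allowed = role in self.policy.get(dataset, set())
        # Every request is recorded, whether granted or denied.
        self.log.append((user, role, dataset, allowed))
        return allowed

    def denial_counts(self) -> Counter:
        # A simple measurement: how often each dataset was refused.
        return Counter(
            dataset for _, _, dataset, allowed in self.log if not allowed
        )


gate = AccessGate(
    policy={
        "customer_master": {"steward", "analyst"},
        "payroll": {"steward"},
    }
)
gate.request("ann", "analyst", "customer_master")
gate.request("ann", "analyst", "payroll")
print(gate.denial_counts())
```

Keeping the decision and the log entry in the same method means no access path exists that is controlled but unrecorded, which is the monitoring gap the article describes.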

Data governance will hopefully become more clearly defined as we move forward into the execution of compliance initiatives such as Sarbanes-Oxley and HIPAA. It will be interesting to see how data governance plays a role in the unstructured data land and where it is applied in SOA and Web services.

In my next article, I will explore SOA governance and discuss the nature of governing service-oriented architecture components.  

Dan E. Linstedt -

Cofounder of Genesee Academy, RapidACE, and, Daniel Linstedt is an internationally known expert in data warehousing, business intelligence, analytics, very large data warehousing (VLDW), OLTP and performance and tuning. He has been the lead technical architect on enterprise-wide data warehouse projects and refinements for many Fortune 500 companies. Linstedt is an instructor of The Data Warehousing Institute and a featured speaker at industry events. He is a Certified DW2.0 Architect. He has worked with companies including IBM, Informatica, Ipedo, X-Aware, Netezza, Microsoft, Oracle, Silver Creek Systems and Teradata. He is trained in SEI/CMMI Level 5 and is the inventor of The Matrix Methodology and the Data Vault data modeling architecture. He has built expert training courses, has trained hundreds of industry professionals and is the voice of Bill Inmon's Blog on