What is Data Governance?





"What is Data Governance?" a curious kid asks, peeking over my shoulder at the laptop screen. He is 14 and frequently asks questions with no interest in knowing the answer, just like many other people around me.

"That's a great question" is the first thing you say when you have no good strategy for approaching the question. The second step is to think aloud.

After a few minutes of gathering my thoughts, I begin: "Consider the term 'data' as a synonym for 'useful information'. We use information to support decision-making and to choose a strategy.

Whether we are talking about a household or a business, a proper strategy supports efficient management and helps, to some extent, to forecast the future.

Data Governance is a system that controls every aspect of the data lifecycle: the series of stages data goes through, from being captured, stored and used to being destroyed. This system helps to ensure and sustain:

  • Data Quality: the ability to measure the accuracy of the data, its completeness, reliability and relevance, and to track errors and count the data issues found. For example, the number of times I have seen bread crumbs in the bed linen despite being assured that you haven't eaten a sandwich in bed. Or the difference between your expected grades and your actual grades.
  • Data Catalog and Lineage: clear categorization and organization of data assets and the relationships between them. It is the ability to understand each data asset's owner and purpose and each attribute's data type and purpose, to visualize data changes, and to view the context in which a data asset existed at any given point in time. For example, looking at the pancakes in the garbage bin and visualizing the eggs' lineage, from raw eggs to being beaten with the yoghurt, cooked in a pan and transferred to the lunch box, and then figuring out how they ended up in the garbage bin the next morning: that would be the eggs' data lineage.
  • Data Accessibility, Compliance and Security: the processes we establish to support data access and to manage and secure data resources. For example, changing the time you are allowed to play games on the phone should require a fingerprint, both to avoid security breaches and to comply with the lights-off-at-10pm rule.
  • Data Observability: monitoring the health of the important systems in your organization (processes, data pipelines, APIs, etc.), gathering logs, and defining KPIs to measure the success of the strategy. We can have metrics that help us reduce costs and manage risks. For example, I have a clear maximum threshold for the number of hoodies lost at school this year. I also count the packed, uneaten sandwiches I find under your bed and calculate the average time it takes for such spectacular, fluffy green mold to grow on them."

Naturally, by the time I finished my speech, I realized that the kid had disappeared from sight. The only audience I had left was my dog. Animals are the best listeners and appreciate anything you have to say.

Dear audience, if you haven't tuned out yet, I will be happy to hear your take on Data Governance and whether there is a category missing from my list.

