In my first post, I covered the basics of data management: what it is and the different components.  In this post, let’s look at those components in more detail.  They are vital to the overall process of any data management initiative.

Organizations today are flooded with data, and the flow is not slowing down.  Identifying, classifying, and collecting vital data should be a task for the most senior people in each cost center or line of business.  Doing this well requires knowing the contributing systems, the processes, and the requirements laid down through data governance.

Below I discuss the key components of Data Management mentioned in my last post on Data Management basics.


Businesses of all kinds are experiencing a surge in data, whether it is unstructured, transactional, metadata, hierarchical, or master.  Do you have a Data Management process in place to manage it, stay compliant, and make sense of it?  In this high-level overview, I will discuss what MDM is, cover the initial elements, and end with some of the problems organizations face.

What is MDM?

Part of Data Management is having a Master Data Management (MDM) system.  Master Data Management is the practice of using technology, processes, and tools to acquire, maintain, and improve the sharing of data.  The key here is a consistent identity for business entities across multiple systems.

The key takeaway of MDM is to adopt a set of disciplines to improve data quality, share it, leverage it for competitive advantage, manage change, and make sure you stay in regulatory compliance.





The common denominators for achieving this set of disciplines are the following factors:

  • Accuracy
  • Completeness
  • Timeliness
  • Consistency

What is Master Data?  It is data referenced across an entire organization.  The diagram below shows what master data management looks like from a visual perspective: the noun/verb relationship that exists between master data and the other forms of data.  Before I give you an example of how that noun/verb relationship works, let’s discuss the types of data, or assets.  Since many people are visual learners, there is a diagram for your perusal.

So, let’s now discuss the five types of data in any organization, or assets, as some like to call them.

First is master data, which covers the critical aspects of a business: people, things, places, and concepts.  Second is hierarchical data, which describes the relationships between data elements.  Third is transactional data coming from sales, deliveries, or service; as noted in the diagram, this includes invoices, trouble tickets, and claims.  Fourth is unstructured data: emails, web chats, product specifications, videos, and marketing collateral.  Last is metadata, which is data about data.

Here is how master data interacts with other data. In transactional systems, master data is almost always involved with transactional data. A customer buys a product. A vendor sells a part, and a partner delivers a crate of materials to a location. An employee is hierarchically related to their manager, who reports up to a manager (another employee). A product may be part of multiple hierarchies describing its placement within a store. This relationship between master data and transactional data is a noun/verb relationship. Transactional data captures the verbs, such as sale, delivery, purchase, email, and revocation; master data supplies the nouns. It is the same relationship that data-warehouse facts and dimensions share.
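To make the noun/verb idea concrete, here is a minimal Python sketch. The record types, IDs, and names (Customer, Product, Sale, "Acme Corp") are made up for illustration and are not from any particular MDM product. The point is only that the transaction stores keys, and the master data gives those keys their meaning.

```python
from dataclasses import dataclass

# Master data: the "nouns" (hypothetical records for illustration).
@dataclass(frozen=True)
class Customer:
    customer_id: str
    name: str

@dataclass(frozen=True)
class Product:
    product_id: str
    name: str

# Transactional data: the "verb", which only references nouns by key.
@dataclass(frozen=True)
class Sale:
    customer_id: str
    product_id: str
    quantity: int

customers = {"C1": Customer("C1", "Acme Corp")}
products = {"P1": Product("P1", "Laptop")}
sales = [Sale("C1", "P1", 2)]

def describe(sale: Sale) -> str:
    """Resolve a transaction's keys against master data to form a sentence."""
    customer = customers[sale.customer_id]
    product = products[sale.product_id]
    return f"{customer.name} buys {sale.quantity} x {product.name}"

sentences = [describe(s) for s in sales]
print(sentences[0])
```

If two systems held different IDs for the same customer, the lookup would no longer resolve to one noun, which is the identity problem MDM is meant to solve.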


Imagine that you have purchased a computer and expensed it to your company.  You fill out the expense report and turn it in.  The finance guy sees an unusually high expense report, so you receive a call.  Suppose your response is, “You are the finance guy, figure it out!”  A reaction like that will make it harder to get your money back, and he still doesn’t have the story behind why you needed a computer.

The same holds true for data.  Your IT guys are responsible for the data, but they do not create it, so when the data is inconsistent between systems or doesn’t roll up well, they are going to call the business cost centers.  Unfortunately, data does not come in dribs and drabs like the expense report; it comes in mass quantities of transactions and interactions inside application systems.  So setting up a one-time call or a meeting is not going to fix the issue the way it would for the expense report.  Many companies end up in no man’s land.  IT does not have enough business contacts to fix the data, the business centers believe it’s an IT issue, and the impact causes lots of problems.  These problems reveal themselves in several ways.

For example:

  1. Reports do not roll up correctly because the data is inconsistent; different systems each have their own customer records.
  2. Standing up new systems requires lots of rework, and business rules end up in a departmental spreadsheet that someone maintains just to get the figures to add up.
  3. Cross-selling is nearly impossible because there isn’t a consolidated profile of customers and their interactions with the company.

Fixing these issues takes a lot of time and requires people to add something to their busy schedules: Data Governance.

What is Data Governance?

Data Governance is a program designed to tackle the “no man’s land” issue by organizing all critical stakeholders involved.  Unfortunately, in most companies this ends up as what I like to call a donut meeting.  You know, the meeting that everyone goes to for the free donuts.  For this to work, you need to follow a few basic guidelines.

  1. IT should not own the data governance program because they don’t create the data, but they should be a participant in the meeting.
  2. IT should create the framework that makes the data governance process repeatable and scalable, typically with MDM software deployments and with data quality, data retirement, and security software.
  3. Business stakeholders should own the Data Governance program, and here is why. Business stakeholders are the point people when it comes to discussing all the changes to the data, i.e. what it looks like, the rules, and the set standards.  More than likely there will be process changes that need to take place to ensure that data quality issues don’t continue.  Stakeholders need to define the logic and naming conventions to have a set standard across all applications.
  4. Some rules to go by
    1. Have official sponsorship in place to take care of tie-breaking
    2. A rolling 6- to 12-month plan to track progress
    3. Clearly defined roles
  5. Data Governance should satisfy the following
    1. Regulatory requirements
    2. Ensure business continuity
    3. Drive precise search and retrieval
  6. The three key components of Data Governance are
    1. Backup
    2. Archiving
    3. eDiscovery

Data quality is the fitness of the data to serve its current context, and it rests on the organization putting its data governance policy in place.

Data quality includes:

  • Accuracy
  • Completeness
  • Update status
  • Relevance
  • Consistency across data sources
  • Reliability
  • Appropriate presentation
  • Accessibility

Data quality is critical to operational and transactional processes, and business analytics and business intelligence depend on it.  Three factors determine quality: how data is input, stored, and managed.  Verifying the reliability and effectiveness of the data is called Data Quality Assurance (DQA).
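As a rough illustration of what a data quality check can look like, the sketch below measures two of the dimensions listed above, completeness and update status, over a handful of made-up customer records. The field names, dates, and the 365-day threshold are assumptions for the example, not a standard.

```python
from datetime import date

# Hypothetical customer records; None marks a missing value.
records = [
    {"id": "C1", "email": "a@example.com", "updated": date(2024, 1, 10)},
    {"id": "C2", "email": None, "updated": date(2021, 6, 1)},
    {"id": "C3", "email": "c@example.com", "updated": date(2023, 11, 5)},
    {"id": "C4", "email": "d@example.com", "updated": date(2020, 2, 20)},
]

def completeness(rows, field):
    """Share of rows where the field is populated."""
    filled = sum(1 for r in rows if r.get(field) not in (None, ""))
    return filled / len(rows)

def stale_ids(rows, as_of, max_age_days):
    """IDs of rows not updated within max_age_days of as_of."""
    return [r["id"] for r in rows if (as_of - r["updated"]).days > max_age_days]

email_completeness = completeness(records, "email")
stale = stale_ids(records, as_of=date(2024, 3, 1), max_age_days=365)
print(email_completeness, stale)
```

Which fields must be complete and how old a record may get are business decisions, which is why the thresholds belong to the data governance program rather than to IT.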

One of the hardest aspects of maintaining data quality is duplication.  Scrubbing is the process of updating and standardizing data and making sure duplication does not happen.  The purpose is to maintain a standard and create a single view of the data.  In large organizations, the same records and data may be stored in many disparate systems.
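Here is a small Python sketch of scrubbing under one big simplifying assumption: two records are treated as the same customer when their standardized email addresses match. Real MDM tools use much richer matching rules, and the source names ("crm", "billing") and records are invented for the example.

```python
def standardize(record):
    """Normalize the fields used for matching (trim, collapse spaces, lowercase)."""
    name = " ".join(record["name"].split()).lower()
    email = record["email"].strip().lower()
    return name, email

def scrub(records):
    """Keep one record per standardized email and note every source it came from."""
    golden = {}
    for record in records:
        name, email = standardize(record)
        if email not in golden:
            golden[email] = {"name": name.title(), "email": email, "sources": []}
        golden[email]["sources"].append(record["source"])
    return list(golden.values())

raw = [
    {"source": "crm", "name": "  Jane   Doe ", "email": "Jane.Doe@Example.com"},
    {"source": "billing", "name": "JANE DOE", "email": "jane.doe@example.com "},
    {"source": "crm", "name": "John Roe", "email": "john.roe@example.com"},
]
single_view = scrub(raw)
for row in single_view:
    print(row)
```

Standardizing before comparing is what lets the two Jane Doe records collapse into one, and keeping the list of sources preserves where each copy lived.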


If you want to know how a business operates, just look at the financials.  The success of any business solution depends on the data.  Data integration means combining data from multiple disparate sources, usually stored using a variety of technologies, to provide a unified view of that data.  Mergers and acquisitions are one example of a data integration project: you take the combined assets of two companies and create a unified view of the data assets.

Another example of data integration is building an Enterprise Data Warehouse, or EDW.  The benefit of a data warehouse is that it enables a business to analyze its data in one place.

Areas to consider

Data integration covers several areas:

  • EDW
  • Migration of Data
  • Application/information integration
  • Master Data Management

Data Federation

Data federation is a process that combines heterogeneous data into a single unit by aggregating data from disparate sources into a virtual database.  Once federated into a consistent format, the data is ready for business intelligence (BI) or other analysis.

What is heterogeneous data?

In Computer Science:

Heterogeneous = data of a different nature in the same data set.
When I handled a major retailer, they had transactions from online purchases and transactions from brick-and-mortar stores.  Let’s say that at the end of the year they wanted to create a single report of customers who made both online and in-store purchases.  To develop this report, they would need to integrate the disparate data systems.

They would create the report using a third-party operational data store (ODS), such as Speed Commerce, and ETL processes.  To be clear, they never did this, but if they were to use a federation server they could see one virtual view of the transactions, a 360 view of their customers.

This technique is especially useful if some of an organization’s data is stored offsite by a third-party cloud service provider, which it was in this case.  By accessing data virtually, the retailer does not have to move data, replicate it, or retrieve it from tables to perform the analysis.  It would allow a statistician to aggregate and organize data quickly without having to request synchronization logic or copy the data until it is necessary.
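A full federation server is beyond a blog post, but the idea can be sketched with Python’s built-in sqlite3 module. In this toy version, two separately attached in-memory databases stand in for the online and in-store systems, and a single query reads from both without copying rows between them. The schema names, table names, column names, and data are all made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Each ATTACH creates a separate database; here they stand in for two systems.
conn.execute("ATTACH DATABASE ':memory:' AS online")
conn.execute("ATTACH DATABASE ':memory:' AS store")

# The two systems use different table and column names (heterogeneous sources).
conn.execute("CREATE TABLE online.orders (customer_id TEXT, amount REAL)")
conn.execute("CREATE TABLE store.purchases (cust TEXT, total REAL)")
conn.executemany("INSERT INTO online.orders VALUES (?, ?)",
                 [("C1", 20.0), ("C2", 35.0)])
conn.executemany("INSERT INTO store.purchases VALUES (?, ?)",
                 [("C2", 12.5), ("C3", 8.0)])

# One query spans both sources: customers who bought online and in store.
both_channels = [row[0] for row in conn.execute("""
    SELECT customer_id FROM online.orders
    INTERSECT
    SELECT cust FROM store.purchases
    ORDER BY customer_id
""")]
print(both_channels)
```

The data stays in its own database, and only the query result is brought together, which is the appeal of the federated approach for a report like the retailer’s.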