Data Quality

Enterprise data quality is the practice of ensuring that your organization's data is accurate, complete, consistent, and timely.

Good data quality leads to better decision-making, improved efficiency, and reduced costs.

Bad data quality leads to poor decisions, inappropriate actions, inefficient processes with lots of rework, unhappy customers, and wasted resources.

 

What is Enterprise Data Quality?

Enterprise Data Quality is all about making sure that your data is good enough to support your business. This means that it should be accurate, complete, consistent, and timely.

  • Accuracy: Your data should be correct and up-to-date.
  • Completeness: Your data should have all of the information that you need.
  • Consistency: Your data should be formatted and stored in a consistent way across all of your systems.
  • Timeliness: Your data should be available when you need it.
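As a sketch, the four dimensions above can be turned into simple automated checks. The customer records, field names, and thresholds below are hypothetical, purely to illustrate the idea:

```python
from datetime import date, timedelta

# Hypothetical customer records; field names are illustrative only.
customers = [
    {"id": 1, "email": "amy@example.com", "country": "GB", "updated": date(2024, 5, 1)},
    {"id": 2, "email": None,              "country": "uk", "updated": date(2021, 1, 10)},
]

def check_record(rec, today=date(2024, 6, 1)):
    """Return a list of data quality issues for one record."""
    issues = []
    # Completeness: every field we need should be populated.
    if not rec["email"]:
        issues.append("incomplete: missing email")
    # Consistency: country codes should follow one agreed format (e.g. ISO-style upper case).
    if rec["country"] != rec["country"].upper() or len(rec["country"]) != 2:
        issues.append("inconsistent: country code not in agreed format")
    # Timeliness: the record should have been touched recently.
    if today - rec["updated"] > timedelta(days=365):
        issues.append("stale: not updated in over a year")
    return issues

for rec in customers:
    print(rec["id"], check_record(rec))
```

Accuracy is the hard one to automate: a record can pass every rule above and still be wrong, which is why the data stewards discussed later matter.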

Why is Enterprise Data Quality important?

Typically, documentation on the importance of Enterprise Data Quality tends to focus on analytics (even of the most simplistic kind), and it's true: data quality is important because it helps you to make better decisions. If your data is bad, then your decisions will be bad too. For example, if your customer data is inaccurate, then you won't be able to target customers with the right marketing campaigns.

 

However, having good quality data is more fundamental than that. If you have poor quality data in your operations then your operations will be poor. If part of your business is fulfilling orders for customers, and their order data is dodgy and/or your customer info is not great, then you will be sending the wrong product to the wrong people, with all the rework, anger, and loss of reputation that entails.

 

How Do You Go About It?

Data quality comes with a cost, though the upfront cost can quickly be recouped if it's done in the most effective manner. As with most data initiatives, the key is to focus on the key data elements.

Find the key data elements; these will be the same as the ones that are the focus of Data Governance, and there is a massive overlap between the two pillars.

The data stewards can observe data issues. They should be helped with reports and alerts.

Technology can help: data profiling can provide some surprising insights, but generally, in my experience, the people on the ground know the dodgy data. More importantly, they know the important data that is also dodgy.
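The data profiling mentioned above can be as simple as counting the distinct values in a column; rare or near-duplicate values are often exactly the dodgy data the people on the ground already know about. A minimal sketch using only the standard library (the sample rows are made up):

```python
from collections import Counter

# Hypothetical rows; in practice these would come from a table or file.
rows = [
    {"status": "ACTIVE"}, {"status": "active"},
    {"status": "ACTIVE"}, {"status": "Actve"},  # the kind of typo lurking in real data
]

def profile(rows, column):
    """Frequency count of a column's values - the rare ones are often the dodgy ones."""
    counts = Counter(r[column] for r in rows)
    return counts.most_common()

print(profile(rows, "status"))
```

Here a profiler would immediately surface that "status" has three spellings where one was expected, which is a prompt for the stewards to investigate, not a verdict on its own.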

 

Once you have identified your key data and any issues around it, it is essential to create feedback loops to get it fixed.

This is simpler than it sounds; the idea is that:

  1. A problem arises in the data
  2. It is identified by a data steward
  3. We know the lineage and ownership of the data (see data governance)
  4. We raise the problem with the owner or their delegate
  5. They change the data, and possibly the process, so the data is fixed at source by the people who know the most about it.
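The steps above can be sketched as a tiny workflow. The owner registry, issue fields, and addresses here are illustrative assumptions, not a real tool:

```python
# Hypothetical ownership registry (see data governance): dataset -> owner.
owners = {"customer_orders": "order-team@example.com"}

issues = []

def raise_issue(dataset, description, steward):
    """Steps 1-4: a steward records a problem and routes it to the data owner."""
    issue = {
        "dataset": dataset,
        "description": description,
        "raised_by": steward,
        # Lineage/ownership lookup; unknown datasets fall back to governance.
        "assigned_to": owners.get(dataset, "data-governance@example.com"),
        "status": "open",
    }
    issues.append(issue)
    return issue

def resolve_issue(issue, fixed_at_source):
    """Step 5: the owner fixes the data, and ideally the process too."""
    issue["status"] = "fixed at source" if fixed_at_source else "patched downstream"

issue = raise_issue("customer_orders", "postcode missing on 3% of orders", "steward@example.com")
resolve_issue(issue, fixed_at_source=True)
print(issue["assigned_to"], "-", issue["status"])
```

The point of the sketch is the routing: because ownership and lineage are already known, the problem lands with the people who can fix it at source rather than being patched downstream.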

Quality problem solved.