Friday, December 4, 2009

Some TLC for Your Data

Did you ever wonder how a data error got into the system, database, or report that’s right there in front of you?

You can look at it a hundred different ways.

-- The customer entered the data wrong on the website.
-- The call centre rep entered the data wrong on the website.
-- The sales rep forgot to enter his sales for the month of September and keyed them in for October.
-- The programmer entered the wrong statement in the data integration script.
-- The programmer put a ‘greater than’ instead of a ‘less than’ comparison in the summarization script.
-- The business analyst did not provide the correct data retention requirements, and that’s why you have 6 months of summarized data instead of 16.
-- No one could come to a consensus on the definition of the value; that’s why we have 187 values for that field.

These are just a few reasons bad data is where it is.
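
To make the ‘greater than’ versus ‘less than’ slip concrete, here is a minimal sketch in Python. Everything in it is made up for illustration: the function names, the sample rows, and the 16-month window are assumptions, not taken from any real summarization script. It only shows how one flipped comparison quietly keeps the wrong months.

```python
from datetime import date


def months_between(earlier: date, later: date) -> int:
    """Whole calendar months from earlier to later (day of month ignored)."""
    return (later.year - earlier.year) * 12 + (later.month - earlier.month)


def keep_recent(rows, today, retention_months=16):
    """Keep rows whose month falls inside the retention window."""
    return [r for r in rows
            if months_between(r["month"], today) < retention_months]


def keep_recent_flipped(rows, today, retention_months=16):
    """The same filter with the comparison typed the wrong way round."""
    return [r for r in rows
            if months_between(r["month"], today) > retention_months]


# Hypothetical monthly sales summaries, for illustration only.
rows = [
    {"month": date(2009, 11, 1), "sales": 120},
    {"month": date(2009, 1, 1), "sales": 95},
    {"month": date(2008, 3, 1), "sales": 80},
]
today = date(2009, 12, 4)

good = keep_recent(rows, today)          # the two rows from 2009
bad = keep_recent_flipped(rows, today)   # only the stale row from 2008
```

Both versions run without complaint, which is why this kind of mistake tends to be noticed only when someone looks at the report and asks where the recent months went.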

What I’d like to know is what your reasons are, or the reasons you’ve heard. Feel free to send me a list of reasons why bad data existed in your system, application, database or report. I don’t want high-level reasons; let’s have the granular ones. When I get to 101 I’ll publish the list for all to see (no names, of course).

However, here’s another reason to think about: apathy.

Really, really.

To have good-quality, accurate data, all you need is a little TLC. For data to be accurate, people need to care just a little more about what they are doing. In the above examples, if people had given a little TLC, there would have been no bad data.

We live in a rushed, hurried world where everything is needed yesterday, so a little TLC is hard to come by.

1 comment:

  1. Great question... great examples...

    One of the major causes of data quality issues (or perceived data quality issues) is that, many times, not all requirements are considered when applications are designed. In particular, requirements around analytics are not considered up front. This causes issues further down the line.
    Here is one detailed example, as you requested. One of the clients I worked with had an SFA system implemented. They had a competitor field defined on Opportunity (because there would be a competitor in every deal). Well, this field was left optional. Obviously, sales reps did not enter this information most of the time, because the system allowed them to save Opportunity records without forcing them to enter competitor information.
    The customer started feeling the pain when competitive pressure built up and most of the opportunities were not tracking competitors. Secondly, they realized that there are often many competitors in a deal, and they wanted visibility into all of them.
    Again, this data quality issue got highlighted when a competitive analysis was done in response to heavy discounting and reduced win rates/conversion rates. Obviously, not having competitors tracked is perceived as a data quality issue, and it was analytics that highlighted it. Is this a data quality issue? Or is it more of a process issue? Many process issues surface as data quality issues.

    Vish Agashe