Tuesday, March 2, 2010

A gold DQ team!


With the Olympic Games come and gone, and the NHL hockey season fast approaching the playoffs, I cannot help but take a moment, although brief, to state that a golden data quality team should resemble a golden hockey team.


Now I’m not going to say who’s perfect in hockey; I’m not here to wave flags. It’s the skills that count. In data quality you want people who have all these skills:


The offense:
-- They put the pressure on the other team (the delivery team, the source team) to get what they want: a winning situation. They proactively look for the bad data, get rid of it, and create plans for cleaning the data or preventing the situation from arising again.
-- You need people with technical skills, people who can read and analyse data models with ease.
-- The offense will come back and support the defense to help them get that bad data out, based on their knowledge and skills.
-- They have the ability to set the stage for an opportunity for improvement.

The defense:
-- The defenders on a data quality team see the big picture.
-- They know what’s coming down the pipe before the others do.
-- They drive bad data to the corners to prevent it from getting in.
-- They keep the bad data out by supporting the offense.

The netminder:
-- If the defenders see the big picture, the netminder sees the wall it’s hanging on.
-- The netminder’s job is solely to ensure bad data does not get in. If it does, you will be basing some bad decisions on their mistakes.
-- The netminder has quick reflexes, and understands exactly how to prevent the situation from occurring.

Every member of your team will work together to support the others, and each will have skills from every category. As a team of data quality analysts, they must keep your reports and databases clean so your decision makers can make sound, quality choices for your organization. Now that's golden.

Friday, January 29, 2010

Coaching Data Quality to Skate on Ice


Learning to skate is much like learning data quality.

It holds great rewards when done right, and on the other hand it can be cold and unforgiving.

When you take on a new skater, you give them a helmet, take them by the hand, and lead them onto the ice.

-- With a new analyst, you take them in, show them around, and show them the work, the data, the data model and the business. You are there to help. You tell them to stop by if they have any questions.


You teach them how to bend at the knees, and push off with their legs, first one side and then the next.

-- You show them where to go in the data architecture to find problems and correct them. You show them that even if a user says the data is wrong, it may very well be correct, and the user simply does not understand it. It's all about positioning.

You coach them how to stop, or else they will keep on going into the boards.

-- You coach them that they must stop. They must stop and look, and always assess what gives the best return to the business.


You teach them balance and composure. You give them self-confidence so that they feel the cool air on their face and have the ability to take a step forward and skate on their own. And yes, they will fall.

-- They will succeed, and yes, they will fail. However, with the proper coaching they will not give up, and they will push on for the betterment of your organization.

Friday, December 4, 2009

Some TLC for Your Data

Did you ever wonder why the data error was entered into the system, database, or report that’s right there in front of you?

You can look at it a hundred different ways.

-- The customer entered the data wrong on the website.
-- The call centre rep entered the data wrong on the website.
-- The sales rep forgot to enter his sales for the month of September and keyed them in for October.
-- The programmer entered the wrong statement in the data integration script.
-- The programmer put a ‘greater than’ instead of a ‘less than’ in the summarization script.
-- The business analyst did not provide the correct data retention requirements, and that’s why you have 6 months of summarized data instead of 16 months.
-- No one could come to a consensus on the definition of the value, and that’s why we have 187 values for that field.

These are just a few reasons bad data is where it is.
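
A couple of these reasons, such as a field with far too many values or a summary with fewer months than expected, can be caught with a simple profiling check. Here is a minimal sketch in Python. The field names (`status`, `month`), the agreed list of values and the sample records are all made up for illustration; they do not come from any real system.

```python
from collections import Counter

# Hypothetical agreed-upon values for a field; in practice this list
# comes from whoever owns the definition of the field.
ALLOWED_STATUS = {"active", "closed", "pending"}


def unexpected_values(records, field, allowed):
    """Count the values in `field` that are not on the agreed list."""
    counts = Counter(str(r.get(field, "")).strip().lower() for r in records)
    return {value: n for value, n in counts.items() if value not in allowed}


def months_of_history(records, field="month"):
    """Count distinct 'YYYY-MM' values, to compare with the retention requirement."""
    return len({r[field] for r in records if r.get(field)})


# Made-up sample records, for illustration only.
records = [
    {"status": "Active", "month": "2009-09"},
    {"status": "actve", "month": "2009-10"},
    {"status": "CLOSED", "month": "2009-10"},
    {"status": "n/a", "month": "2009-11"},
]

print(unexpected_values(records, "status", ALLOWED_STATUS))
print(months_of_history(records))
```

A check like this only tells you that something is off. Finding out why still means talking to the people in the list above.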


What I’d like to know is what your reasons are, or the reasons you’ve heard. Feel free to send me a list of reasons why bad data existed in your system, application, database or report. I don’t want high-level reasons; let’s have the granular reasons. When I get to 101 I’ll publish the list for all to see (no names, of course).

However, here’s another reason to think about: apathy.

Really, really.

To have good-quality, accurate data, all you need is a little TLC. For data to be accurate, people need to care just a little more about what they are doing. In the above examples, if people gave a little TLC, there would be no bad data.

We live in a rushed, hurried world where everything is needed yesterday, so a little TLC is hard to come by.

Thursday, November 12, 2009

Book Review: Viral Data in SOA


“Virus: A microorganism smaller than a bacteria, which cannot grow or reproduce apart from a living cell. A virus invades living cells and uses their chemical machinery to keep itself alive and to replicate itself. It may reproduce with fidelity or with errors (mutations)-this ability to mutate is responsible for the ability of some viruses to change slightly in each infected person, making treatment more difficult.” Medicine.net

In the early stages of the flu season, it is only appropriate we take a quick look at viral infections.

With discussions about service-oriented architecture, concerns about data quality and data management will be highlighted in any organization. As bad data infects one portion, it will easily flow through to other modules, databases, process flows, reports and decision points of your company. One must be vigilant in monitoring and managing data.

What did I like about this book? Everything. What didn’t I like about it? Not much.

“Service-Oriented architectures are intended to encourage solution builders to create offerings that can readily transcend point-in-time solutions.”

The author, Neal A. Fishman, talks about data governance and the critical role communication plays. He notes that data governance can be handled in both a proactive and a reactive manner. He identifies what needs to be done to enforce data governance in a SOA environment, and how the control points can govern data quality. Those points are:

-- 1. Ensure: Controls for operating
-- 2. Assure: Controls for performing
-- 3. Insure: Controls for sustaining
-- 4. Reassure: Controls for continuity

He describes data quality and data governance within the SOA environment in great detail, and states:

“The effectiveness of data governance depends on how the governance body reacts and adapts to the cultural environment.”

With that, the author describes the dialing system used to tweak operations. ED-SODA provides the dimensions needed to adjust the data governance process. It can be used for virtually any culture, if not all.

And if you are having issues with building a data governance model, step into the reference model, for this is where you will get your basics for developing controls in data governance.

Even though the book is about data in a SOA environment, this is a book for every data analyst, particularly the sections on data quality and data governance, the discussion of how bad data can become viral, and the myriad thought-provoking points throughout. These points and examples of what others have done will provide insight into your own issues and processes.