Poor data quality costs over $600 billion annually as per TDWI surveys and studies.
Whether you are the VP looking over your latest sales, the secretary compiling a mailing list, the data quality analyst rummaging through data sets, the business analyst working on a data integration project, or an accountant going over the projected budget.
Whether you are in a large Fortune 500 company, in the government, or a small community association you will benefit.
Duplicate records of customers cause discontented customers and multiple mail-outs. How is it possible to have duplicate records?
-- Records are manually entered twice;
-- Processes create record twice;
-- Participants registered under multiple names;
-- Participants registered at multiple locals.
So before you send out a mailing list to your community members, or potential program participants for marketing campaigns or program sign-ups for this year’s sports, arts, sales, and/or membership drive seasons:
-- 1. Sort that list by name. Why because you may have the child on the list twice, or three times; after sorting by name;
-- 2. Sort by address, you may have the same family household multiple times because one the parents registered under their name, and/or siblings are registered with your organization as well.
Don’t know how to sort…here’s a good way to start if you’re using Microsoft Excel…highlight your records and click on this little icon , it’s easy as breaking eggs.
New SQL type programming here’s a simple query that will identify duplicate records.
SELECT attribute1, attribute2, attribute3, attributen… count(*)
GROUP BY attribute1, attribute2, attribute3, attributen…
HAVING (COUNT(*) > 1)
I hope this helps anyone working with customer lists..