Thursday, November 12, 2009

Book Review: Viral Data in SOA


“Virus: A microorganism smaller than a bacteria, which cannot grow or reproduce apart from a living cell. A virus invades living cells and uses their chemical machinery to keep itself alive and to replicate itself. It may reproduce with fidelity or with errors (mutations)-this ability to mutate is responsible for the ability of some viruses to change slightly in each infected person, making treatment more difficult.” Medicine.net

In the early stages of the flu season, it is only appropriate we take a quick look at viral infections.

With discussions about service-oriented architecture, concerns about data quality and data management will be brought to the forefront in any organization. As bad data infects one portion of the company, it will easily flow through to other modules, databases, process flows, reports and decision points. One must be vigilant in monitoring data and managing it.

What did I like about this book? Everything. What didn’t I like about it? Not much.

“Service-oriented architectures are intended to encourage solution builders to create offerings that can readily transcend point-in-time solutions.”

The author, Neal A. Fishman, talks about data governance and the critical role communication plays. He notes that data governance can be handled in both a proactive and a reactive manner. He identifies what needs to be done to enforce data governance in a SOA environment, and how control points can govern data quality. Those control points are:

-- 1. Ensure: Controls for operating
-- 2. Assure: Controls for performing
-- 3. Insure: Controls for sustaining
-- 4. Reassure: Controls for continuity

He describes data quality and data governance within the SOA environment in great detail, and states:

“The effectiveness of data governance depends on how the governance body reacts and adapts to the cultural environment.”

With that, the author describes a dialing system for tweaking operations. ED-SODA provides the dimensions needed to adjust the data governance process. It can be used for virtually any culture, if not all.

And if you are having issues with building a data governance model, step into the reference model, for this is where you will get your basics for developing controls in data governance.

Even though the book is about data in a SOA environment, this is a book for every data analyst, particularly the sections on data quality and data governance, the myriad of thought-provoking points throughout, and the account of how bad data can become viral. These points, and the examples of what others have done, will provide insight into your own issues and processes.


Friday, October 2, 2009

September Festival del IDQ Bloggers


With the month of September come and gone, the leaves starting to change colour, hockey season starting and the wind getting colder by the day, we found another month filled with interesting posts about data quality. This month, I am happy to say, I'm hosting September's "Festival del IDQ Bloggers".

The festival is an annual data quality blogging carnival held by the International Association for Information and Data Quality (IAIDQ), an international not-for-profit association dedicated to the development of the information and data quality profession.


The following is a quick list. I say quick, but it actually took an excruciatingly long time to compile: I spilt coffee on my keyboard, hit my head on the light and stubbed my toe in the process ;-)
On with the data quality blog round-up...
First up is the DoBlog (http://obriend.info/), the personal blog of Daragh O Brien, IAIDQ Director and Information Quality consultant and writer. Since 2006 Daragh has been writing about Information Quality related topics (amongst other things) on this blog and has even won an Obsessive Blogger award for his writing on Information Quality topics.

We find two posts of interest, one about the law and the other about market research.

Blog Post: http://obriend.info/2009/09/25/finding-red-herrings-or-missing-a-trick/

Market Research often falls foul of poor quality information about the target sample population. In this post Colin Boylan (a freelance Market Researcher) discusses some of the issues that can lead to you chasing Red Herrings or just Missing a Trick.

Colin Boylan is a freelance market researcher living and working in Ireland. He has worked with many of the leading market research firms in Ireland and the UK, with particular experience in Pharmaceutical studies (where good quality data is essential).

Also in the same blogging journal we have an interesting tale about the law looking at data quality...



Blog Post: http://obriend.info/2009/09/29/a-game-changer-ferguson-v-british-gas/

For about 4 years, Daragh has been hammering on about how poor quality information could get an organization sued. In January it happened, with a very clear and explicit ruling in the Court of Appeal for England and Wales that sets a very interesting legal precedent (binding in England and Wales and persuasive in all other Common Law jurisdictions such as Ireland, Canada, Australia, India, USA, Pakistan....). This post (based on an article Daragh wrote for the IAIDQ in April) looks at that case and the implications for Information Quality professionals.

Daragh O Brien is a Director of IAIDQ and a Fellow of the Irish Computer Society. After escaping from 12 years of indentured servitude in a leading Irish Telco, he is in the process of establishing a specialist Information Quality Management and training practice. He is also writing a book on legal issues in Information Quality with Fergal Crehan, a prominent Irish barrister (lawyer).

No blog carnival is complete without a post from the obsessed Jim Harris. In this short but sweet post, Jim talks about knowledge and the fact that we know what we know, and we don’t know the rest. Something to think about as you read Jim’s post.






Jim’s OCDQ blog is an independent blog offering a vendor-neutral perspective on data quality, a place where he offers a diversity of viewpoints in a collaborative environment. Jim himself is an independent consultant, speaker, writer and blogger with over 15 years of professional services and application development experience in data quality (DQ), data integration, data warehousing (DW), business intelligence (BI), customer data integration (CDI), and master data management (MDM). Jim has worked with Global 500 companies in finance, brokerage, banking, insurance, healthcare, pharmaceuticals, manufacturing, retail, telecommunications, and utilities.

Jumping across the pond and over to Sweden, I’ll take a moment to say hi to the Ericssons: Brit, Mikael, Max, Guztav and Hanah, I hope all is well. Then a quick move to Denmark, where we have DQ blogger Henrik Liliendahl Sørensen, a man of many talents who has worked for over 20 years in applications, databases and data in general. Henrik has demonstrated his expertise in business directory matching and international aspects of data quality improvement and master data management.
Henrik’s blog, Liliendahl on Data Quality, is a collection of his personal opinions, experiences and observations around data quality, accumulated over decades, and I do mean decades, of experience.

Henrik discusses the multi-use potentials of data quality...could it be...can data quality be used for increasing revenues, and for marketing...read on and find out.


The post has a follow-up post sparked by the comments: http://liliendahl.wordpress.com/2009/09/27/process-of-consolidating-master-data/

Going across the globe to that lovely locale known as Australia, we have Vincent McBurney, a manager with Deloitte Consulting in Australia who has 15 years of experience as an application programmer, database programmer, ERP implementer and information management consultant. His blog is dedicated to a tool-based approach to data integration, with news and tips on IBM InfoSphere, Informatica, Oracle, Microsoft and any breaking data integration news.

This particular post is interesting because it talks about something we all like: fudge. But not that kind of fudge, no way. This fudge is about fudging things in the moment and then having to apply some kludge techniques to fix the situation. Fudge and kludge, all around a great read.

Blog Post: http://it.toolbox.com/blogs/infosphere/the-data-quality-and-how-to-fudge-it-34289

From Data Quality Pro… Dylan Jones provides us with an excellent interview with Ken O’Connor, who discusses a means of creating a data issue assessment process, something every DQ team should have in place, and that’s why it’s here. Coming from the trenches, this is something any data quality analyst can use over and over again.


Data Quality Pro is an online community resource that is solely dedicated to the needs and development of data quality professionals everywhere. Dylan Jones is the founder and editor of Data Quality Pro and Data Migration Pro, leading online knowledge centre and community sites for their respective professions. With a 15 year background in data quality and data migration, Dylan now supports a global community of several thousand professionals who actively collaborate and contribute to help increase the collective knowledge in these fields.

Here’s an interesting read about data governance from Gwen Thomas. Gwen Thomas is Founder and President of The Data Governance Institute, which is the premier provider of in-depth, vendor-neutral information about, and assistance with, tools, techniques, models, and best practices for the governance/stewardship of data and information. This is Gwen’s personal blog from the Data Governance Institute.

This post is here because data quality is a big piece of data governance. Data governance provides guidance in defining quality. There is a symbiotic relationship between the two.
With this post we get a high-level view of net-centric governance and the potential issues of control one may have with it. It gets the wheels turning when you begin to think of the implications that could very well follow.

Blog Post: http://datagovernancematters.com/2009/09/14/net-centric-data-governance-not-for-sissies/


Finally…
After hearing about an old friend of mine working hard in a data quality team, I was surprised to learn she still has to justify the team’s existence. It’s always good to know what you’re doing and the value you bring to the table, but with this being the 3rd time within a year, I believe enough is enough… so here it is. Who am I? Well, I am this guy: business analyst by day, data quality analyst by night. My blog, Data Quality Edge, is a place to voice my opinions and offer what I hope is a grassroots look at data quality, something for the data quality analyst in the trenches, because they are the ones who get the job done.


Wishing you all the best in the cooler months ahead! Good reading.


Dan



Wednesday, September 16, 2009

Stop Justifying Data Quality Programs and Do the DQ Work Already!

In a recent discussion with a good friend, I learned that they are in the middle of justifying their work in a data quality team. A few months ago they were doing the same thing, and at the beginning of the year they had just wrapped up another justification project. At the beginning of the economic downturn, it was being done as well. I also know that a few years ago, when I was with the team, we had to do it too.

It's a shame. A terrible shame! Some organizations understand the importance of data quality, but sometimes that understanding has come at a cost. They have:


• Lost thousands to millions;
• Faced national embarrassment;
• Or made significantly big policy screw-ups.

Other organizations are more proactive and have established a data quality team and program to prevent such events from happening, an activity that is considered a best practice and essential to any information technology/business intelligence structure.

However, in either case, you may have someone, traditionally a senior manager, who sees data quality as a cost, a black hole. Yes, there is a cost; however, the benefits outweigh the costs in a variety of ways:

• Reduction in re-work due to good data quality;
• Improved incoming data quality and data processing due to pro-active initiatives with incoming data migration and integration projects;
• Proactively preventing data quality issues from occurring;
• Improved decision making, using quality data, and more.


To my old team and senior management:


Stop with the justification exercises and begin looking at the benefits and what this dedicated group of data quality analysts has accomplished year after year:

• Recognized by TDWI as a Best Practice finalist in DQ;
• Hundreds of data modelling, metadata, data processing and data corrections to incoming projects per year;
• Proactively seeks data processing improvements to improve data loads - ultimately reducing costs;
• Client support to decision makers who really don't understand the technology aspects of the data and its routines;
• Dozens of change management practices each year to improve data quality and data processing, which collectively prevent lost revenues, increase sales and manage maintenance costs by reducing reruns and supporting programs such as customer profitability and other CRM initiatives;
• The estimated benefits weigh in at an average of $1-1.5 million a year if not more.

Another justification exercise only takes the team away from doing what needs to be done: data quality.

So, to the senior management in this organization and any other: yes, there is a cost to any data quality program. Just remember that a data quality team is the vanguard of any organization that deals heavily in data. They bring in benefit. They enable your decision makers. They protect your greatest asset - data!

A good DQ team = Great Value!

Friday, August 28, 2009

New to Data Quality Analysis? Try These “9+1 Things To Do”!



Did you just get moved over from one data warehouse support group to another? Do you know nothing or very little about the data in your new data warehouse? Or are you new to data quality analysis and want to get started on some solid footing?

The post “9 things to do when you inherit a database” by Sylvia Moestl Vasilik at SQLServerCentral.com is an excellent article for anyone jumping into a new database environment. Regardless of the environment, vendor or type of database (relational or columnar), Sylvia’s 9 things to do can be applied anywhere.

Building on those “9 things”, if you are less technical and more into data quality analysis or into a data steward role, I recommend adding a 10th thing to do … begin and complete a data profile.

A solid data profile will provide you with a wealth of information and some interesting insight into the data. Here are a few items that you should be able to capture with a good data profile project.

-- You will gain an understanding of the completeness of the data. You’ll see what’s missing, and you can begin to ask the business users why this component of the data set(s) is missing.

-- How accurate is the data? Does it meet the initial requirements or not? How often does a job fail because of bad data? Have you lost customers or revenues, or received fines, due to bad data? You’ll discover soon enough how inaccurate data affects your organization.

-- How timely is the data? Do you have real-time, near real-time or less timely data? Is your data arriving late, on time or not at all? How long is the data relevant for? This will be important for you, your users and maintaining the environment.
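
For the more technical reader, the three checks above can be sketched in a few lines of Python. This is only a rough illustration, not a full profiling tool: the function names, the sample rows, the field names (customer_id, email, balance, expected, arrived) and the one-hour tolerance are all made up for the example, and the "validity" rule is just a simple stand-in for accuracy, which in practice has to be checked against the business requirements.

```python
from datetime import datetime, timedelta


def completeness(rows, fields):
    """Share of rows with a non-empty value, per field."""
    total = len(rows) or 1  # avoid dividing by zero on an empty data set
    return {
        f: sum(1 for r in rows if r.get(f) not in (None, "")) / total
        for f in fields
    }


def validity(rows, rules):
    """Share of rows passing each rule; a rough proxy for accuracy."""
    total = len(rows) or 1
    return {
        name: sum(1 for r in rows if check(r)) / total
        for name, check in rules.items()
    }


def timeliness(rows, expected_field, arrived_field, tolerance):
    """Count rows that arrived on time, late, or not at all."""
    counts = {"on_time": 0, "late": 0, "missing": 0}
    for r in rows:
        arrived = r.get(arrived_field)
        if arrived is None:
            counts["missing"] += 1
        elif arrived - r[expected_field] <= tolerance:
            counts["on_time"] += 1
        else:
            counts["late"] += 1
    return counts


# Hypothetical sample rows; every field name and value is illustrative.
due = datetime(2009, 8, 1, 6)
rows = [
    {"customer_id": "C001", "email": "ann@example.com", "balance": 120.0,
     "expected": due, "arrived": datetime(2009, 8, 1, 5)},
    {"customer_id": "C002", "email": "", "balance": -15.0,
     "expected": due, "arrived": datetime(2009, 8, 1, 9)},
    {"customer_id": "C003", "email": None, "balance": 40.0,
     "expected": due, "arrived": None},
    {"customer_id": "C004", "email": "dan@example.com", "balance": 0.0,
     "expected": due, "arrived": datetime(2009, 8, 1, 6)},
]

completeness_report = completeness(rows, ["customer_id", "email"])
validity_report = validity(
    rows, {"balance_not_negative": lambda r: r["balance"] >= 0}
)
timeliness_report = timeliness(rows, "expected", "arrived", timedelta(hours=1))

print(completeness_report)
print(validity_report)
print(timeliness_report)
```

The numbers that come out are only the starting point. The real value is in the questions they let you take back to the business users, such as why half of the email values in this sample are empty.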

Just remember to focus first on the most important data, the highly used data; then you can spread out and tackle the rest of the data warehouse. Make sure you have senior management approval, and are able to prioritize the other 9 things to do along with this one.

Other items you can gather while running a data profile project can be identified from the following post: “5 Non-Quality Items to Consider in Data Profiling”.