How do you measure/calculate an information quality quotient for a particular data set?
Question: How do you measure/calculate an information quality quotient for a particular data set (i.e., a single value of quality for a data set)?
Danette McGilvray's Answer: The good news is that information quality can be measured through various data quality dimensions. A data quality dimension is a way to measure and manage data and information quality. I reference 12 data quality dimensions that I believe are the most practical and useful for the business to measure and manage. Some of the dimensions are data integrity fundamentals, duplication, accuracy, consistency and synchronization, timeliness and availability, and data coverage. There is no industry standard for data quality dimensions. Choose those dimensions most applicable to your situation.
There is not room in this forum to describe each of the 12 dimensions, but let me point you to one of the first dimensions to measure: data integrity fundamentals. This dimension is a measure of the existence, validity, structure, content and other basic characteristics of data. It includes essential measures of completeness/fill rate, validity, frequency distributions and lists of values, patterns, maximum and minimum values, referential integrity, etc. All other dimensions build on this one, whether you are assessing your data for the first time to prepare source-to-target mappings, using assessment results to clean source data or develop transformation rules, or monitoring the data regularly within your production environment.
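As a rough illustration of a few of the measures named above (fill rate, validity, frequency distribution, patterns, minimum and maximum), here is a minimal Python sketch for a single column. The function names `profile_column` and `to_pattern`, the sample values and the list of valid state codes are all hypothetical and chosen only for illustration; the pattern notation (9 for a digit, A for a letter) is one common convention, not a standard.

```python
from collections import Counter


def to_pattern(value):
    """Reduce a value to its shape: digits become 9, letters become A."""
    out = []
    for ch in value:
        if ch.isdigit():
            out.append("9")
        elif ch.isalpha():
            out.append("A")
        else:
            out.append(ch)  # keep punctuation and spaces as-is
    return "".join(out)


def profile_column(values, valid_values=None):
    """Compute basic data integrity measures for one column (hypothetical helper)."""
    total = len(values)
    # A value counts as populated if it is neither None nor blank.
    populated = [v for v in values if v is not None and str(v).strip() != ""]
    fill_rate = len(populated) / total if total else 0.0

    validity = None
    if valid_values is not None and populated:
        valid_count = sum(1 for v in populated if v in valid_values)
        validity = valid_count / len(populated)

    return {
        "count": total,
        "fill_rate": fill_rate,
        "validity": validity,  # share of populated values found in valid_values
        "frequencies": Counter(populated),
        "patterns": Counter(to_pattern(str(v)) for v in populated),
        "min": min(populated) if populated else None,
        "max": max(populated) if populated else None,
    }


# Illustrative sample data: a state-code column with blanks and bad values.
states = ["CA", "CA", "NY", "", None, "ca", "XX", "TX"]
report = profile_column(states, valid_values={"CA", "NY", "TX"})
print(report["fill_rate"], report["validity"])
print(report["frequencies"].most_common(3))
print(report["patterns"])
```

Here validity is measured against populated values only, so that completeness and validity are reported separately; whether to do it that way is a design choice you would make for your own situation.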
To assess data integrity fundamentals, you will need to profile your data. Profiling can be accomplished by using one of the profiling tools available on the market. Profiling tools are sometimes referred to as analysis or discovery tools and provide the most comprehensive information about your data. You can also profile your data with other tools, such as SQL queries, a report writer for ad hoc reports or a statistical analysis tool. A caution: just having a tool is not the full answer. You need to make sure the processes around using the tool are also well planned and implemented.
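For the SQL route, the following is a minimal sketch that uses Python's built-in `sqlite3` module with an in-memory database. The `customer` and `orders` tables, their columns and the rows are hypothetical, invented only to show the shape of typical profiling queries for fill rate, frequency distribution and referential integrity.

```python
import sqlite3

# In-memory database with hypothetical sample tables.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, state TEXT)")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER)")
conn.executemany(
    "INSERT INTO customer (state) VALUES (?)",
    [("CA",), ("CA",), ("NY",), (None,), ("",), ("TX",)],
)
conn.executemany(
    "INSERT INTO orders (customer_id) VALUES (?)",
    [(1,), (2,), (99,)],  # customer 99 does not exist
)

# Completeness / fill rate: populated rows divided by all rows.
total, populated = conn.execute(
    """
    SELECT COUNT(*),
           SUM(CASE WHEN state IS NOT NULL AND TRIM(state) <> ''
                    THEN 1 ELSE 0 END)
    FROM customer
    """
).fetchone()
fill_rate = populated / total

# Frequency distribution of the populated values.
distribution = conn.execute(
    """
    SELECT state, COUNT(*) AS n
    FROM customer
    WHERE state IS NOT NULL AND TRIM(state) <> ''
    GROUP BY state
    ORDER BY n DESC, state
    """
).fetchall()

# Referential integrity: orders that point to a missing customer.
orphans = conn.execute(
    """
    SELECT o.id
    FROM orders AS o
    LEFT JOIN customer AS c ON c.id = o.customer_id
    WHERE c.id IS NULL
    """
).fetchall()

print(fill_rate, distribution, orphans)
```

Queries like these cover only what you think to ask for, which is one reason a dedicated profiling tool tends to give a more comprehensive picture.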
If you are looking for a single data quality indicator, you will need to measure various dimensions important to you and combine them into a data quality index. The index is a single indicator, which is actually a compilation of several measures. Of course, any data quality results that are reported should be explained so those utilizing the reports understand what is being measured.
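One possible way to build such an index is a weighted average of per-dimension scores, sketched below. The weighted average itself is an assumption, since no particular formula is prescribed here, and the function name `data_quality_index`, the dimension names, the scores and the weights are illustrative only.

```python
def data_quality_index(scores, weights):
    """Combine per-dimension scores (0.0 to 1.0) into one weighted average.

    Hypothetical helper: the weighting scheme is an assumption, not a standard.
    """
    if set(scores) != set(weights):
        raise ValueError("scores and weights must cover the same dimensions")
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    weighted_sum = sum(scores[dim] * weights[dim] for dim in scores)
    return weighted_sum / total_weight


# Illustrative values: each score is the share of records passing that measure.
scores = {"completeness": 0.90, "validity": 0.80, "duplication": 1.00}
# Illustrative weights: completeness matters twice as much as the others.
weights = {"completeness": 2, "validity": 1, "duplication": 1}

index = data_quality_index(scores, weights)
print(f"Data quality index: {index:.2f}")
```

Because a single number hides which dimension is dragging the result down, it helps to publish the component scores and weights alongside the index so report users can see what is being measured.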
Additional information on data integrity fundamentals, profiling tools and the other data quality dimensions can be found in my upcoming book, Executing Data Quality Projects: Ten Steps to Quality Data and Trusted Information, available summer 2008.
Danette McGilvray is president and principal of Granite Falls Consulting, Inc., a firm specializing in information quality management to support key business processes around customer satisfaction, decision support and operational excellence. Projects include enterprise data integration programs, data warehousing strategies and best practices for large-scale ERP data migrations for Fortune 50 organizations. For more than ten years she led information quality initiatives at Hewlett-Packard and Agilent Technologies. An accomplished program manager and facilitator, she is an internationally respected expert on data profiling, metrics, quality, audits, benchmarking, and tool acquisition and implementation. McGilvray is an invited speaker at conferences throughout the U.S. and Europe, where she trains other industry experts in enterprise information management and data stewardship. You can reach her at danette@gfalls.com.