
Guest post by David Raab

Marketing data quality used to be simple: the more comprehensive and accurate a data set, the higher its quality rating. That made sense when most marketing programs were targeted at known individuals. Knowing exactly who you were reaching (accuracy) and reaching as many qualified people as possible (coverage) were the keys to success.

But today, many marketing programs reach people you don’t know. They’re targeted through locations, Web behaviors, search terms, social networks, and other methods that isolate useful groups whose members remain otherwise anonymous. The quality of those data sources is measured with new metrics that vary considerably depending on the particular marketing program.

Consider speed. Identifying changes quickly was always a part of the traditional accuracy measures, but elements like name and address don’t change very often. By comparison, a marketing program based on location may need to react instantly to someone walking past your physical or virtual storefront. So it’s critical to measure how quickly a data source acquires location information, transmits it, and lets you respond with a relevant message.
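
To make that concrete, here is a minimal Python sketch of what such a speed measurement might look like. The event structure, field names, and timestamps are invented for illustration and don’t reflect any particular vendor’s feed; the point is to split total delay into the part the data source controls (capture to delivery) and the part you control (delivery to response).

    from datetime import datetime
    from statistics import median

    # Illustrative location events; the field names are assumptions for this
    # sketch, not a standard feed format.
    events = [
        {"captured_at": datetime(2013, 6, 1, 12, 0, 0),    # shopper passes the store
         "received_at": datetime(2013, 6, 1, 12, 0, 4),    # feed delivers the event
         "responded_at": datetime(2013, 6, 1, 12, 0, 9)},  # offer goes out
        {"captured_at": datetime(2013, 6, 1, 12, 5, 0),
         "received_at": datetime(2013, 6, 1, 12, 5, 2),
         "responded_at": datetime(2013, 6, 1, 12, 5, 31)},
    ]

    def seconds_between(start, end):
        return (end - start).total_seconds()

    # How long the source took to deliver each event vs. how long we took to act on it.
    transmission_lag = [seconds_between(e["captured_at"], e["received_at"]) for e in events]
    response_lag = [seconds_between(e["received_at"], e["responded_at"]) for e in events]

    print("Median transmission lag (seconds):", median(transmission_lag))
    print("Median response lag (seconds):", median(response_lag))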

Other new measures include:

  • Reliability: many of the new data sources involve real-time or near-real-time data connections. A source that is frequently unavailable or suffers unpredictable transmission lags may be significantly less valuable than a more dependable one. Reliability often interacts with speed: data that is sometimes current but other times late can cause expensive gaffes (like paging Elvis after he has left the building) that are worse than not delivering the message at all. A simple way to track availability and lag is sketched after this list.
  • Consistency: this is a close cousin to the traditional measure of accuracy. The difference is that many new data feeds are themselves aggregated from multiple sources of varying quality, so you need to watch carefully how average accuracy changes over time and across sub-segments. For example, data that infer consumer interests from the Web sites people visit may be very good at identifying shoppers in the market for a new car but less effective at isolating heavy users of consumer packaged goods.
  • Specificity: how precisely does the source allow you to target? It’s one thing to identify book readers and another to find people who are interested in Civil War history. Similarly, location-based targeting might be as broad as a metropolitan area or as narrow as a specific street corner. As with all quality measures, the level you require will depend on how you’ll use the data.
  • Clarity: you can’t always tell what a particular attribute actually measures. Proprietary scores for “social media influence” are one example – even if the vendor explains how they’re calculated, it’s often not obvious what to make of them. Seemingly concrete classifications can also be fuzzy: what, exactly, makes a sewing machine “portable” or a company “small”? The answer matters because decision rules may be based on assumptions that are incorrect.
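
To illustrate the reliability point above, here is a rough Python sketch of the kind of availability-and-lag tracking a feed might get. The log records and the five-second “too late to act on” threshold are assumptions chosen for the example, not industry standards.

    from statistics import mean

    # Hypothetical feed log: one record per expected delivery.
    # Field names and values are invented for this sketch.
    feed_log = [
        {"available": True,  "lag_seconds": 1.2},
        {"available": True,  "lag_seconds": 0.8},
        {"available": False, "lag_seconds": None},  # feed was down
        {"available": True,  "lag_seconds": 7.5},   # arrived, but too late to act on
    ]

    LATE_THRESHOLD_SECONDS = 5.0  # assumed cutoff for a usable, on-time delivery

    availability = mean(1.0 if r["available"] else 0.0 for r in feed_log)
    on_time = [r for r in feed_log
               if r["available"] and r["lag_seconds"] <= LATE_THRESHOLD_SECONDS]
    usable_rate = len(on_time) / len(feed_log)

    print(f"Availability: {availability:.0%}")          # 75%
    print(f"Usable (on-time) rate: {usable_rate:.0%}")  # 50%

The same pattern extends to consistency: rather than reporting a single overall accuracy number, score accuracy separately by sub-segment and track how it moves over time.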

These new measures are more closely tied to specific marketing programs than the traditional, general measures of accuracy and coverage. Indeed, many new marketing programs are only viable if their data sources meet specific standards. This program-driven approach expands the challenge of picking the right quality measures, but it also simplifies the calculations needed to identify the financial impact of quality improvements. The resulting clarity may lead to a new golden age of data quality – if we can develop the tools to measure it.

David M. Raab is a consultant specializing in marketing technology and analytics. He is author of The Marketing Performance Measurement Toolkit and B2B Marketing Automation Vendor Selection Tool. See www.raabguide.com for more information.