Whereas data previously arrived through a limited number of channels (mail or telephone only, for example) at a slower and more leisurely pace, current and new business processes invariably create new data, gathered at faster rates and in greater volumes. Data now also comes from e-commerce, loyalty program signups, purchase history, mobile location tracking, logistics planning and so on; and whilst old channels of communication may decline, they don’t go away, while additional new channels appear at a rapid rate – email, social media, web pages, mobile telephones, tablets and smart watches, to name but a few.
Both the volume and the velocity of data are increasing, and many organisations struggle to stay afloat in this ocean of data, losing sight of the need to maintain high levels of data quality. Like many areas of business, data management is subject to trends and the dictates of fashion. It is vital, however, that data managers do not lose focus on collecting correct and accurate core data, such as personal names and addresses, which remain at the heart of most corporate database systems.
Traditional addressing lies at the core of most databases and is usually an essential element of any data project. It remains the most reliable way of identifying and matching a person within and between data files. Enrichment data, such as geocodes, purchasing patterns, loyalty programs and address encodings, must hook into it. It must therefore be accurate and of the highest quality to ensure that it supports the rigorous uses to which the data will be put.
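The idea of the address as the hook for enrichment can be sketched in a few lines of Python. This is a deliberately crude illustration under stated assumptions – real address matching uses sophisticated parsing and reference data, and the addresses, IDs and coordinates below are examples only:

```python
# A minimal sketch of enrichment keyed on a normalised address.
# The normalisation is deliberately crude; production address matching
# relies on proper parsing, standardisation and reference files.

def normalise(address: str) -> str:
    """Crude normalisation: uppercase, drop punctuation, collapse whitespace."""
    cleaned = "".join(ch for ch in address.upper() if ch.isalnum() or ch.isspace())
    return " ".join(cleaned.split())

records: dict[str, dict] = {}

def enrich(address: str, **attributes) -> None:
    """Attach enrichment data (geocodes, loyalty data, ...) to the address key."""
    records.setdefault(normalise(address), {}).update(attributes)

# Two spellings of the same (example) address resolve to one record,
# so enrichments from different sources accumulate against it.
enrich("10 Downing St., London", geocode=(51.5034, -0.1276))
enrich("10 Downing St, London", loyalty_id="L-1234")  # hypothetical ID
```

The point is structural: if the address key is dirty or inconsistent, every enrichment attached to it fragments too.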
An important but often overlooked component of proper data management, and the root cause of many failures, is the need for a deep understanding of the real-world construct that the data represents. Data workers often treat their data as an end in itself and are quick to forget that it usually represents tangible reality. Understanding international data – personal names, addresses, dates, number systems and so on – is a case in point. Many projects fail because, though the data is technically sound, the understanding of what that data represents is missing, so it is collected, stored and used incorrectly. When data is understood, and its real-world representation equally so, a core of data can be created and extended without limit to provide greater scope and to future-proof that data. For example, locational information such as latitude and longitude, or any number of address encodings, can be added to a street address; loyalty scheme data, purchasing history, usage statistics and so on can be added to an individual’s records. Good data management and understanding allow data to be used and managed without borders.
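A small, concrete example of the international-data trap described above is date interpretation, where the same string is technically valid under two conventions but represents two different real-world dates. The snippet below uses Python's standard library only; the date shown is an arbitrary example:

```python
from datetime import datetime

raw = "01/02/2023"  # ambiguous: 1 February or 2 January?

# Interpreted with a day-first convention (common in much of Europe):
day_first = datetime.strptime(raw, "%d/%m/%Y")

# Interpreted with a month-first convention (common in the US):
month_first = datetime.strptime(raw, "%m/%d/%Y")

print(day_first.date().isoformat())    # 2023-02-01
print(month_first.date().isoformat())  # 2023-01-02

# Both parses succeed, so the data is "technically sound" either way;
# only knowledge of the source's convention tells you which is correct.
# Storing dates in an unambiguous form such as ISO 8601 (YYYY-MM-DD)
# removes the ambiguity at the point of capture.
```

The same pattern recurs with decimal separators, name ordering and address formats: the data validates, but the meaning is wrong.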
Data needs to flow between entities and departments within complex organisations – health providers, global retailers and local government, for example – in a cohesive and standardised manner. Data is collected at varying points and in varying ways, and risks becoming fragmented and isolated within data silos. In health services, for example, there are locational differences in data collection: a GP or hospital might collect certain types of data for the specific segment of the population it serves, yet this data needs to be available to, and easily integrated with, systems elsewhere when patients move or are transferred – there needs to be interoperability. Such systems can involve large numbers of staff, administrative and (in this case) medical, creating data and injecting it into the system. Training all of these staff to conform to a set of data quality norms and templates is impossible – in cases like these, data quality needs to be achieved through clever design and data validation technology.
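One form the "clever design and data validation technology" mentioned above can take is validation at the point of entry, so staff cannot inject malformed values regardless of training. The sketch below checks postal codes against per-country patterns; the patterns are deliberately simplified examples (real national rules are far more complex) and the country set is illustrative:

```python
import re

# Illustrative, simplified postal-code patterns.
# Real validation would use official reference data, not just regexes.
POSTCODE_PATTERNS = {
    "US": r"\d{5}(-\d{4})?",   # 12345 or 12345-6789
    "NL": r"\d{4} ?[A-Z]{2}",  # 1234 AB
    "DE": r"\d{5}",            # 10115
}

def is_plausible_postcode(country: str, code: str) -> bool:
    """Return True if the code matches the (simplified) national pattern."""
    pattern = POSTCODE_PATTERNS.get(country)
    if pattern is None:
        # Unknown country: don't reject outright; flag for review instead.
        return True
    return re.fullmatch(pattern, code.strip().upper()) is not None
```

Because the check lives in the system rather than in a training manual, every entry point – GP surgery, hospital ward, back office – enforces the same norm automatically.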
Organisations need to create a culture in which data quality is the default, built in by design. Having good-quality data and using it effectively are two very different things. The difference between businesses that handle their data well and those that don’t is that the former at least have the option of making their decisions based on complete and accurate information.
Graham Rhind is an acknowledged expert in the field of data quality. He runs his own data consultancy company, GRC Database Information, in Germany, where he researches postal code and addressing systems, collates international data, runs a busy postal link website and writes data management software. Graham also regularly speaks on the subject and is the author of Building and Maintaining a European Direct Marketing Database published by Gower.