
Intelligent Customer Data Integration: Providing a Sound Foundation for CRM
This article explores why the single most challenging aspect is to recognize and determine the severity of data quality issues and face the problem head-on.
Corporate data is a key strategic asset, so ensuring its quality is imperative. Due to the tremendous amount of data being gathered and its variety of sources, data quality is often compromised, a common problem many organizations are reluctant to admit and address. This article explores why the single most challenging aspect is to recognize and determine the severity of data quality issues and face the problem head-on. Spending money, time, and resources to collect massive volumes of data without ensuring its quality is futile and only leads to disappointment. Cleansing data at the source significantly enhances the success of a data warehouse or CRM project. It is a proactive, rather than reactive, approach. As the amount of data escalates, so does the amount of inaccurate data. While data should be cleansed at the source, companies rely on many different data sources, making it difficult for businesses to get their hands around all enterprise processes and data infrastructure issues simultaneously. To succeed, there must be intelligent data integration (IDI).
Not long ago, businesses were prone to ignore one of the most fundamental keys to success: customer relationships. Consumers had little influence on how businesses responded to their needs, and businesses had no reason to distinguish themselves through customer relationships and superior service. Businesses dictated the relationships with their customers, and customers had little to say in the matter.
Today, much has changed. Consumers have many choices to meet their business needs. Successful businesses must provide a superior customer relationship to stand out from their competitors. Like the mom-and-pop shops that came before, successful corporations win and keep customers and convert prospects by establishing direct, sustainable, and manageable relationships.
To maintain, manage, and track critically important relationships and associated customer activity, corporations are investing valuable time and resources into customer relationship management (CRM) systems. CRM systems are complex puzzles with many interlocking pieces, where each individual puzzle piece serves a purpose. However complex and detailed the individual pieces are, the CRM puzzle is not finished until all the pieces are integrated and the picture is complete. CRM is a business philosophy that aligns business strategy with customer information systems so that customer interactions can be managed for the mutual benefit of both the customer and the business. Successful CRM results include:
The goal of effective CRM systems is to provide the foundation for a successful relationship between your business and your customers, allowing you to stand out from the competition and create customer loyalty. The catch here, as many business executives and managers can attest, is that not all CRM systems are created equal. The definition of what constitutes a CRM system can vary greatly: different systems, with very different features, goals, and purposes, are all classified as CRM systems. The effectiveness of a CRM system does not depend on the power and functionality of its individual pieces. Each piece within the CRM puzzle is an island isolated from the rest of the puzzle. The full picture is only possible when you can combine data and information from these isolated pieces. The effectiveness of a CRM system depends on the quality of data within and the ability to combine data across the different pieces. When data is inaccurate or erroneous, so is the information that flows through the system, and so are the decisions based on the information.
The Case for Data Quality
Rarely do you find technology professionals engaged in thoughtful discussions about data; it is just not an exciting topic at the forefront of business technology agendas. Unfortunately, companies often do not identify data quality problems until there is a direct, negative impact. Historically, before the advent of new, easy-to-use technologies, data quality has been difficult and expensive to attain. Quantifying return on investment when implementing a data quality solution has been very difficult. For these reasons, poor data quality is rampant in many organizations. Corporations are starting to realize the impact of having bad data. According to a senior research analyst at a Gartner Group Symposium, by 2005, more than 50 percent of data warehouse and CRM projects will fail. One of the points of business failure: denying data quality issues.
To understand why CRM depends on the quality of the data, you have to understand how CRM components interact. CRM systems comprise a number of customer touchpoints, and the data is only accurate if it is integrated from all of them. Electronic points of sale, call centers, direct mail catalogue orders, credit card transactions, bank transactions, online transactions, and electronic mail are all data entry points. The deluge of data can impede an organization’s ability to effectively manage and monitor key customer information. Data travels between a corporation and its vendor and partner systems and streams into the enterprise by various means, including unstructured Web responses made directly with customers, point-of-sale systems, and call center operator data entry. The increase in data and system complexity alone can lead to ambiguous data representations. As an example, names and addresses can be depicted in various ways. For customer Joan E. Smith, a call center representative might enter the following: Joanie Smitt, 100 E. Johnson Street. The marketing department, however, might use a different designation received from a third-party list: Jon E. Smith, 100 East Johnson Street. Ms. Smith may have entered J E Smith with an address of Johnson Street on a customer Web site. A corporation may know the same customer in several different ways.
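
To make the Joan E. Smith example concrete, the following minimal Python sketch compares the three representations described above. The abbreviation table, the record layout, and the helper names `normalize` and `similarity` are assumptions made for illustration; they do not come from any particular data quality product, and a production rule set would be far larger.

```python
import re
from difflib import SequenceMatcher

# Hypothetical records for the same customer, taken from the example above.
records = [
    {"source": "call_center", "name": "Joanie Smitt", "address": "100 E. Johnson Street"},
    {"source": "marketing", "name": "Jon E. Smith", "address": "100 East Johnson Street"},
    {"source": "web", "name": "J E Smith", "address": "Johnson Street"},
]

# Illustrative abbreviation table (an assumption, not a complete standard).
ABBREVIATIONS = {"EAST": "E", "STREET": "ST"}


def normalize(text):
    """Uppercase, strip punctuation, and apply the abbreviation table."""
    tokens = re.sub(r"[^\w\s]", "", text).upper().split()
    return " ".join(ABBREVIATIONS.get(token, token) for token in tokens)


def similarity(a, b):
    """Return a similarity ratio between 0 and 1 for two normalized strings."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


# An exact comparison sees three different customers.
distinct_names = {record["name"] for record in records}

# After normalization, the first two addresses become identical strings.
same_address = normalize(records[0]["address"]) == normalize(records[1]["address"])

# The names still differ, so only a similarity score can relate them.
name_score = similarity(records[0]["name"], records[1]["name"])
```

In this sketch the exact comparison finds three distinct names, while normalization alone is enough to reconcile the first two addresses. The names remain different after normalization, so deciding whether they refer to the same person depends on a similarity threshold, which is a business rule rather than something the code can choose.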
In individual systems, different stored data for the same customer may be tolerable because the operational activities operate as discrete applications. But as information is acquired by these applications and integrated into a comprehensive customer database, it is imperative that different representations of the same customer be consolidated into the representation of a single customer. Storing and propagating inconsistent data for the same Joan E. Smith will produce data quality problems, as there is no single accurate representation of the information. These types of data quality problems ultimately impact the relationship the corporation has with Ms. Smith. Data, especially customer data, will change over time as customers move or companies grow. Consider the longevity of valuable customer data. A recent research report completed by TDWI entitled Data Quality and the Bottom Line stated, “The problem with data is that its quality quickly degenerates over time. Experts say 2 percent of records in a customer file become obsolete in one month because customers die, divorce, marry, and move.” To put this in perspective, assume that your company has 500,000 customers and prospects. At this rate, 10,000 records are obsolete in one month—that’s 120,000 records every year. It is estimated that more than 20 percent of the data within a corporate customer database can be erroneous, redundant, or otherwise unusable, having an immense impact on the customer relationship. In fact, it is not unusual to discover more than half of the customer records have some type of data problem. Some industry examples illustrate the problem:
Critical customer relationship decisions, price changes and discounts, marketing campaigns, credit decisions, and daily operations revolve around key customer data. A company’s success or failure depends on the quality of information contained within its CRM systems. Yet, data quality is often ignored. The perception is that implementing a data quality initiative is costly, resource intensive, and time consuming. The challenge is to be able to quickly and accurately capture, standardize, and consolidate the immense amount of customer data that comes from a variety of contact points.
CRM Defined
Ask 10 technology professionals what CRM is and you’ll likely get 10 different answers. This isn’t surprising because there are so many different pieces to the CRM puzzle. CRM is difficult to define as one continuous end-to-end process; it encompasses all aspects of a business interacting with its customers. There is no single CRM tool. Although there are many pieces to the puzzle, the pieces can be categorized into three large buckets.
Tactical CRM: The tactical components (also called operational CRM components) are responsible for the execution of a single business function. These components manage the automation of horizontally integrated business processes, including sales force automation systems, call center applications, Web site e-business management, customer service systems, and campaign fulfillment.
Strategic CRM: Strategic components (also called analytical CRM components) provide information and decision support based on data from the operational systems and other data feeds. Strategic CRM is most often manifested in analytical applications that leverage data marts based on a customer data warehouse. Typical strategic CRM components are churn analysis, customer segmentation analysis, campaign analysis, and operational efficiency, enabling organizations to answer such questions as:
Customer Contact Components (also called customer data integration): Customer data buried in multiple enterprise systems is worthless unless it can be quickly linked and easily accessed. Customer data integration components will provide customer data identification, customer profiling, mapping and linking capabilities, customer data cleansing, customer data enhancement, and meta data management.
Intelligent Data Integration
Functionally, most organizations have separate sales, support, and marketing groups. It is extremely difficult for businesses to get their hands around all customer processes and data infrastructure issues simultaneously. It is impossible to complete the CRM puzzle when most available systems today are isolated from each other. To complete the puzzle, you need to integrate this data, which is the goal of intelligent data integration (IDI). Most systems assume that the data is pristine prior to use by the rest of the CRM system. In nearly all cases, this could not be further from the truth. The goal of IDI is to provide a complete and accurate customer profile.
One approach for maintaining data integrity is to attack the problem at the operational system level, a great idea but one not often practiced. Operational systems are where detailed transactions are completed. These applications are dedicated to performing one function that represents specific business requirements. The data collected is a byproduct of the executed transactions, and the applications found here are not integrated with other applications; each application is a stand-alone environment and optimized for the particular needs of the application. While the data is optimized for the operational system, the full CRM system needs to consolidate that data by customer into a single customer-centric database. This gives rise to many problems of integration:
IDI attempts to know the customer at each touchpoint across every line of business. This requires an accurate, coherent customer view. Specifically, the goals are to:
The Four Cornerstones of Intelligent Customer Data Integration
Intelligent customer data integration provides the infrastructure for transforming raw data into accurate, consistent, and usable corporate information assets. The goal is a unified, complete data repository (database or data warehouse) of your company’s customer data. The foundation for customer data integration consists of four cornerstones of data quality technology. The IDI process, shown in Figure 4, requires different steps and rules for different data sources. However, the basic process is consistent.
Data Auditing
Before any other phase or process occurs, a thorough audit and discovery process is needed. For IDI we identify the extent of the problem. We also want to create rules generated from the existing data to clean and enhance data later. Sometimes this means looking for data outliers. In a name field, outliers could be first names of more than 20 characters. A good data auditing process also looks for missing data. We also want to identify maximum, minimum, and average values for numeric data, which could indicate problems that might be endemic to the data entry system.
Another aspect of data auditing is the standardization report. Here, the rules are discovered and later applied to the table or database to bring uniformity to the data. For addresses, this might mean always changing the word “street” to “ST,” improving the odds that the address can be matched to similar addresses in later IDI stages. Rules also provide a standardized way to display data for Web applications or mailings. Other data auditing tasks might examine table relationships, primary key relationships, and data interdependencies, and flag known erroneous data (e.g., a phone number that reads 555-555-5555). Now that we have found the data problems, let’s begin to fix them.
Data Cleansing: Parsing
Data correction and standardization is often referred to as data cleansing or data scrubbing. The first step in data cleansing is parsing, which locates, isolates, and identifies individual data elements in customer data. Customer data components might include salutation, first name, middle name, last name, title, street number, street name, apartment number, city, state, ZIP code, and so on. Parsing data makes it easier to correct and standardize (see Figure 5).
Parsing identifies incorrect data within particular fields, providing the ability to process data with inconsistent formats. It is required prior to any data correction or standardization step. Parsing inconsistently represented data and creating a consistent set of data elements prepares the data for the correction step. You must be able to process free-formatted lines of name and address information.
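
As an illustration of parsing free-formatted name and address lines, here is a minimal Python sketch. It is not how any specific product works: the salutation list, the regular expression, and the function names `parse_name` and `parse_address` are assumptions chosen for the example, and real parsers rely on far richer vocabularies and reference tables.

```python
import re

# Illustrative vocabulary (an assumption); real parsers use larger tables.
SALUTATIONS = {"MR", "MRS", "MS", "DR"}

# Street number, street name, and an optional apartment designator.
ADDRESS_PATTERN = re.compile(
    r"^(?P<street_number>\d+)\s+(?P<street_name>.+?)"
    r"(?:\s+(?:APT|UNIT|#)\s*(?P<apartment>\w+))?$",
    re.IGNORECASE,
)


def parse_name(line):
    """Split a free-form name into salutation, first, middle, and last."""
    tokens = line.replace(".", "").split()
    parsed = {"salutation": None, "first": None, "middle": None, "last": None}
    if tokens and tokens[0].upper() in SALUTATIONS:
        parsed["salutation"] = tokens.pop(0)
    if tokens:
        parsed["last"] = tokens.pop()       # final token is treated as the surname
    if tokens:
        parsed["first"] = tokens.pop(0)
    if tokens:
        parsed["middle"] = " ".join(tokens)  # whatever remains in between
    return parsed


def parse_address(line):
    """Split a street line into number, name, and optional apartment."""
    match = ADDRESS_PATTERN.match(line.strip())
    if match is None:
        return None  # leave unparseable lines for manual review
    return match.groupdict()


name_parts = parse_name("Ms. Joan E. Smith")
address_parts = parse_address("100 East Johnson Street Apt 4B")
```

Once the elements are isolated this way, each one can be corrected and standardized on its own. A line such as "Johnson Street", which has no street number, returns `None` here so that it can be routed to a separate rule or to manual review instead of being guessed at.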
Standardization
Once the data is parsed, you can begin to standardize it. Customer data collected from different sources, or data in a single source, has many problems:
Standardization provides customer information in a preferred and, more importantly, consistent format. Standardization permits you to remove ambiguities from your data. The biggest challenges for accurate standardization of customer data include:
Correction
Non-name-and-address data correction requires software to check for missing values, check ranges, and verify formats (see Figure 7). This type of correction is based on business rules and values that must be stored as part of the IDI meta data and used to correct data values as specified in the meta data.
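
The rule-driven correction described above can be sketched in a few lines of Python. The `RULES` dictionary stands in for business rules held in the IDI meta data; its field names, limits, and default values are hypothetical, as is the function name `correct_record`.

```python
import re

# Hypothetical business rules standing in for rules stored in IDI meta data.
RULES = {
    "age": {"min": 0, "max": 120},
    "phone": {"pattern": r"\d{3}-\d{3}-\d{4}"},
    "country": {"default": "US"},
}


def correct_record(record, rules=RULES):
    """Return a corrected copy of the record and a list of issues found."""
    corrected = dict(record)
    issues = []
    for field, rule in rules.items():
        value = corrected.get(field)
        # Missing-value check: fill from the rule's default when one exists.
        if value in (None, ""):
            if "default" in rule:
                corrected[field] = rule["default"]
            else:
                issues.append(f"{field}: missing")
            continue
        # Range checks for numeric fields.
        if "min" in rule and value < rule["min"]:
            issues.append(f"{field}: below minimum")
        if "max" in rule and value > rule["max"]:
            issues.append(f"{field}: above maximum")
        # Format check: the whole value must match the pattern.
        if "pattern" in rule and not re.fullmatch(rule["pattern"], str(value)):
            issues.append(f"{field}: bad format")
    return corrected, issues


fixed, problems = correct_record({"age": 150, "phone": "5551234", "country": ""})
```

Keeping the rules in a data structure instead of in the code mirrors the idea of storing them as meta data: the same function can then be reused against other sources by supplying a different rule set. In this sketch only missing values are corrected automatically; range and format violations are reported, since the right replacement value is a business decision.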
Name and address data require specialized correction techniques. Often, business rules associated with name and address data call for a comparison of existing data against, and updates using, a reference table. Files such as the USPS address database provide a perfect reference for this type of data correction. Name and address correction requires the ability to:
Matching and Consolidation
Correcting and standardizing data prepares the data for the next steps in the IDI process: matching and consolidation. Matching lets you identify similar data within a single data source or across multiple data sources. Using standardized data, you can eliminate duplicate representations or find duplicate representations and consolidate information across records. There are many ways to match data: on name, name and address, company name, or other defined business rules. Matching rules are dictated somewhat by the type of consolidation required. There are a number of different types of data consolidation, from consolidating all records for a single customer to consolidating all records for a group of customers based on particular business rules. For example:
These techniques need to be performed both among the records in a single file or table and across the records in multiple files or tables. In addition, the ability to match records with external, third-party data files is an absolute requirement of IDI. An additional critical component of matching and consolidation is the ability to choose surviving information when records are consolidated. You must be able to choose records from one or more input sources based on established business rules. Merging information requires you to choose the source from which each piece of surviving information is taken. In short, the matching and consolidation software must have the ability to (see Figure 8):
Other Channel Information
CRM is about the customer, so most applications have customer name or account information that can be linked to a customer profile. The goal of matching and consolidation is to link these customers together. For companies with a Web site and Internet customer interactions, clickstream behavior data needs to be merged with all of the other customer data for a complete profile. When customers access your Web site and begin to browse or download content, they are engaged in an electronic dialog with your company. They are asking for information, but they are also providing information. Every visit to a Web site leaves a trail with the potential to tell you how customers use your site and whether your site delivers what they need. Effective IDI must integrate this information and requires the ability to view Web log information and match Web site user IP addresses to the customer profile. Comprehensive IDI will match this Web transaction data back to the customer.
Closely tied to the recognition of the particulars of a data quality or data integration project is the realization that data does not always have to be “corrected” after a problem has been pinpointed. In fact, a better approach than a one-off data quality project is to solve data quality problems at the source, where data enters an organization’s data flow. Similarly, solving data quality issues may mean not moving data at all but leaving it where it resides, rather than assuming that data must all be consolidated into a central store before a data quality initiative can be considered successful.
Defect Correction
Certainly, data correction is the way to get at enterprise data quality. Erroneous or incomplete data must be corrected or enhanced before its value is made evident. But solving the data correction problem can also be seen as the first step in a larger process that will ultimately provide an organization with cleaner and more useful data.
Defect Prevention
In the defect correction phase, data and processes are analyzed, rules and patterns are discovered, and those transformations are applied directly to existing data. The next logical step in this process is to take the rules and schemes we created from one data set and apply them to others with the hope that we have now created repeatable data quality or integration routines that can be successfully applied over the entire enterprise. Yes, you could continue to apply these procedures to existing data on a regular schedule to keep data in its optimal state, but what if you took the same rules and validated data as it enters the system? You now have moved from defect correction to defect prevention, a stance that can solve data quality problems before they happen.
Defect prevention is often mentioned in the same breath as real-time data quality tools. What this really means is that a metaphorical barrier is erected on the outskirts of your enterprise that only allows data in a predetermined format or condition to move from external sources (such as a customer entering his or her own personal data on a self-service Web site form) into the internal corporate data stream. This implies that you have to know what the desired data format or quality is before it moves downstream. The only way to accurately accomplish this is to examine your existing data and database structures, extract and modify business rules based on that data or structure, and apply them as filters, in a sense, to the data that is about to breach your organization’s defenses.
Integration
As a corollary to the rule that defect prevention is a better alternative to data correction, we could say that integration is a better alternative to consolidation.
Integration (sometimes called virtualization) is the strategic application of match codes and linking logic that allows an organization to get a complete and timely view of corporate data without having to create a central consolidated data store beforehand. The value of defect prevention is that no bad data is allowed to move into the system and downstream where it could pollute otherwise clean data. Integration states that data does not have to move at all to achieve higher data quality. Customer data in a marketing department database can be logically and virtually linked to extensive customer data in a data warehouse, which will offer a more complete view of a particular customer record without having to actually combine records (a process that could potentially endanger the accuracy or usability of the data in its original form).
Enhancement
The final step in IDI is to obtain the complete picture of your customer. Using the data cleansing and consolidation steps above, you have succeeded in getting the best possible representation of a customer that your company can provide. However, this is probably not sufficient to give you a competitive advantage, to complete the customer profile, or to meet the full set of CRM needs. Data enhancement provides valuable additional information about customers, both corporations and consumers. Enhancing data involves the addition of data to an existing customer record. Advanced IDI enhancement technology allows existing customer data to be augmented with external data without extensive programming. For example, organizations that want to offer different campaigns to prospects in different geographical areas need to add geographic data to their customer profiles. A company may want corporate credit information supplied by a third-party data provider. Enhancement technology allows you to enrich your existing customer data with:
Many service bureaus offer data enhancement, but this is often time-consuming and expensive, and it requires you to provide your customer data to the service bureau. Today’s enhancement technology will allow you to enhance your customer data without the time and expense of securing a service bureau.
The Need for Data Quality Tools
Once you have determined that intelligent customer data integration is imperative for your company, how do you intend to implement IDI? One approach is to do nothing at all. Of course, if you do nothing, your customers will not remain customers for long. Make too many customer mistakes and the customers’ confidence in your company erodes. Additionally, your company will encounter many “hidden” problems: lost revenue due to poor customer billing information, sales lost to competition from poor customer contact, missed up-sell opportunities, and so forth. This approach is like entering the boxing ring and leading not with fists but with your chin. It is clearly not the approach that most proactive organizations would choose.
A second approach is to gather a group of well-meaning and attentive clerical staff and have them pore over customer data. This approach is better than letting the customer relationship suffer, but it is terribly time-consuming and expensive. In addition, this approach is prone to errors.
The most effective approach to improving the quality of information about your customers is to use as much automation as possible. Table 1 provides a list of data quality/data integration vendors and their products. However, auditing the data as it stands is not enough. You must also attain complete customer profile information. Here, data quality tools must:
Finally, all rules and processes needed for these tasks should be consolidated into a single meta data repository that can be shared by all components of the CRM system. Every puzzle begins with the first piece and builds from there. The first piece of the IDI puzzle is to take stock of your systems. Only then can you begin to uncover the integration issues within these systems. Understanding the problem is a large part of the solution. Begin with discovery, end with enlightenment.
Tony Fisher
Tony Fisher, President and CEO of DataFlux, was the Director of Data Warehouse Technology at SAS prior to the SAS acquisition of DataFlux in June 2000. Fisher has been a key technology leader at SAS, providing the engineering research and development direction. Fisher’s prior role at SAS gave him critical experience in the marketplace in which DataFlux operates. Fisher is a native of North Carolina and earned degrees in Computer Science and Mathematics at Duke University. Prior to working at SAS, Fisher worked at Digital Equipment Corporation as a software developer.