FEDERAL TRADE COMMISSION
Public Workshop: THE INFORMATION MARKETPLACE:
March 13, 2001

Opening Remarks
Session One: Merger & Exchange of Consumer Data: An Overview
Session Two: Consumer Data: What is it? Where does it come from?
Session Three: What are the business purposes for merging and exchanging consumer data?
Session Four: How do merger and exchange affect consumers and businesses?
Session Five: Emerging Technologies and Industry Initiatives: What does the future hold?

P R O C E E D I N G S

MR. WINSTON: Let me introduce myself, I'm Joel Winston, Acting Associate Director for Financial Practices at the FTC, and I want to welcome all of you to the Federal Trade Commission, and give a special greeting to those people who are listening in on our audiocast on the website, ftc.gov. Now, there are several members of the Commission who are going to be giving some opening remarks this morning, and I would like to introduce first Chairman Robert Pitofsky. Chairman Pitofsky has served as chairman of the FTC since April of 1995, and he will be beginning the proceedings. Mr. Chairman?

CHAIRMAN PITOFSKY: Good morning, everyone, and welcome to another of the Federal Trade Commission's workshops. This one, we have entitled The Information Marketplace: Merger and Exchange of Consumer Data. I don't think I have to belabor the point with this audience that privacy, especially privacy in the commercial marketplace, is and remains a very important issue. If you take polls, you find today, just as you did three and four years ago, that somewhere between 88 and 92 percent of consumers when asked what their concerns were about doing business, buying online, will say that they have reservations, and think it's not a secure marketplace. They're not giving their credit card online without having some knowledge of how it's going to be used.
As a result, you now have, I think, just since Congress reconvened, something like a dozen bills addressing various issues relating to privacy in the commercial context. But let me position this workshop. We are not looking for enforcement targets for companies that may be invading unfairly or deceptively consumer rights, and we're not looking for legislative proposals. This is another kind of workshop, and it's like many that we've conducted in the past five or six years. We're trying to find out in a new area, a fast-changing dynamic area, what's going on, so that we are informed about the kind of issues that eventually we'll be called upon to address. We did that with our earliest privacy workshops, just to find out how personally identifiable information was collected and whether or not it was being sold. We did it with profiling, more recently B2B commerce on the Internet, and wireless technologies. In this instance, we would like to be able to take the measure of the extent and the ways in which firms exchange information and data that create consumer profiles; not necessarily only the information the firm collects itself, but information that someone else collects that then becomes merged into a firm's database. How is that information used commercially? Is it used commercially? And if so, in what fashion? What is the source of the data? Is it mostly online, is it offline, is it a combination of the two? Does it come from public records, private records, a combination of the two? We know that the ability of firms to collect data has been enhanced dramatically over the last five to ten years, and what we want to find out is how it's being used so that down the road we can spot issues. It is an information-gathering enterprise. It is not designed at the end of the day, at the end of these sessions, to come up with policy proposals. We have no predisposition on this. 
My own view, as some of you have heard me say before, is that this kind of enterprise is what Congress had in mind in 1914 when it created a Federal Trade Commission. Not just law enforcement, but a group that would try to work with the business community, with consumers, and others, to understand new and emerging dynamic trends in the economy. That is what we've been about over the last five or six years. We've tried to restore that tradition, and I certainly feel that this workshop moves in that direction. We have a wide variety of people here today who represent the business community, the consumer community, academics, and others, and if history is any guide, we will at the end of the day have learned a good deal from each other. With that, we'll receive some words on video from my colleague, Mozelle Thompson, but while that's being set up, let me introduce my colleague and friend, Commissioner Orson Swindle.

COMMISSIONER SWINDLE: Thank you very much, Chairman Pitofsky. I would like to welcome you all here, and before I forget it, the last couple of days in preparation for this, Bruce Jennings and his crew of youngsters around here have been scurrying in about 9,000 different directions making all this come together. Wires have been dragged all over the building and I think we've got a good set-up here, and this will be recorded for posterity and hopefully there won't be too much blood on the floor when it's all over, but it's a delight to see you all. I know so many of the organizations that are represented here, you have a vital interest in this, certainly from a personal perspective of your business, but we are all, as the chairman says, grasping to understand.
And I would hope that we would view this process here today, as we have in previous workshops, as the Chairman mentioned, as a learning process in which we listen and offer our suggestions from time to time, but mostly we listen to you, the practitioners, and try to get a better understanding of what we're all about and what we're doing here with this very controversial -- is that a good word to describe it -- but the issue of information flow and its effects and the concerns that various and sundry people have today in the consumer population or in business population. I do want to welcome you all here today. The use of third party information from public records, information aggregators and even competitors for marketing has become a major facilitator of our retail economy. Even Chairman Greenspan suggested here some time ago that it's something on the order of the life blood, the free flow of information. This was made even more clearly by a new study released yesterday by the Privacy Leadership Initiative and the ISEC Council of the DMA. The study made it clear that consumer prices would increase if public policy significantly limited the flow of data into catalog marketing and sales. At the same time, the digital revolution, both online and offline, has given an enormous capacity to the acts of collecting and transmitting and flowing of information, unlike anything we've ever seen in our lifetimes. Obviously the debate has been furious over the appropriateness of these data flows, this passage of information from one entity to another. The perceived harm that this data flow causes and what the appropriate remedies might be. As we all know, we've had a heavy debate on privacy going on now for at least three years, I've been here three years, and it was going on even before I arrived. 
I believe that issues related to the real harm that might be caused are well addressed by existing laws, but now we need to explore issues related to customer or consumer and business entities or the seller and the buyer, if you will. It is also useful to note that the digital revolution has revolutionized the knowledge that the buyer has about the marketplace. Buyers today are more informed than they have ever been before. The information age and information technology is literally changing the way every one of us does business, the way we conduct our lives, how we pick and choose, and certainly this information flow has made the buyer far more informed. It is crystal clear that there have been quantitative and qualitative changes in the marketplace, and the manner in which information is made available and used. There are real benefits in this for both consumers and businesses, from these changes. There are also changes in the way we all interact with each other. More of the interaction is being defined by data and less by each of us based on what we reveal about ourselves. The FTC has traditionally dealt with harm that comes from bad actors and market failures. The issues being raised today don't necessarily fall easily into either of those categories. Such is the challenge that we face. Productivity gains are well documented and the new technology, as I said earlier, is changing the way we do everything. However, there is a great trust deficit in existence out there now. The public has concerns about the private sector's ability to govern information use, or manage that information that they happen to have on people. At the same time, the same observations will tell you that the public has great concern as to what the government does with the information it has.
And I would contend that we might ought to be a little bit more concerned about what the government is doing than the private sector, but nevertheless, we've got a great distrust going here between the consumers who more and more today understand the value of their information, and what goes on around them. We therefore have a dilemma. The use of information drives our economy, I think that's pretty well established. That includes information to make sales, marketing and customer service more efficient, and more effective. The information flow allows businesses to build the right product, deliver it at the right time, to the right place, to the right address, and meet the demands, unique as they are, among all consumers, carefully tailored to them. That I would suggest most consumers would say not a bad deal. However, this increased use of information about people creates consumer concerns. The public is concerned about the potential misuse of the information, and individuals are concerned about being defined by the existing data on themselves. This is a huge misunderstanding deficit that parallels and matches the trust deficit. Consumer education has lagged market changes driven by new technology. Government is behind the new technology changes, too, as we've all noted. Consumers struggle to understand the technology itself, not just in the ways in which a technology is used in the marketplace, I'm still wrestling with my ISP, I was about to use a name there, but I won't. I'm having so much trouble with it, I don't want to defame the country at this point in time, but I'm having trouble with the technology itself, not to mention the information flow. Today's workshop is a great opportunity to begin to bridge this learning gap and this trust and misunderstanding or untrust and understanding deficit. We're here today to gather facts and begin to understand the flows of data that support marketing and customer service. 
This should increase our understanding of the benefits of the free flow of information, and to begin to understand the level of real harm, to whatever degree it might exist, related to information use. And perhaps we have an opportunity to ease the fears that are related to that emotion of fear of the unknown. I would suggest, plead with, counsel all participants to please leave your emotions at the doorway. This session today, folks, please, is not about sound bites, it's not about exposing people in public, it is about learning and sharing what we each know and how we go about doing what we are concerned with, and understanding how to balance legitimate privacy concerns and economic and social benefits. Remember, today's objective is to learn, to explore, and perhaps start to identify so we can put our hands on it, some policy approaches that are balanced in their -- they're balanced in a sense that they balance the consumer's interest in choice and economic opportunity, they balance the consumer's interest in not being harmed by security breaches and data misuse, they're balanced in the sense that they respect the consumer's interest in choosing when to not participate in a market, and also the other side of the coin, so to speak, is business interest in serving all markets in a most effective and efficient and, quite frankly, profitable way that they can. That's what our free enterprise system is all about. I thank you again for joining us. This is an important session. Perhaps it's the first of several important sessions on the very subject, because I think we have a lot to learn and we appreciate you coming here and being a part of our family and helping us learn more, learn faster, and hopefully, as I always say, helping us to look before we leap. Thank you very much.

(Applause.)

COMMISSIONER THOMPSON: Good morning. I would like to join the Chairman in welcoming you to the FTC for this important workshop on the Information Marketplace.
As he mentioned, today we will all be sharing what we know about the topic of Merging and Exchanging Consumer Data. It's no secret, for example, that the Federal Trade Commission has been long talking about issues dealing with personal data and privacy. I think that today we will be talking about how the issues raised with data collection converge when we're talking about an online and offline environment. At present, there are some real reasons to distinguish those two classes of information, in light of the speed and the manner in which information is collected. But I also recognize that, as a practical matter, it doesn't make sense for consumers and businesses to view separate protocols for online and offline data collection. So, I would encourage industry and consumers to work together to formulate practical solutions that foster consumer confidence. But there will also be some important other questions that you'll be dealing with today about issues like legacy data, information that was collected before there was an online environment, and, also, how information changes -- does the character really change when you have offline data, including public information that's merged with online data and made available in a mode like on the internet. I look forward to hearing your presentations and hope that you'll enjoy the day. Thank you very much for coming.

MR. WINSTON: Before we get started, I have a few ground rules and announcements to make. The first one I approach a little bit gingerly, but I have been asked to ask all of you to turn off your cell phones. I'm just the bearer of bad tidings here. Apparently there's some feedback between the cell phones and our equipment, and it's messing everything up, so if you could please turn off your cell phones. Also, I would like to remind our panelists that because we have so much ground to cover today, we're going to try to hold you to the time limits that we've discussed with you previously.
We're going to give you a one-minute warning before your time elapses, and then when your time is up, we're going to gently encourage you to conclude your remarks. If that doesn't work, we have someone with a hook who's going to come out and kind of pull you away, but if you could try to stay within the time limits. Also, it's our practice in our workshops to invite the audience to ask questions of the panelists, if time permits, at the end of each panel. But, again, because we have so much ground to cover, I'm going to ask the questioners to limit themselves to asking questions and not to make any statements for the record. Which brings me to my last announcement, and that is that the record of this workshop is going to remain open for 30 days, until April 13th, so that anyone who wants to file something, a comment or other materials, for the record, and for the Commission's consideration, can do so. The instructions for filing these post workshop comments are available on our website at www.ftc.org. So, I encourage all of you to participate in that process. Dotgov, I'm sorry, somebody gave me the wrong web address here, okay. Anyway, I encourage you all to submit comments if you like. Now we're ready for our first panel, in which Professor Mary Culnan of Bentley College will lead a discussion designed to provide an overview of the flow of data through the information marketplace. Professor Culnan is the Slade Professor of Management and Information Technology at Bentley College in Waltham, Massachusetts, where she teaches and conducts research on information privacy. She is the author of the 1999 Georgetown Internet Privacy Policy Survey, and was a member of the FTC's Advisory Committee on Access and Security. And Professor Culnan will introduce the members of her panel.

SESSION ONE: MERGER & EXCHANGE OF CONSUMER DATA: AN OVERVIEW

MS. CULNAN: Thank you, Joel, and thank you to the FTC for inviting me to participate in this workshop.
It's going to be a terrific day. One comment about our session. We were instructed we're not going to have Q&A at the end of our session, because we're just providing an overview, so I didn't want you to think that we're cutting off the flow of discussion arbitrarily. What we are going to do today is we're going to talk you through a slide, which I'm going to put up here, and which you also have in your packet. Because the other two people are going to be having their own slides. We're going to talk you through this 30,000 foot view of profiling to set up the rest of the day's sessions. And so, if we skim over a topic that you think we should have gone into in more detail, you will hear about this in more detail in the other sessions later on today. We're going to focus primarily on the compilers, the third party organizations that collect, slice and dice and then resell consumer data (but these firms do not have a direct relationship with consumers), rather than focusing on the profiling that's done by individual firms with their own customer data. And for the purpose of simplicity, we're also not going to talk about co-op databases, which fall into the category of third party organizations that collect information on customers, because there's such a small number of these systems, but for some of the things that we're going to talk about, they also fall into our slide. So, let me first introduce our two panelists. First is Johnny Anderson, who is the president and CEO of Hot Data, Incorporated. He has over 30 years of technology industry experience, holding executive and management positions at e2 Software Corporation, Saber Software Corporation, Novell, Excelan and Digital Equipment. Our second speaker is Lynn Wunderman, who is the President and CEO of I-Behavior, Incorporated. 
Prior to founding I-Behavior, she was the founding partner of Wunderman, Sadh & Associates, which is a consulting firm specializing in information-based marketing services for both consumers and B2B marketers in the financial services, high-tech graphic arts, non-profit and Internet industries, and President and Chief Operating Officer of Marketing Information Technologies, a company providing database services for major Internet and Fortune 100 companies. She currently serves on the Internet committee of the board of directors of the Direct Marketing Association. So, what we've done, we've divided the slides into thirds. I'm going to discuss the first part which is on the left, this is the consumer part where consumers generate information in our daily lives that ends up in a compiler's database. Johnny Anderson is going to discuss the middle part of what goes on in the compiler's black box, and Lynn is going to discuss the third part on the right, how compiled data is used to generate offers to consumers, both prospects and consumers. And then as you can see, our picture begins and ends with the consumer, which is an important point I think. After I attended my first DMA convention and went through the exhibits, I came away convinced that anything anybody does puts you on somebody's mailing list or you end up as a record in somebody's database. And the slide shows some of the main ways that this can happen. First of all, all of us generate a number of public records, depending on the kinds of activities we engage in. Some of these include personally identifiable information such as property records, which do have our name and address attached to them, or telephone directories or other directories, and then there's public records that have nonpersonally identifiable information in them such as census records. And compilers can acquire this information in two ways. 
First they can acquire it directly from the source, so they could buy the records from the state or local government. Or they may acquire the information from a second firm, such as Claritas, that acquires this information and does some analytics on it and then generates geographic and demographic profiles that do not include personally identifiable information but can be overlaid on top of a record that does have an address. And in fact there was an example of this information in yesterday's Washington Post, if you happened to see this, of talking about Fairfax County, Virginia that has the highest average family income in the United States. And inside the article, they talked about the different lifestyle segmentation profiles that are represented by the people who live in Fairfax County. For example, they said 22 percent of the people who live in Fairfax County are in The Winner's Circle, that's the name of the profile, or Executive Suburban Families, age 35 to 64, household income is $90,700 a year, and these people are most likely to have a passport, shop at Ann Taylor and read Epicurean Magazine. So, this will give you a flavor of how this information is used to, again, help companies understand who their customers or their prospective customers are. A second source of information is surveys, such as warranty cards or marketing surveys that could include questions about what people's product preferences are across a whole range of different kinds of products, their life styles, their hobbies, and their demographics. The third way that the information can end up in a compiler's database is that people sign up for mailing lists, and I was thinking about this as I read the Sunday paper and, you know, there are cards that fall out of the Sunday magazine where you can request information on various topics. 
Or people who order things by mail, or you request information, call an 800 number, sign up for something online, enter a sweepstakes or a contest, and these types of things will put you on a mailing list. Well, mailing lists may be made available directly, without going through a compiler, either by the firm itself or more likely through a list broker who is going to manage the mailing list on behalf of the firm that owns the list. And that can end up with targeted offers to prospective customers. Or some of the information may end up in the compiler's database, and go into subsequent uses that we'll hear about. And then, finally, down at the bottom, we see the customer database, and when consumers establish a customer relationship with an organization, with a business, they end up in the customer database. And I think this is not a big surprise to everybody. And then that firm can generate new targeted offers to its current customers. I think people expect this to happen, but we're also going to hear how compilers can help these firms generate new offers to their customers, better target these offers and help these firms do cross marketing of new products and services. So now Johnny will talk about what goes on in the middle of the picture.

MR. ANDERSON: Good morning. My name is Johnny Anderson, I'm Chief Executive at Hot Data. Hot Data is an infomediary that connects customer relationship management marketing automation systems to sources of both household information on consumers, and business information about businesses, and provides a complete set of data quality and standardization services for both small, medium and large-sized businesses. I'm going to spend a little time and talk about the kinds of information that's collected, how it gets compiled into a database, and then gets delivered into a marketer's, end user's, database. But first I want to kind of digress. I've looked at some of the other slide shows, and a lot of the topics are going to be hit.
I really want to digress and talk about why people are -- why marketers are interested in this kind of information to begin with. Building a data warehouse and collecting this kind of information is a massive undertaking, and very expensive. What's the payback, and what are businesses looking for out of taking third party information and merging that in with their in-house information? If you think about commerce, if you think back, all the way back to the middle ages when commerce really first started. The buyers and sellers knew each other. There was a one-to-one relationship. Even up into the beginning of the last century, people knew -- the storekeepers knew who their customers were. After World War II and the mobilization of America, and the move from urban centers into suburban centers, and the creation of the now shopping mall, merchants now lost track of who their customers are. They don't know who buys products anymore. So, merchants really spend a lot of time doing product level analysis to figure out who bought the stinky cheese, and what stinky cheese purchases drove what other kind of purchases. The change in the new economy, and the evolution of the Internet now has really empowered consumers with information, and has broken down a lot of the geographic boundaries in terms of, I have to travel to a mall to purchase something. This has already been broken down quite a bit with the direct marketing and catalog industries, but now with the Internet, people now have a lot of information. So, it is now dependent on -- a business' dependence on success is now leveraged by what kind of service they can deliver. And to deliver that service, they again have to know who their customers are. 
So, you really look at all of the kinds of information that's available so that businesses can get a complete 360-degree view of their customers to be able to understand them not only in the context of their own transaction that may have taken place, but also what the likes and dislikes of that customer are. So, when you really look at the kind of information that's available, it really falls down into three categories. There's the geographic information, or where you live, and that kind of information is really address data, quality of the address, standardized to the Post Office's standards, what's the bar code for the address, but also includes information like what MSA that address is in, what census tract that address is in, and important things like latitude, longitude and geocoding, which are really used by businesses to do things like drive time analysis, and trade area analysis. But one of the first segmentations, at least in the retail industries, and now in the telecom industries, is where do you -- where do people live and how far are they likely to travel to get to one of my retail locations. The second is really the demographic information, and the collection and the detail of this will really be talked about a lot in panel number 2, but that's things like name, address and phone number, at a very basic level, but also reported and modeled information around a person's income level, what their marital status is, whether they buy by mail, whether they're a credit card user, whether they own their own home or not, information about what you're like. And then the third piece is really the psychographic information, and that's really what you like, what your life style indicators are, and that's where a lot of the compiled information comes in from, lists and surveys, to determine what somebody's propensity to buy a specific kind of product is. 
And those are indicators that could be that you're an outdoors enthusiast, a gardening book reader, dot, dot, dot, there are a number of different life style indicators. So, how is that information merged into one particular database? Data compilers really look to those three sources and do a very complex job of extraction, transformation and loading of that data. And that data is bought from public sources, and that could be things like tax records, home owner information, up until recently motor vehicle information was used, and in some states, even driver's license information. But that information is reported information that's public record that's brought into the database. Self reported data really drives a lot of the demographic and psychographics, and that's information from surveys and warranty cards and registrations. And then information from mail lists, and that is I'm -- I have a wooden boat, I subscribe to Wooden Boat Magazine. If I subscribe to Wooden Boat Magazine, there is a great likelihood that I am likely to buy products for wooden boats. So, affinity modeling and propensity scoring is really driven by the self-reported data from both subscriptions and product registrations. That information is matched based on name and address, so that there's really a view of a consumer that takes into account all of those different kinds of data sources. And then there's some additional modeling that's done on top of that, based on scientific samples and surveys, different kinds of models are put into place for specific vertical industries. Not every industry is interested in the same kind of consumer information. A telecom merchant is not interested in the same kind of information that a retailer is interested in. So, modeling is done based on a set of attributes that's been collected to be able to put together things for financial services and other industries. And then the output of that information really goes to two sources. 
One is the data enhancement source, in that I have a customer database of people that have come to my company from a number of different sources, could be a customer that signed up for a frequent buyer program at a retail location, could be a customer that's come to me at a trade show or sent back a business reply card, or a customer that's walked into one of my retail locations. The customer that's in my database, so I'm really looking for information that's outside my organization so I can understand that customer better. And the second is the targeted lists, and that is really if I've done some analysis in terms of what my best customer looks like, give me some more prospects that I can market that look just like those folks. I don't know who they are yet, and in most cases those targeted lists are going to go to a mail house who is going to get a mail drop, and I won't know who they are, until they respond to that direct mail campaign and come back into my database. And then they'll go into the normal process of my selling process inside my customer database. So, there will be a lot of detailed talk about both the collection of data in the second panel, and then the use and kind of how the technology drives some of the business models for the use of that data in the third panel a little bit later on. So, with that, let me turn it over to Lynn, and let her talk about some of the internal uses of data.

MS. CULNAN: Thank you, Johnny.

MS. WUNDERMAN: Bear with me just one second here. Thank you. Well, I've been asked to spend the next 15 minutes talking to you about the end user applications that have evolved really over the last two to three decades, so it might be a little tight, but we're going to do the best we can. I'm going to start where Johnny left off, which is to help you understand how this kind of compiled data really brings a name and address record to life for a marketer. Now, this is a real, live consumer record off of a compiled database.
I can attest to it because it's me, it's the Wunderman household at 94 Mercer Avenue in Hartsdale, New York. I have signed a release so that my data can be made public here today. But just from that information, we can now geocode this record and find out its census block group, attach all the geographic information available for the census, as well as we can now construct a match code, which you see here on the right side of the screen. That match code is the link to the compiled database by which we overlay the demographic and the psychographic information that Johnny was just earlier describing to you. Now, what happens when we do that? This is pretty much what you get, on the Wunderman household, a fairly distinct profile of a relatively affluent middle-aged, suburban couple, dotes on their dog, is extremely mail responsive, somewhat techno savvy and lives pretty much a high-end, fairly active life style. Now, I can tell you this is a pretty accurate record. There are two things they missed here. They missed the registration on my husband's antique motorcycle, okay. They are off by one category on our income; that's okay with me if it's okay with the IRS. But why do we want this data? Why do we want this information? As Johnny said before, it's not because we're being nosy, it's because we're looking to establish and build a relationship with a consumer. Now, Webster defines a relationship as a connection, a bonding or a contract, and the way we build relationships for marketing purposes is really no different than the way we establish and nurture relationships in real life. I mean, we do it through data, whether it's by factual information or observation, we're looking to establish some common ground by which we can create a meaningful, relevant communication to gain that connection. Now, I will tell you that the way it's done by general advertisers is different from the way we do it as direct marketers. In fact, it's the exact opposite. 
As a general advertiser, I'm looking for large numbers of people with something in common. Maybe I'm targeting women, 25 to 49, maybe some broad-based income qualifier. I'm going to talk to them based on what it is these women have in common. Or at least I think they have in common. Now, the issue is just because these are women largely of child-bearing age doesn't necessarily mean they have kids, but when I'm spending $7 to $10 a thousand to reach them on TV or maybe $20 to $30 a thousand to reach them in print, I can afford to have a certain amount of misses there. But it's very different when you're a direct marketer. I may be spending $500 or $1,000 a thousand to reach somebody at an individual or at a household level. So, I'm going to be much more stringent and rigorous when I look at and evaluate the success of that communication. I'm not looking for soft measures like awareness or reach and frequency, I'm looking for that household to take a specific action, and I'm going to valuate the cost efficiency of that action based on return on investment. So, I've got to be much more precise in my ability to target that household and develop a meaningful, relevant communication so I can capture their attention and do it quickly. So, we've learned over the years as direct marketers a very important principle over the years, and that is that people's differences are more important than their similarities. Now, what do I mean by that concept? I mean that what it is when you're studying a group that sets them apart from everybody else is more important than what it is that the people in that group have in common with each other. So, the differences are more important than their similarities, and they respond better when those differences are recognized. Now, here's what I mean by differences. It's all the data we've been talking about. 
It might be geographic, could be climate, market size, it might be demographic, life stage or life stage change, you know, maybe I just got a new spouse, got a new house, got a new baby, preferably in that order. It could be psychographic information, hobbies and interests we've been talking about, or it could be your purchase history. Now, we haven't talked a lot about that, but that purchase history could be self reported that I got off of some kind of a survey, or it could be the purchase history that a marketer captures and utilizes in their own database. And normally when we talk about this, we talk about the recency, the frequency, the monetary value segments as a marketer. And I will tell you this is incredibly powerful information from a segmentation standpoint. So, I might talk to you differently if you're a new customer versus a tenured customer. I'll not only talk to you differently, but I'll invest differentially if you're a high-value versus a low-value customer, and I'll have an entirely different contact strategy, frequency of the kind of offers I'm going to send you, if I happen to know that you're a loyal customer as opposed to a competitive switcher. Now, as I said, this behavioral information is incredibly important to marketers, and it works terrifically, if you have it. But you don't always have it. I mean, it's great if I'm talking to a group of customers that have been with me a long time and I have a lot of data on those people, it's an established product, it's a proven offer, but what do I do in a situation when I'm trying to attract new prospects into the base? I don't have a lot of data about their purchase behavior, particularly about what they're buying from my competitors. What about if I'm trying to spend on my new customers based on their potential to become high-value customers every time. Not much there in my database about these people. 
Or if I've got some test market results that I've done with new offers, new products, I know in aggregate how people are likely to respond, but I've got to think about who do I target with those offers because I don't have that response information on everybody in my database. So, what do we do? We use surrogate data. We use surrogate data as a bridge to help us be able to apply that behavioral information to another universe. Now, the most important data that we tend to use as surrogates is this compiled information we're talking about today, because there's a very important criterion that data has to be as available on the target audience that I'm studying as the application universe that I'm applying it to. And the compiled data is virtually available on just about every household in the U.S. So, what I am going to do is I am going to use my behavioral data in my own customer database to define a target. I'm then going to use the bridge data, the compiled data to describe the target and create a profile, and then I'm going to use that profile to help me find lookalikes in some larger application base. So, let me show you schematically how this works. I'm a marketer and I have defined a target as my high-value customers, however I define it, profits, revenues, purchase frequency, et cetera. And my goal is that I'm looking to identify prospects in the population who have a high potential to become high-value customers every time, I want to track them into my base. So what do I do? I'm going to study how do these high-value buyers look different from everybody else in the U.S.? And the data I'm going to use to do that is all the demographic information, the psychographic information, and I will tell you the coverage on the psychographics does not tend to be as large as some of the other data, so it doesn't often enter these statistical analyses, but we use it and we see if it's predictive. 
The geographic data and the census information, all to help me understand what is it about this group that makes it look different from everybody else. I'm going to overlay statistical tools so that I can really quantify which of these differences are statistically significant in identifying this target. I'm going to look at the interaction and the relative weight or strength of those variables, and I'm going to apply it back to a broader universe, in this case, the U.S. population. Every household gets this -- every household gets a score, excuse me, and the highest scores are the most likely to generate and to exhibit that target behavior. Those at the bottom are least likely to become your high-value customer, and this is nothing more than a planning tool. Okay, I'm going to penetrate that universe of U.S. population based on my volume objectives, my budget limitations, whatever. Now, I think it's important for you to understand as we talk about these concepts, where the predictive value of that data comes from. Okay, and I promise, no formulas, you don't need to be -- have a degree in applied statistics, it's a very simplistic example. I'm just going to use marital status and I'm only going to give it two values. So, here I am studying my high-value customers, all right, and I'm looking at them and I see well, big deal, they're just as likely to be married as they are to be single, that doesn't tell me much of anything, does it? How do I target anything based on this information, how do I talk to them based on this data? Well, guess what? I compared them to the U.S. population, and they're twice as likely to be single as the rest of the population at large. Now, take this predictive value, multiply it times another half dozen to a dozen variables, you start to see where the power of these statistical tools comes from. So, how do we use these tools? Well, we use them to help drive differential contact strategies. 
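[Editorial illustration, not part of the spoken remarks.] The marital-status comparison above can be worked through as simple arithmetic. The sketch below is only one way to express it: it assumes that "just as likely to be married as single" means 50 percent of the high-value customers are single, and it infers a 25 percent single rate for the population from the phrase "twice as likely." Neither percentage was stated by the speaker, and the function name index_vs_population is hypothetical.

```python
def index_vs_population(target_rate, population_rate):
    """Return how many times more common a trait is in the target group
    than in the comparison population (1.0 means no difference)."""
    if population_rate <= 0:
        raise ValueError("population_rate must be positive")
    return target_rate / population_rate


# Assumed figures for illustration only (not given in the remarks).
target_single = 0.50      # share of high-value customers who are single
population_single = 0.25  # share of the comparison population who are single

# Single households are over-represented in the target group.
single_index = index_vs_population(target_single, population_single)

# Married households are correspondingly under-represented.
married_index = index_vs_population(1 - target_single, 1 - population_single)

print(f"single index:  {single_index:.2f}")
print(f"married index: {married_index:.2f}")
```

Viewed inside the customer file alone, the 50/50 split looks uninformative; the signal only appears once the same variable is compared against the broader population, which is the point the example makes.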
Who do we target, when do we target them, how do we target them so that we're more efficiently reaching them with more relevant communications across the entire life cycle of the customer. From acquisition to value stimulation, all the way to eventual retention and re-activation. So, for instance, I'm going to rank my customer database based on this information, and I'm going to spend differentially based on the probability of these people being high-value customers, the repeat sales, cross-sell, up-sell, I'm also going to apply it as well to my customer information applications. Maybe I'm even going to develop new services for high priority customers. I can overlay this data on any vertical or apply it out from a compiled database, I can use this for direct sale or regeneration offers. Also remember, that because this tool is developed at an individual household level, I can aggregate it back up to any level of geography. So, for local support programs where there's a retail trading area or there's a sales territory, it becomes a very useful tool to prioritize differential media and households for these purposes. It's easy to apply them to any form of addressable media, those that are available today, such as selective binding, addressable cable and satellite, some of the Internet applications you can hear about later this afternoon, and those that, you know, we've hardly thought about in the future, wireless, interactive television and things that haven't even been invented yet today. And these tools can also be used as a planning template, we can bridge them into syndicated research bases, such as Scarborough, MRI, Simmons, Nielsen, and help us optimize the value of our mass media, of our print and our broadcast spending. So, all of this is based on our study of a high potential end user. So, what does this do for us in the end? 
I mean, basically it helps marketers invest their marketing dollars smarter, more efficiently reaching customers across virtually every channel, and for consumers, it means hopefully you receive more of the offers you want, and fewer of the offers that you don't. And that to us is a win-win for everybody. Thank you. MS. CULNAN: We've got a lot of time left, we've got about 25 minutes. What would you like us to do? MS. ALLISON BROWN: Do you want to take questions? MS. CULNAN: Sure, we'll take questions. We changed our minds, we'll take some questions. And there's a microphone over here, so I think Jason Catlett has a question. And then if you would address your question to one of the panelists, if that's your preference, please do so. MR. CATLETT: May I address it to you, ma'am? MS. CULNAN: You may. MR. CATLETT: Hello, this is called the bleeding edge of technology. Well, I don't think it's doing anything, but I'm going to hold it here anyway. Mary, you said that you were not going to address co-op databases on the basis that there are so few of them. And I think that's like saying we're not going to address suppliers of Windows operating systems because there are so few of them. The dominant co-op database, Abacus Direct, really has enormous influence, and I think it's a model different to but very relevant here. So, could you take a minute to describe what co-op databases do? MS. CULNAN: I may punt this to one of the panelists who have more experience. I will say one thing, for those people that are interested in co-op databases, and particularly in Abacus Direct, their data dictionary is on the DoubleClick website, so if you go to doubleclick.com and you click on Abacus, you can see exactly what kind of information they have acquired, and I think probably it's a really good example of transparency, assuming you know to go there and look for the data. 
So, because Lynn is actually running a co-op database, and again, it's not that we didn't want to talk about these because we didn't want to hide anything, but because we were doing the broad overview, we decided as a panel it would confuse things, thinking our talks would take longer if we went off and then couldn't fit it all into the slide. MS. WUNDERMAN: I do promise that we will spend some time this afternoon talking about the co-op database model, and specifically about my company, I-Behavior, unless there's something specific to these applications that you would like to talk about now. I mean, I could go into the concept of co-op database, it's going to be a little redundant this afternoon. MR. CATLETT: Why don't you spend 30 seconds describing a co-op database. MS. WUNDERMAN: A co-op database is formed when marketers share their customer names and related buying information in order to gain access to names of qualified prospects as well as additional data on their customers that might otherwise be unavailable for them to market and to build their business. So, if we had, I don't know, Mary, if you could put back your first slide. MS. CULNAN: Sure. MS. WUNDERMAN: I mean, basically with a co-op database, if we move the consumer aside to the right and we were to create another box, what you would see is the customer databases, the compiled data would all come into a co-op database and we would have a consolidation of many customer files from marketers, publishers, catalogers, e-tailers, et cetera, all going into one database as well as it would be overlaid with the demographic or the psychographic as well as the census data we've been talking about earlier, all to form a positive record. And that is the rich behavioral and demographic base upon which marketers would be able to do selections from that file. MR. CATLETT: Thank you. MS. CULNAN: One difference I think it's important to point out, you have to be a partner in the co-op database. MS. 
WUNDERMAN: Yes, you do. MS. CULNAN: You have to put data in in order to take advantage of the data that's there, as opposed to the compiled databases where basically there's no relationship between contributing data to the database and being able to acquire data from the compiler. MS. WUNDERMAN: Yes, and I will also say that generally that there's notification to the consumer about sharing data with trusted third parties as well as the online component, there are privacy protections as well. MS. CULNAN: Anybody else? There's a question toward the back. MR. TUROW: Would you talk just a little bit about the way databases get purged, based not just on what consumers want, but also recency and the decision that certain things become obsolete and how those criteria are determined? MS. WUNDERMAN: I want to make sure that I understand your question. You're asking, you know, I think on -- in terms of if I have information in a customer database about an individual's purchase behavior and over time that that data is no longer relevant? Is that -- MR. TUROW: Yeah, how do you decide -- how do you decide at what point you purge those particular data like your sports car. Maybe you decided to get more conservative about the car and somebody has not picked it up, do you have any kind of criteria to which to purge certain kinds of data after a certain amount of time, based on certain other criteria? MS. WUNDERMAN: Let me say something about the compiled data and its value, because they're not going to be always 100 percent accurate. I mean, you saw even my income on my own personal record was not accurate. What's of greatest value with the compiled data beyond its coverage is its consistency, and when you're looking for predictive value, consistency can be even more important than sheer accuracy. 
So, the procedures that are in place to replace that information, the models that are done to calculate data such as income, it's consistently done even if it's inconsistent across households. So that as that data is predictive, it may be predictive, even though it's not 100 percent accurate, but if it is predictive, it will rise to the top, and then virtually it's a numbers game. You will never be 100 percent on any particular individual or household. What you're trying to do is increase the probability of identifying a high potential consumer. So, for one or two or, you know, any number of people, that data will still not be 100 percent accurate, it ages over time, and it's the compilers that capture that information from the various and sundry public resources or surveys that gets supplied back to us, it's accurate, it's not accurate. But if it's still predictive, we will still work with that information. MR. SMITH: Richard Smith with Privacy Foundation. I have a question for Lynn. How do I get my compiled record, just like you got yours, on the screen? MS. WUNDERMAN: Call me. MR. SMITH: Can everybody call you if they want to see, every consumer if they want to see this? MS. WUNDERMAN: I'm sorry, you're asking you as a consumer, how would you get access to information? Well, I am not a data compiler, per se, I mean we get our data from Equifax, there are others, Experian, and First USA through their Donnelley unit and Acxiom through their InfoBase that supply this information, but if you as a consumer are interested in seeing your record on our database, you can request a copy of your profile and we'll supply it. MR. SMITH: Do these companies, compiler companies generally allow consumers to look at this kind of data? MS. WUNDERMAN: You know, not being a compiler, I would have to say in today's marketing environment, they should, but I cannot tell you. 
Certainly the data that comes, for instance, from a credit bureau, and the credit bureau information gets channeled as part of Equifax and that gets channeled into the Polk Database, as a credit bureau, you need to be able to provide consumers with access to that data, but I'm not familiar with the policies of each and every compiler. MR. SMITH: Thank you. MS. CULNAN: Okay, I think we're going to take a break and you want to break for -- you're going to let the people running this set the rules. Thank you. MR. WINSTON: This is kind of a unique situation, we're actually ending a little early, but that gives us a little more time for lunch. So, if we could break until about 10:15, and I want to thank the panelists and the Magazine Publishers of America. (Applause.) MR. WINSTON: Also, thank you to the Magazine Publishers of America for supplying our repast out there. (Pause in the proceedings.) SESSION TWO: CONSUMER DATA: WHAT IS IT? WHERE DOES IT COME FROM? MS. ALLISON BROWN: Hi, I'm Allison Brown, I'm an attorney in the FTC's Bureau of Consumer Protection, and I'll be the moderator for Session 2, entitled Consumer Data: What Is It? Where Does It Come From? The overview that we just heard has provided us with a brief look at data merger and exchange. Now we will begin a series of in-depth panel discussions about these practices. This panel discussion will focus on the original sources of consumer information, and we have five very experienced and knowledgeable panelists with us today for the discussion. We will also have about ten minutes at the end of the panel for the audience to ask questions. If you're sitting in an overflow room and you want to ask a question, please come up to the doorway on the main room here on the fourth floor at about 11:20 and we'll have a wireless microphone here so that you will be able to ask the panelists your questions. 
I will now introduce each person on the panel and ask the panelist to spend about three minutes to provide a brief introduction to the sources of consumer data that businesses use. C. Win Billingsley is the Chief Privacy Officer of Naviant, Inc. Naviant is a provider of marketing tools and integration methodology for online and offline environments. Win, please go ahead with your introductory remarks now and I'll introduce the other panelists in turn. MR. BILLINGSLEY: Okay. Naviant is a leading provider of integrated precision marketing tools, for both online and offline environments. So, we really integrate the virtual world with the physical world. This capability enables marketers to identify, reach and build relationships with online consumers. So, to probably state that in a form that is more meaningful to you, Naviant has a database of about 30 million households that are Internet-enabled. So, our niche is a database of people who have the capability to buy products and services on the Internet. This data is collected primarily through product registration data, and we'll talk a little bit more about that in the session on how this actually occurs. The data is fully permissioned. We only want people in our marketing database that permission us to do so. You know, an individual or an Internet user that does not want to participate in Naviant's database is not included in the database. And then there are other processes that we have in place to make sure that our data is accurate and as useful as possible. MS. ALLISON BROWN: Okay, Elisabeth Brown is Senior Vice President of Product Strategy for Claritas. Ms. Brown oversees the development of new data products and services, including demographic, cartographic and segmentation systems, and the management of the software and applications that are delivered to Claritas clients. Ms. Brown? MS. ELISABETH BROWN: Thank you. 
One comment, too, I have actually been not only am I a member of the club, but I have been a client, so I was actually a client of the Claritas marketing products and services before I joined the company. So, I do have a little bit of perspective on how it can be used and how we used it when I was at the Prudential Insurance Company. Claritas is a marketing information company that has been in business for over 30 years, which makes us one of the more mature companies in this industry -- as evidenced by a recent Wall Street Journal article that referred to Claritas as the granddaddy of demographic providers. Claritas serves companies in financial services, telecommunications, energy, automotive, retail, restaurant and real estate industries, and we have clients ranging from the top Fortune 500 companies to small, independent consultants. I'll just give you a little bit of background. Over 30 years ago, Claritas' founder, Jonathan Robbin, who was a Harvard social scientist, was analyzing U.S. Census data and settlement patterns. He hypothesized that American neighborhoods reflected the old adage that birds of a feather flock together, and therefore, the products and services that Americans consumed could be predicted simply by knowing summary level demographic information about the area, or "you are where you live." This was referred to in the first slide as geodemography. Thirty years later, our models have become more sophisticated and are able to dissect markets at a much lower level of geography, but that same old basic premise still holds true that by knowing some small amount of demographic information, you can infer or predict the likelihood that a household will be interested in the products and services that you're offering. So, we provide demographics and other consumer and business data on multiple levels of geography, delivered through our various mapping and marketing application software platforms. 
We are probably most well known for our consumer segmentation systems, for example, Prism, which was also identified earlier when Mary was speaking about Winner's Circle and what some of the attributes of a neighborhood would be that would be tagged as Winner's Circle across the country. Our consumer product demand estimates that our clients use to more efficiently market their targeted customers and prospects, which you could refer to as surrogate or inferred data. Claritas data and services are used for broad marketing functions such as tracking new customers, retaining current customers, determining site locations and appropriate sales and marketing distribution channels, and we help with more efficient reach strategies and media planning. So, basically, Claritas marketing information helps our clients offer the right products and services in the most appealing way to the consumers and prospects. We provide basically the benchmark information or the total universe data that our customers can use to compare their current customers and markets against so that they can make better marketing decisions. Thank you. MS. ALLISON BROWN: Next we have Paula Bruening who is Staff Counsel for the Center for Democracy and Technology. The Center for Democracy and Technology is a non-profit public interest organization that seeks practical solutions for enhanced free expression and privacy in global communications technologies. MS. BRUENING: Thank you. CDT has been asked today to discuss the issue of public records as a source of information about individuals from a factual basis, and as many of you know, CDT generally has a specific viewpoint on this issue. 
I will talk today about the factual basis in my opening remarks and then any other comments will be reserved for the Q&A, but I would like to encourage the FTC to go to the state level and to some other resources and some organizations that are doing work on this issue, because I think some of the really difficult work on how the information is collected and how it is being used specifically is being done at the state level. And I'm happy to give the FTC that information. Public records maintained by government agencies disclose a vast array of detail about an individual's life, activities and personal characteristics. At the federal level, most personal information is not available to the public, because of the privacy exemption in the Freedom of Information Act and the Privacy Act of 1974. However, bankruptcy records are an important exception to this rule and are maintained by the federal courts. These records are a source of detailed financial information, and the sensitivity of that information has been recognized by the Office of Management and Budget, which has produced a study on this issue called Financial Privacy in Bankruptcy: A Case Study on Privacy in Public and Judicial Records. At the state and local level, however, the types of records that are maintained are different, and the laws and policies governing records yield disparate access and disclosure practices, but it is possible to construct a detailed profile about an individual from public records. And while I will spare all of you the exhaustive list of all the sources of information, I'll name a few: Name and address information come from voting records; land titles are a source of home ownership information; property taxes can give you assessed value of homes; birth and death records give you information about an individual's parents. 
The list goes on, there are occupational license records, motor vehicle records that can tell you about an individual's make and model of an automobile, voter registration gives you party political affiliation, and hunting and fishing licenses, boat and airplane licenses can give you information about how a person likes to spend their leisure time. There may be considerably more information available in public records about an individual who has interacted with the courts as a criminal defendant, as a plaintiff or defendant in a civil litigation, in a divorce proceeding, as a juror, as the beneficiary of a will. Public access to government records serves several important goals. Individuals need government information to make political decisions about government programs, legislative and regulatory options, and candidates running for office. Government records also assure the accountability of individuals as in the case of business and real estate transactions. However, it's important that public record information be used for the reasons it was collected. This information was not meant to be searchable in a database, nor was it intended to be used in marketing. And simply because there is a tradition of collection of information, important decisions need to be made on a case-by-case basis about the appropriateness of access to public records and the role of consumer choice. MS. ALLISON BROWN: Thank you. Michael Pashby is Executive Vice President and General Manager for Magazine Publishers of America where he has also served as Executive Vice President of Consumer Marketing. Before joining the MPA, Mr. Pashby was president and publisher of Art and Antiques Magazine, vice president of circulation and new product development for Gruner + Jahr USA, and Managing Director of U.S. Operations for Marshall Cavendish. Michael? MR. PASHBY: Thank you. That sounded impressive. 
MPA represents about 85 percent of the consumer magazine -- dollar volume of the consumer magazine industry in this country, and about 85 percent of all magazines are sold through the mails, using direct mailing techniques or direct marketing techniques of extremely varying sophistication. The use of credit cards in our industry is extremely small, but is now growing. Our members strongly agree that we must protect the privacy of our readers, and I think our industry has done a very good job over the years in balancing our legitimate business interests and our consumers' reasonable expectations of privacy. Obviously we value our readers and we wouldn't be in business without them, so our industry is constantly looking for ways to improve that service to our readers. It's important to note that when our readers ask us not to share information about them, we don't. In the information section of most magazines, the publisher discloses that the subscription list may be rented to appropriate businesses. The magazine offers an address or toll free number so that the reader can opt out. And many magazines are taking advantage of the Internet to inform consumers of their privacy policies, and give consumers an additional opportunity to opt out. We're very careful with respect to the customers, to the wishes of the customers who choose to opt out. Generally when a consumer requests that publishers not share information, that publisher will not only remove the consumer from their own internal rental lists, but will refer the consumer to the DMA so that the consumer can request to be on their nation-wide do-not-mail list. That said, magazines are very good sources for consumer data. And the reason is very simple. More than any other medium, the choice of which magazines a consumer reads can tell a lot about a person, what a person likes, and his or her interests. 
In enabling our readers to get information about products and services that are of interest to them, it is advantageous to everyone. Our readers are given more choices, they get information about products of their interest and life styles, and most importantly they're not inundated with advertisements for products they have no interest in. Businesses benefit because they can target their advertising to consumers who are most likely to be interested in their products, saving them time and money. And for magazines, with a cost of mailing now between 65 cents and a dollar per piece, and that's before the Post Office applies for its newest rate increase this June, the cost of acquiring a consumer, when the response rates are in the low single digits, and in a very competitive market, is extremely expensive. But sharing information only works if it's beneficial to everyone. Our magazine subscriber lists are our most important and valuable assets, our readers do not want to get advertisements for products they don't care about, so the magazine industry is selective about letting advertisers use their lists. If a business intends to mail a solicitation to a consumer, magazine staff review that promotion to ensure its use is appropriate. Most magazine publishers will not rent their list to telemarketers because they have little control over how the list is used, but if lists are rented, we expect magazine staff to review the telemarketing script. And very importantly, the list is rented, it's not sold. That means the advertiser can use it only one time. And publishers, as a general course, seed their lists and track how that list is used. Thank you for inviting us again. MS. ALLISON BROWN: Thank you. Our final panelist is Ted Wham. Ted is the President of Database Marketing for the Internet, a sole proprietorship consulting practice. 
His career has been concentrated in the direct and database marketing industries, focusing most recently on Internet-enabled marketing applications. Ted? MR. WHAM: The benefit of having the last name of Wham is that although I am always at the end of the line, I always get to hear what everybody says before me and tailor my comments to help amplify on those areas as well. Database Marketing is an independent consultancy that consists of myself as an independent business person working out of my home, and billing my cat at very low billable rates, I have had an opportunity to work with organizations such as Viacom Division, Curriculum Corporation, Hewlett Packard, I have worked with Cisco Systems here recently, NCR and so forth, helping them formulate Internet privacy strategies and also how to use information about consumers for part of their contact strategies. In general, the information which is available about consumers in the United States starts from very gross aggregate levels, compiled information which is largely demographic information, and as Ms. Wunderman explained in the session immediately before this one, to a lesser extent psychographic information. You move from that into information which is available from a wide range of public records, such as the ones that Ms. Bruening referred to, and ones that I have personal experience with as being on the receiving side of some of the solicitations for there. That's important because those public records the consumer doesn't have much choice in terms of their participation in those lists, it's an obligatory process. If I want to vote, I have to register to vote, and if I register to vote, those public records are then going to be available for purposes unrelated to my voting, and, you know, that's kind of the way it is. 
There is then a second tier, and that is government supported monopolies, and those monopolies are, because they're either a natural monopoly such as the provision of your gas service or your telephone service, and for instance white pages, telephone white pages are a major source of compiled list information, but there's also government supported monopolies in the form of patent protection and copyright protection, which gives a form of a unique ability to sell a product. So, for instance, if I want to operate with a computer operating system called Windows, I have to support the patent and copyright protections available from Microsoft until those patents run out, and I have to use that information and Microsoft has that and has the opportunity to share that information, if that is their business practice to do so. There is a whole range of different products from drugs that you have to take to the type of services that you buy and so forth, where that government-mandated protection is there. For monopolistic practice it serves a public good in terms of inspiring innovation. The last area is information which is in a much more competitive area. I can go to any of a number of different retailers to buy clothing, for instance, and the retailers when I make that purchase are going to collect various amounts of information. So, if I buy at Sears, that may be a largely anonymous transaction, especially if I make it in a cash basis. If I do it through a credit card, they may have more information, and some retailers through a traditional retail environment such as Radio Shack actually will ask you for information about your name and address, and collect that information online. Other businesses who run their business model through a mail order process such as Lands End and J. 
Crew and so forth become much, much more adept at collecting very specific information about you because what you've bought in the past becomes most predictive about what you will buy in the future. It's dramatically better than demographic information, dramatically better than any information you're going to get from public records. If I bought something from J. Crew in the past, I will be better than any prospect that they can find to buy stuff from them in the future. But there's an opportunity for a consumer to make a choice in those purchases on whether they're going to choose retailer A versus retailer B, and so there's an opportunity for control there. So, in looking at this, I think it's important to look at the spectrum of how that information is collected in terms of the consumer's ability to control the use of that information downstream. MS. ALLISON BROWN: Now that you've heard a brief introduction to the sources of consumer data that businesses use, I'm going to ask our panelists some questions so that we can learn some more specifics. Win, what data elements does your business collect about consumers and how do you collect the information? MR. BILLINGSLEY: Most of us have done a product registration or a software application registration, and it's very important for the manufacturer of that product to get to know who their end user customers are, because all of them distribute their products and services through some intermediary. So, they're really isolated from who their end user customers are. The way they try to solve that problem, and also to provide customer support and service, is through a registration process. So, Naviant provides software that is used by companies that manufacture computer hardware and software products to facilitate that registration. 
So, the data that we collect for the company includes all the information that we've all seen on those product registration forms, but the only data that Naviant really uses that goes forward into a marketing database is the name and the address, and the fact that this is an Internet-enabled household. And that's really what we focus on and what we collect. The other information is analyzed statistically and then passed back to the manufacturer, and they can use it for various business purposes to know who their customers are. So, name and address, and the fact that this individual is Internet-enabled is key to our -- that's where the cycle starts with Naviant. MS. ALLISON BROWN: What other data elements do businesses collect about consumers and how are they collected? Anybody? You can just either raise your hand or put your tent card on its side. Ted? MR. WHAM: Yeah, I forgot the tent card on its side, I don't live in Washington, D.C. That's a rule. Businesses oftentimes have an insatiable demand for information. They would collect as much information as the consumer will spend time to provide for them. In fact, one of the services that I provide to my consulting clients is that I will get the question, How much can we ask on a registration process or in a survey process or through a purchasing application before the consumer is finally going to go, "Aye, I don't want to do this anymore," and will bottom out of that? And they will test that very aggressively and try several different formats. If we ask this extra question, what's going to happen here? If I format this as a drop-down question instead of a radio button, what happens here and so forth. They will collect as much information as they can until they reach a point where the collection of that information degrades completion of the desired task. MS. ALLISON BROWN: Betsy? MS. 
ELISABETH BROWN: One of the things that I didn't go over specifically is that there are lots of sources of public information out there, including the U.S. Census data, which is pretty hot right now since it's been recently updated. Many companies are trying to get at this information because it's a very good source for benchmark information to understand sort of the lay of the land. And when we talk about benchmark information, there's a lot of other domain information, public domain information that is also collected and used by businesses. Just from my experience at Claritas and my experience with some of these customers, they really do use a variety of information for different business purposes, and from what we've seen, we -- at Claritas, we try to assist them by updating the demographic information annually so they do have these benchmarks and we use lots of different input sources, including consumer surveys that are out there, you may have heard of people like Simmons Market Research Bureau, Mediamark, Nielsen Net Ratings, Scarborough, all of these are collected with consumer consent, they're pretty much anonymized in terms of you never really know who these individual consumers are. Basically that data is used and compiled and turned into models that really say if the person is in this demographic characteristic, they have a higher likelihood than average to do these behaviors. Some of the magazine data is used that way as well. You can either use the individual registration data or pretty much the anonymized version which gives you the, quote, profile. So, there are many, many databases that Claritas and other companies produce and put out there, and the only way that information is linked back to a customer record is through an inferred modeling process, which either takes into account what we believe their demographics to be, or something as simple as the zip code or zip plus four in which they live. MS. 
ALLISON BROWN: And can you be a little more specific about the types of information that Claritas gets from surveys, you know, either through Simmons or through its own surveys? MS. ELISABETH BROWN: Depending on the panel, Simmons and Mediamark Research have various surveys that they put out there, some of them are books of information that ask everything from how much peanut butter do you eat a week, to what brands do you prefer, what media you like, how often do you spend in front of the television. A.C. Nielsen actually captures specific readership and views of which television programs and what day parts in terms of which actual physical programs you're watching. And a lot of that data, again, it's all consumers are signing up for these panels. That's the panel type of research. In addition, there's other types of research which is more of the research where you're calling up people on the telephone or just sending them a direct mail package and asking them something more specific about the financial services that they're using, or the types of Internet services they have and that type of nature. Once again, most of this data, what happens is that all the data is collected at a household level, but when it's modeled and analyzed, it's analyzed in terms of demographic characteristics or segmentation codes and not -- those people that participate in the panel, that data is never used for specific marketing purposes back to those individuals. MS. ALLISON BROWN: Thank you. Paula? MS. BRUENING: Yes, I just wanted to talk a little bit about business use of public record information, and clearly the kinds of information that I talked about in my opening remarks are valuable to businesses in their marketing pursuits. The problem comes with the fact that the information has been given up by the individual, is given up so that they can participate, as Ted Wham said, in some very basic functions of life. They want to drive a car, they want to buy a house. 
They've had a baby. Someone's been born or died in the family. Someone's received money in a will. And I think that to say that Well, that's being used for other purposes, and that's just the way it is, I think is a -- is not a really very thorough analysis. I think that if anything, what the information age, computerization, will allow us to do is give us an opportunity to re-examine those uses to decide whether those are appropriate, whether we can limit the access to that information, to the -- to something closer to what the initial collection was intended for. MS. ALLISON BROWN: Are there currently any restrictions on the use of public record data for marketing? Anybody? MR. WHAM: There's one large restriction that I am familiar with and that is recently there was legislation passed at the federal level which gives consumers an opportunity to opt out of having their information about their automobile registration used for marketing purposes. MS. BRUENING: That's opt in. MR. WHAM: Opt in, opt out, excuse me, okay. So, but it was very, very significant, because prior to that legislation 46 of 50 states made their consumer automobile registration information available to the list rental marketplace, and what type of car you own and drive is extremely predictive of your household income. It's one of the most predictive items. And so if I wanted to drive a car in the state of California, I didn't have any choice, that information was going to make it into R. L. Polk's database. That's an example where there have been some restrictions recently. MS. ALLISON BROWN: Michael, I think you've been wanting to say something? MR. PASHBY: I was just going to say the magazines themselves collect a relatively small amount of information about their consumers. The sort of information that they have is the date of purchase, the source of purchase, whether it's by the telephone or from a magazine previously bought, whether it's through direct mail. 
The number of times they've purchased, the value of the purchase. That's the basic information that a single magazine would have, that information can become more valuable if you're a multimagazine publisher or you have other lines of publishing so you can then create a broader profile of the person if they're also buying books or magazines in different interests. But the interesting thing about magazines is that on a -- say a broad interest magazine, one of the seven sisters, when a publisher is trying to promote to the consumer, probably the most useful type of information that the publisher will have is cluster information. If a person is of a certain age and lives in a certain area, that their neighbors may be likely to buy the same magazine. The more specialized you get in a magazine, let's take a woodworking magazine, just because a person lives next door to someone who buys a woodworking magazine, there is absolutely no reason to suppose that the other person would want to buy one. So, the use of data for the small -- the small publisher, the small business, is becoming far more important. We used to have something, until a couple of years ago, called Publishers Clearinghouse and American Family Publishers, which mailed into every household in the country, and the consumer could self select their magazines. Nowadays, those mailings are a thing of the past. And information to a publisher has become far more important, to be able to target their consumers. MS. ALLISON BROWN: Betsy? MS. ELISABETH BROWN: There are fairly significant restrictions on credit card information and data that's used to actually make specific financial offers, from the list compiler companies, like Equifax and Experian. 
And although I don't represent those companies, I'm not well versed in specifically what those criteria are, the financial services companies that we've worked with, they can only use certain information if they're actually making a credit offer, where they are willing to do a pre-approved credit offer, which means that they are going to say because I have pulled this information on you, I'm willing to say that I will guarantee that if I make this offer, you can have this product. And that data cannot be used by another portion of the bank to make another type of offer, whether or not extending credit. So, those protections are in place, I don't have all the details about all the specifics, but it's important to know that they're out there. MS. ALLISON BROWN: Right, and the FTC is very familiar with the Fair Credit Reporting Act and the restrictions on credit data, so that's useful to know, although we are focusing here on data that's not being used for credit decisions. Paula? MS. BRUENING: Yes, I just wanted to go back to the Driver's Privacy Protection Act. I think that that piece of legislation really reflects heightened consumer concern about the incompatible use of this public record information, and it is a response to that. And I think what it does is really offer to individuals who are participating in these basic life experiences, the same kinds of choice that we have come to expect in the commercial realm. We require notice and choice when we're doing business now with a website, or with an organization, and something -- legislation like the Driver's Privacy Protection Act offers that same kind of consumer choice, which I think is critical here. MS. ALLISON BROWN: Ted? MR. WHAM: Just a couple of concepts I would like to throw out there, and I would like to pierce a couple of notions about what's happening with data out there. There is certainly data just being collected in a permissioned basis. 
There is also certainly information which is being collected which is not personally identifiable and is going through a more of an aggregation, a blending type of a process. Ms. Brown talked about some of the practices of Claritas, and Claritas uses largely, if not exclusively, nonpersonally identifiable information available from census tract records from U.S. Government surveys through the census process, but there's an immense amount of data which is collected which is not permissioned in any way, so the consumer is not being asked whether it is okay for that information to be shared with third parties, and there's an immense amount of information which is available that is, you know, personally identifiable and shared with third parties quite readily. So, I would have you think, we have an especially erudite audience in terms of knowing how this process works, although we're all here in this workshop, I think a lot of us have an understanding walking in the door how this process works. But if you thought back to your five most recent purchases, I would suspect that there are very few of us in this room who would know whether the companies with whom they did that transaction have a process of sharing that information with third parties, okay? So, you know, think about what you've purchased most recently, and there are many, many companies who the difference between profit and loss for those companies is made by selling their customer information to noncompetitive businesses who are going to be targeting the same type of business. So, if I'm buying a computer peripheral and it's for an obscure, you know, system, other customers that sell computer peripherals to that same obscure system in a noncompetitive way, can almost invariably buy that information. And the best example that I can give of that is the Bible for mailing lists in the United States, the Standard Rate and Data Service, SRDS. 
I have a friend who is a list compiler, and before this session, I called her and I said, How many pages is that book these days? And the current volume exceeds 3,500 pages. Something on the order of 100,000 distinct mailing lists are available for rental in the United States. Most of those, the majority of those, with distinct personally identifiable information in them. MS. ALLISON BROWN: Win? MR. BILLINGSLEY: I would just like to make one other point and discuss an anomaly that we face in our data collection process, in processing warranty information. Some of that data is collected via a web browser technology, fully Internet-based, and clearly when you collect data using that methodology, it comes under the fair information principles of notice, choice, access, security and enforcement, but there is also a large portion of that data that's not collected using browser-based technology. It's collected using a dial-up, asynchronous modem capability with an application that is loaded in the PC. So, some people would make the contention that since you're not on the Internet, that is offline data. Now, you know, we have struggled with how to deal with that issue, and the way we resolve it in Naviant is we treat data collected by either one of those two methods by the more rigorous online marketing data collection rules, but it is an anomaly that I think should be addressed so that there is clarity provided in how people that try to collect data in an ethical and permissioned way, how they really should operate when they face these kinds of dilemmas. MS. ALLISON BROWN: I do want to go back to some of the specifics about the data that are being collected here. Betsy, you've talked a little bit about census blocks, zip code information, and zip plus four information. Can you give us a sense of how many households are in a census block, versus a zip code block, versus a zip plus four? MS. 
ELISABETH BROWN: Yes, a zip plus four would probably be the lowest level of geography, not even geography, because there aren't boundaries, but the lowest level at which you can compile information that's not at household level. And generally a zip plus four can have anywhere from four to ten households in it. Most of the zip plus four data that gets compiled, they have factors in there whereas if there isn't enough information for a particular variable, that is data-filled so that you don't have any privacy issues. The next level up, a block or block group tends to have anywhere from 250 to 350 households. Zip codes can have anywhere from a few thousand to 25,000. They're not really cohesive types of geographies. And census tracts are anywhere from 1,200 and up. So, low enough levels of geography so that if you're a broad, when you're looking at some of the broad applications that we're talking about, when companies are just trying to understand the lay of the land, for example, generally zip codes, counties, census tracts are a good way for them to really understand what's going on in a marketplace, if they want to enter the marketplace or not. And what we see is that there's different levels of using some of these data. A lot of the clients that we deal with will use a lot of this information for more of their strategic marketing purposes, and when they go out to actually implement a program, they will buy a direct mail list. The attributes that they use to understand their total marketplace may be different than they actually use on the implemented direct mail list. 
And I think Lynn went over that a little bit, which is that what you'll find is that just because they know that a certain demographic characteristic is currently their, quote, best customer, when they actually go to pull the mailing list, there are many different market -- let's say environments that will cause them to maybe change a specific type of demographic that they're going after, or they'll look at a list and they'll find that the people that they most want to attract, let's say for private banking, are not direct marketing type of customers, that they really aren't going to reach them through a direct marketing list. They don't exist much on the list, there isn't enough data on them and they're not really responsive to the list. So, I think that sometimes people believe that these companies have an enormous amount of information, which they do, but in their practice of actually rolling out marketing programs, it's not as succinct as you might think it is, that they know exactly who their targets are and they can then implement against those targets. They have to really use a lot of strategy and analysis to just try to reach the right person. I don't know if that's a -- there's just a lot of different ways you can use that type of information. So, you can move from these geographic levels down to the household level, but you may not have an exact fit when you do that. MS. ALLISON BROWN: And we heard a little bit in the overview about how businesses append data from third party databases. Can anybody give any specific examples of what types of data businesses append to their in-house customer files? Win? MR. BILLINGSLEY: Well, just having a name and address and a flag that says you're an Internet household is not a very effective product in terms of providing marketing lists. 
So, that base core of information is used to do a match with various data compilers and aggregators of information, and then we ingest certain attributes that are associated with that name and address. And some of those attributes -- and there's many -- but it would be things like income range, age range, gender, hobbies, interests, things of that nature, that we use to embellish the marketing file so that we can do selects and generate lists that are targeted for specific products and services. MS. ALLISON BROWN: Does anybody want to add to that? Michael? MR. PASHBY: Generally magazines will append information slightly differently, depending on the type of magazine. A general magazine will probably append more information or have the ability to append more information. I mean, clearly, the very basic information of age, income, family size, gender, is generally available to be appended to the -- to that list, but the more general the magazine, probably the more selections that will be made available. There are a number of companies which will take a magazine list and add information to it, creating that database, and the sort of information that can be appended is everything that's being talked about today. Whether it be the types of cars that people own, when they bought a car, the type of house, the value of the house. There's a lot of information that can be appended, but in general, magazines tend to be the starting -- the starting place rather than the end, with all that information appended to it, because they start -- you're starting with the general interest area, and then it is merged and purged with other lists during the marketing process. MS. ALLISON BROWN: Thanks. Ted? MR. WHAM: A very typical use of appended information is to take a large universe file of all your customers and presume you're a cataloguing business that has, you know, for conversation's sake, a million customers that have done business with you over time. 
You take a statistically representative sample of that, of perhaps 10,000 individuals and you go and append absolutely everything to those 10,000 people you can possibly get your hands on, from income, age, whether they've got children, the age of those children, whether they're grandparents, the type of interests that they have, all of the psychographic information, everything you can get to that. And then you run that against statistical processes and say, Okay, tell me of all of these different processes, which one of these are going to be predictive of the ones I care about most. And as Ms. Wunderman pointed out this morning, different businesses care about different things. Some businesses want lots of transactions, some businesses need to be very concerned about turnover, loss of the customers, some long distance carriers and cellular phone carriers, for instance, are extremely interested to make certain that they're getting customers who are going to stick with them and are not switchers and so forth. And it varies by businesses. Once they identify which of those characteristics are particularly predictive for the customers that they want, they will then go to the remaining universe, those 990,000 names that they never did anything with, and they'll go back to the original appending firm and say, Please append these two or three variables that I want. Much more cost effective than appending all 30 or 50 or 150 variables to the entire universe if only three of those are going to be productive for what you're trying to do. MS. ALLISON BROWN: Betsy? MS. ELISABETH BROWN: Yeah, that's a very good point. I think one of the reasons that Claritas has been in business for 30 years is that one of the things that we have been able to do is boil down a lot of those characteristics into segment codes, which makes it a lot easier. 
I mean, we have seen in the financial services arena about ten years ago, they were one of the first industries to really take customer file records that they have done, they have a very -- financial institutions tend to have a very strong relationship, we talked about what a relationship was, with their clients. There's a lot of trust there that the clients are giving a lot of very in-depth financial information to these companies. Financial services companies are fairly conservative from what we've seen with what they do with the collected information, but in addition, they didn't really have the databases and the software capability to manipulate these gigantic files with so much information that they collect, nor did they have a good way of updating them. So, even with them collecting all of this very personal information, they tended to use companies like Claritas to help them boil it down and understand from a one code type of an aspect what can we know about these people quickly and easily without having to look at 100 or 200 different variables that we've collected over time. So, that's sort of in essence what a cluster code is. The basic information we really need there is just an address that will allow you to say the likelihood is that these people live in an upscale suburban neighborhood or an upscale urban neighborhood. And a real quick example of how that would be used would be if you knew -- if you just had straight demographics on someone and you knew you had two males, 30 years old, and you figured out that they make about $50,000, do they need individual life insurance or not. 
Not quite enough information for you to make a decision on that, one male might be single, doesn't own a home, doesn't really have any dependents, where the other male might have a family with three kids, a house, a mortgage, so having a little bit more rich information on that would make you look at these two similar demographics and say I'm going to offer insurance to the one because they are going to need it and not the other. Or another quick use is if they're only using their internal data and they know that they have got a thousand people who have $5,000 in their checking account and always have had $5,000 in their checking account, by overlaying some of these segment codes, you can get a quick idea that five of those people, that's all they're really ever going to have in demand deposits at a bank, that's really all they're qualified to have, and this segment code would be something like a number, 27, that would represent a string of demographics that would predict that that person is probably in that demographic. And you might find out that half of these people have a very high likelihood for using a loan product. So, if you wanted to offer them another service, you would be better off offering them a loan product than the other half who you would be better off offering an investment product. So, without having to know a ton of personal information, you can at least make some good guesses as to what the next most likely product is to offer those people. MS. ALLISON BROWN: And can you give us a couple of more examples of the segments, I think that Mary in the overview gave us a couple from a newspaper article, I think people might be interested to hear what some of the other ones are and how many there are as well. MS. ELISABETH BROWN: Well, we have -- there are several different segmentation systems, and a segmentation system really starts off as just a predictive model. So, as Ms. 
Wunderman was saying earlier in the session, different industries care about different data. So, a very generic model would be something like our Prism segmentation system that's based on the demographics of where you've settled, where you live, there are several more like that out there in the public domain, and they have -- some of them have nicknames, they tend to be sort of upscale suburban, like Blueblood Estates, Urban Singles, Upscale Urban Singles, Midscale, you know, Urban Dense Areas. So, there's lots of different ways that you can just get a quick snapshot of what the settlement patterns are in that neighborhood. And one of the things that we've -- because these things, as everyone said, as I think Paula was saying earlier, there's different uses for that. It's important to know that you're in a suburban market area if you're trying to sell lawn mowers. You certainly don't want to be offering that to urban upscale singles in high rises. So, some of the data is critically important to some of the things you're trying to sell. It may not be very important at all to somebody who is selling a very targeted niche magazine that could appeal to many different people and has no relationship in terms of a geographic reference. So, there are 62 Prism clusters, which means that we have predicted 62 different neighborhood settlement patterns. Another segmentation system is based more on predicting financial services behavior, or telecommunications behavior. In those segments, there are about 42 of the financial patterns, and they are anything from upscale suburban families with children, upscale suburban singles, upscale urbanites, those type of cluster types or segment types, and that's more based on a specific range of income, asset prediction, age and presence of children. So, those -- they're slightly different, but, you know, basically you can start with anything. 
In our audit of the convergence data, which is the telecommunications, I think we have about 57 different segments and they're based on patterns of usage that we have seen in terms of product usage, and then on the back end, we infer the demographic segment for that. MS. ALLISON BROWN: Ted? MR. WHAM: There's a distinction which might be valuable for the FTC in doing this, there's two major categories of lists that you can consider. One would be compiled list information, the other being response list information. Compiled list information tends to be very broad coverage, it's information about who you are, whereas response list is more information about what you've done, what type of products you've done. So, if I want to buy something that has a very broad geographic coverage because I'm offering a service that has something which is primarily defined upon where people live and the types of birds of a feather flock together type of analogy that is the basis for Claritas' business, then I am going to want that type of a compiled list. If I'm trying to find people who have interest in doing very specific types of activities and so forth, I am going to want to buy lists from similar businesses or businesses that point to similar types of people. Response lists tend to be very narrow. I can't typically take a response list and very effectively use that as an overlay tool against my universe of customers, and say tell me additional things about this, because if I took my, you know, 300,000 customers and matched them against somebody else's 300,000 customers, I might find, you know, 700 that match between those two of them. I would have a rich data set for those, but I wouldn't have enough to make it economically worthwhile to do that. Right now it's very easy to go from the hub out to the spokes. Go to a company that sells a specific product and tell me all of the customers for that product or set of products that they sell. 
It's extremely difficult to say that I want to start at a spoke and tell me all of the hubs that they're attached to, so go to a specific customer and tell me all of the products that they have bought within a category, or perhaps even all the products they have bought. I will say that although you can't do that today, there's an enormous economic potential there, and I am certain that many, many very bright people have spent a lot of time trying to figure out how I can come up with a master universe of all of the computing products that somebody has bought, or all of the clothing purchases that somebody has bought, because if I can do that, and if I'm a marketer selling, you know, an upgrade to a particular type of computer, that's the golden list, and I will spend a lot of money to rent names from that list. MS. ALLISON BROWN: Michael? MR. PASHBY: Yeah. I think in the magazine industry, one of the most important sets of data that can be added to a magazine list is catalog information, and the merging of catalog information, because it does add the recency, frequency and value component to the magazine list. If you go back to the woodworking magazine, a person may buy a woodworking magazine noting that they're interested, but if you can match that with catalog information about the purchase of tools or the purchase of other supplies, and they're showing some frequency there, that separates out the people who are peripherally involved from the high-volume purchasers within that area, and I suppose it also gives a greater degree of value to the broader lists, like a news magazine or a seven sisters magazine, those people may be then segmented into very specific interest areas. So, you have a -- one of the seven sisters, but you can match that with kitchen and food catalogs to show a high interest in cooking. So, it then becomes much more interesting for other marketers, and much more targeted to the consumer. MS.
ALLISON BROWN: And what do businesses do to ensure that the data that you collect are as accurate as possible? Win? MR. BILLINGSLEY: Well, we do several things. Marketing data does not have to be 100 percent accurate to be effective, but you want to make it as accurate as you possibly can, within the economic constraints that you have to deal with. But an example of some of the things that we do to make sure our data are accurate, even if you permissioned us to use your data in a product registration effort, you say yes, I would like to receive offers from third party -- from third party marketers regarding products and services that would be of interest to me. You don't automatically go into Naviant's database just because you have permissioned us. To make sure that we're doing that accurately, we match your name and address against a public data source to make sure that you really are who you say you are. That helps us get out the Donald Ducks and the Roy Rogers and some people who like to play games, but we find the utilization of the public compiled data, a very meaningful tool to ensure that our file is as accurate as it possibly can be. MS. ALLISON BROWN: And can you just clarify what you mean when you say public sources of data and compiled sources of data? Can you be more specific? MR. BILLINGSLEY: Well, I probably misspoke, I probably should have said compiled sources of data which originated from public sources of data. But it's a very effective way to make sure that data is accurate. The other advantage that it holds for us is that we're very sensitive in not collecting data on children, and so by matching the name and a registration with an aggregator's data or a compiler's data, kids don't buy real estate property and cars and things of that nature. MR. WHAM: You haven't met my brother. MR. BILLINGSLEY: So, it gives us a reasonable check to make sure that we're not collecting data on children. 
The other thing that we do to make sure data is accurate is we use the DMA suppression file, and we find that a very effective way to make sure that we don't include data in marketing lists to the people who have gone to the trouble to go to DMA and sign up for either their direct mail suppression file or telemarketing suppression file, and a new product they started just a few months ago which is an email suppression file. So, that's another way to make sure that the data we provide a marketer is accurate. And the third way is the good old U.S. Post Office. All marketers use the NCOA process, or should use the NCOA process. MS. ALLISON BROWN: And what does NCOA stand for? MR. BILLINGSLEY: National Change of Address. And the way that basically works is if you move and you fill out a card at the Post Office so your mail will be forwarded to your new location, that information is collected by the Post Office, and the Post Office has this very large file of people who have relocated that's utilized to redirect their mail. And the Post Office authorizes some 20-something companies to take this data and do a match to make sure that if you have an old address in your file, and you match the old address, then you can substitute the new address. And that's something that's been in existence for a long time, it's been used in the direct marketing world for a number of years. It's a very effective tool to make sure that if you're doing a direct mailing of a marketing list, that the marketing collateral that you're spending hard dollars for to be delivered by the Post Office is truly deliverable. MS. ALLISON BROWN: Thanks. Michael? MR. PASHBY: Some information really has to be accurate. Some years ago I marketed a magazine, which I won't name, but, well, let's say a parents' magazine, and our primary source of readers were parents of newborn children. We were extremely sensitive to the problems inherent in that. 
Somebody's buying lists of potential new births, and some births obviously are not live births, and you are mailing to people saying congratulations, and that can be extremely sensitive, obviously. So, correcting data is very, very important. We spent an awful lot of time and energy making sure that the sources we were compiling that data from were accurate. If we found that there was an incidence of inaccuracy, we would cut off from that source. And we would not buy information from that source ever again. Because of the responsibility to the consumers that we had. MS. ALLISON BROWN: And can you be a little more specific about what the sources of that type of data are? MR. PASHBY: The sources of that data were from -- no, I can't, they were from compilers. It would come from doctors' office visits, from insurance companies, from a lot of different sources, I believe. MS. ALLISON BROWN: And what did you do to make sure it was accurate? How did you gauge that? MR. PASHBY: We would -- we would do it from the complaint level. That was the difficulty. You were doing it after the event, but if one found that there was a degree of inaccuracy there, then we would cut off from that source. MS. ALLISON BROWN: Ted? MR. WHAM: You talk about data quality issues, it's useful to look at it in two different ways. There's the quality of the data at the time that it's collected, and there can be errors introduced through typographical errors, or to purposeful, you know, fraudulence, Mickey Mouse and so forth, but there's also a more significant issue of data decay. Like if I, you know, show up in a database that I'm 25 to 34 years old, how old am I tomorrow? Okay? So, date range information is very inaccurate. Births, deaths, marital status and so forth, and people moving all the time, but we have a very mobile society. 
So, the statistic that I heard, and I can't vouch for this, was that the average data in a database decayed at a rate of about one and a half percent per month, that was the inaccuracy that built up over time. The marketer has an absolute vested economic interest in making sure that that information is as accurate as possible. If it's inaccurate, they can't use it for the goal that they have. So the alignment of the marketer's interest and the consumer's interest in having accurate information is absolutely, I mean, perfectly together. MS. ALLISON BROWN: We have time for one more comment and then we will go to questions from the audience. Betsy? MS. ELISABETH BROWN: One of the things that I wanted to talk about on data accuracy is that from the Claritas standpoint, we've seen a lot of different types of data. We not only use Census data and other public domain data, consumer surveys, which is really self-reported demographic information, but in order to -- as I was talking about implementing, in order to actually implement an actual marketing program, we will take our segmentation codes and place them on list files, such as Acxiom, InfoUSA, Experian and Equifax, and many other compiled lists. What we have found many times, especially when we're using the types of models that I discussed earlier that go down to a more specific household level, in terms of the demographic variables that we say are predictive of the behavior that we're trying to help our customers use, what we find sometimes is that these list sources have, I guess, decay, some other information, missing information, fill-in models, and we will show them that the data that we have proves out that their list is not really distributing the way the U.S. population distributes down to a low level of geography, a zip code, a census tract, a block group.
So that we can take a look at a list of data out there and say you're reporting that only two percent are in the income category, 50,000 plus, and we expect to see more like 27 percent. So, we have actually created models that help some of these list sources to improve their models, their income models or whatever that might be, to base them more on sort of a benchmark of data. So, there's a lot of -- it's sort of a symbiotic relationship, back and forth with Claritas and the list providers, sometimes they actually do change some of their model information on their file based on our information, and other times we just use it to assign what we think is a more appropriate segment code, then they don't necessarily change that source of data, it depends on how they prioritize their models, and they prioritize their input sources. MS. ALLISON BROWN: And I believe that Claritas also updates Census data, how do you do that? MS. ELISABETH BROWN: On an annual basis. We update census data, again, from a list of a lot of sources, some of the postal information, some of the moving information, NCOA. There's a lot of intercensal data that is produced that's not produced on 100 percent factor. In other words, there are many, many counties, communities and states that do many updates of data and information, and we take really whatever we can get that's available and utilize that data. There are also many models that we have perfected over time, and we've been doing this, this is our third census that we've been actually updating information where we just do projections and straight line information based on other data. So, there are many sources that we can use, both census-type sources that we think we can have a high degree, feel that we have a high degree of accuracy in terms -- and relevance, and some of the consumer survey research that's out there just allows you to take a look at shifting data in terms of how people are self reporting where their incomes are. 
And in addition, we do use a lot of the list data just to try to get a handle on which areas are growing. Postal drop rates, I think ADVO counts, which is another list source where they constantly are updating where the postal drops are going. MS. ALLISON BROWN: One thing that becomes clear pretty quickly is how integrated the aggregators are with the sources and how the data sort of rotate in and out of the different databases. Now, as I open up the discussion for questions from the audience, if you have a question you would like to ask, please raise your hand and I will recognize you after one of our staffers comes over with the wireless microphone. Please speak into the microphone while asking your question and state your name and organization before you begin your question so the court reporters can get an accurate transcript of today's proceedings. MR. CATLETT: Thank you, I'm Jason Catlett from Junkbusters. I have a question for Mr. Billingsley. I have an advertisement in a trade magazine from Naviant, it's quite amusing, it shows a biker with tattoos and a beard, and it makes light of the fact that he likes roses, and when you're going online, you might want to -- I infer from this advertisement -- you might want to pitch a banner advertisement for roses. Could you please tell us the process by which when this biker goes online and visits a website the website would know that he likes roses? MR. BILLINGSLEY: Well, I'll talk a little bit more about that this afternoon, if you would like, because we'll talk about how the data is used to administer marketing programs, but basically, we would have business relationships with some of the ad serving companies that collect data anonymously. We would pass data attributes to those ad serving companies anonymously, so that they could then target a banner ad that was appropriate for that particular person, without ever knowing the person's name. MR. CATLETT: Thank you. MS.
ALLISON BROWN: Don't forget to say your name and affiliation for the record. MR. HENDRICKS: Thank you, Evan Hendricks, Privacy Times. I had one question, but first I wanted to follow up on what you said about the babies, because we always wondered about that, a lot of us. So, is it the doctor's offices would sell that information, or the insurance companies were some of the sources for people who are about to have babies? MR. PASHBY: I am not absolutely certain, I believe that was, and this was some time ago. MR. HENDRICKS: But I also wanted to comment, hospitals and birthing classes, and do they sell it to a compiler, is that how it would work? MR. PASHBY: It's my belief that that's how the information was compiled. MR. HENDRICKS: Okay. The other thing is you said that the magazines, I think correctly, are at the front end of this process, much more so than some of the others who are at the back end, and in the UK, on a subscription form, the little cards that you get in your magazine, you have a check-off box, it says if you don't want your name shared, check here, and send it in with your subscription, and one of the big problems in the U.S. is that at the point of the collection of data from individuals, people are not notified what could happen or given the chance to even opt out. And so, do you think that makes sense from a data practices point of view, and do you think that your association is ready to sort of endorse that and recommend it, you know, considering the growing strong feelings about privacy? MR. PASHBY: I think from the standpoint of having to fill in, check a box on a card, what we found in any promotional activity, having the consumer take actions in a promotional activity reduces the response. Therefore, we have cards which are prechecked, and yes I want this magazine, and then all they have to do is tear the card out and put it in the mail. 
But as I mentioned, we also do publish in the magazine the privacy policies and the ability to -- and the ability to call an 800 number or send to the magazine fulfillment house to be taken off the list. MR. HENDRICKS: And of course what I'm describing wouldn't even, I mean someone could still take the card and just throw it in the mail. It's only those people that took the time to look and see that there was a check-off box, and could check off they didn't want their name sold. So, what I'm saying is would it interfere with, you know, with what you're saying? I mean, it wouldn't require the individual to check the box to say I don't want my name sold, it would only be for those individuals that cared enough. And if this is practice -- am I confusing you? You look like you're not following me. MR. PASHBY: I'm saying that any time there is -- you give people the option in a promotion, the response declines. And as we mentioned before, the whole use of information has been more effective and more efficient when we are spending or when businesses are spending 65 cents to a dollar to put a piece of promotion into the mail and you're getting single digit responses, you're trying to be as efficient as possible. MS. ALLISON BROWN: Ted, do you want to comment on that? MR. WHAM: Yeah, I absolutely would. The basic fundamental question is if I -- if consumer X chooses to do business with Business Y, should consumer X have the opportunity to say Business Y, don't contact me. That's question A. And question B is, Business Y, don't share my information with company Z and Z sub one and Z sub two and so forth. I fundamentally reject the notion that a consumer should be able to say I want to do business with a particular company Y, but that company can't follow on and make money out of that relationship. I think that that has terribly negative consequences for the efficiency of economic transactions in this country. 
The reason we don't have mom and pop stores in the United States very successfully anymore and the reason we have Wal-Marts in this country is because they provided a very economically efficient way of delivering low-priced goods in the United States, for better or for worse, but the wheels of that continue to turn by having the businesses be able to use that information in the most effective way possible. MS. ALLISON BROWN: We are trying to stay on a factual level here and stay away from policy discussions. MR. WHAM: I couldn't help myself. MS. ALLISON BROWN: Does anybody else have a question? MR. DIXON: Tim Dixon from Baker McKenzie. A question, just to pick up on that point to take it a little bit further. When we talked, particularly when you mentioned the 30 million permissioned people or households in the database that you've got, what proportion do you know is that people who have done the sort of check box as opposed to the kind of I guess you could call it permission by inertia where they would need to read a privacy policy and then go through an active process of say opting out if they wished to opt out? MR. BILLINGSLEY: I don't know the percentage. We use in collecting the data, and this is primarily a decision that's made between us and the client that we're providing registration services for, we use three different kinds of permissioning processes. I'll try to get through this without confusing myself and the audience, but we use the opt-in process, which we define as a permissioning question with either yes or no, not preselected. We also use the opt-out permissioning process, which is a permission question with yes preselected, and in certain situations, not a lot, we use the explicit process, which basically is a bold statement that says, Do not provide us your marketing information unless you're willing to receive, you know, marketing offers. So, we utilize all three of those, depending upon the circumstance. 
We do flag how the permissioning process worked for that particular consumer, and we are sensitive based on the permissioning process, how that information is used when it is -- when a marketing program is generated based on that permissioning. But the percentage, I don't know the number to be very specific about your question. MS. WOODWARD: My name is Gwendolyn Woodard with Worldwide Educational Consultants. I'm consumer A, and I decide that I'm going to attend a conference, so I go online and complete the form. The site that I'm going to complete the form on has a third party advertising network associated with it, okay? As I complete the form, I notice in the URL the information that I put in the form is reflected up there. So, as a consumer, how would I know how that information is going to be used, what databases will it be going to, especially if this third party advertising network uses a push and pull technology to disseminate that information to different databases? MS. ALLISON BROWN: Does anybody want to take that on? MR. WHAM: It's very useful if you're omniscient. MR. BILLINGSLEY: I'll respond a little more. The -- MR. WHAM: Comprehensively, perhaps. MR. BILLINGSLEY: Yeah. The way it should work, in my opinion, is if you're in that kind of situation where a redirect is occurring, without your knowledge, then the privacy policy should be very explicit in saying -- in discussing the redirect to another website, why that is occurring, what your choices are to either participate in that or not participate in that. And disclosure, in my opinion, is the key for the consumer in understanding what is or is not happening to their data, particularly when you see it in the URL. MS. ALLISON BROWN: And let me just say that that's really a question that should be directed to network advertisers, and none of the panelists up here represent any network advertisers, and it's really a separate issue that we're not addressing today. 
But, you know, that's a question for other people. We are running out of time. Paula, did you want to comment on that issue? MS. BRUENING: No, thanks.