How clean is clean?

If one buys the conventional wisdom that a Spend Analysis (SA) system is simply a data warehouse/data viewer combination, that doesn’t leave much room for product differentiation. Most data warehouses are essentially the same, and so are most data viewers. This is why SA vendors tend to compete on the quality of their services, not of their software. These services are referred to as “data cleansing,” and sometimes the terms “aggregation” or “enrichment” are used – but basically they boil down to three processes:

  1. Integration of multiple data sources;
  2. Familying and/or grouping of suppliers;
  3. Commodity mapping of spend.

Various approaches to these processes are taken, but most SA vendors use a mix of third-party service providers, some automation where appropriate, and/or internal clerical resources. Much is made in marketing literature of various artificial intelligence techniques, such as the ability to recognize an item description and accurately classify it – but in practice, advanced data cleansing is quite simple to do, requiring only some intelligently-designed tools and some attention to detail.

Sourcing consultants often chuckle about this. Old hands at the consulting business know that applying common-sense 80-20 rules to spend categorization (for example, noting that at many companies 90% of spend occurs with the top 200-300 suppliers) makes the “cleansing” process quite straightforward. Building an actionable spending dataset by hand, as these consultants used to do back in the early ’90s (and still do today, in many cases), is just a matter of rolling up one’s sleeves and hacking at the problem with Access, Excel, and a modicum of intelligence. Billions have been saved using hand-built one-off spend datasets.
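To make the 80-20 point concrete, here is a minimal Python sketch of the kind of check a consultant might otherwise do in Excel: rank suppliers by total spend and count how many it takes to reach a target share. The function name, the (supplier, amount) input shape, and the sample figures are all hypothetical, chosen only for illustration.

```python
from collections import defaultdict


def suppliers_covering(transactions, target_share=0.9):
    """Return (top_suppliers, covered_share): the fewest top-ranked
    suppliers whose combined spend reaches target_share of the total.

    transactions is an iterable of (supplier, amount) pairs
    (an assumed, simplified input format).
    """
    # Total spend per supplier.
    totals = defaultdict(float)
    for supplier, amount in transactions:
        totals[supplier] += amount

    grand_total = sum(totals.values())
    if grand_total == 0:
        return [], 0.0

    # Largest suppliers first.
    ranked = sorted(totals.items(), key=lambda kv: kv[1], reverse=True)

    covered = 0.0
    top = []
    for supplier, spend in ranked:
        if covered >= target_share * grand_total:
            break
        top.append(supplier)
        covered += spend
    return top, covered / grand_total


# Made-up sample data: four suppliers, 1,000 in total spend.
sample = [("A", 500), ("B", 300), ("A", 100), ("C", 60), ("D", 40)]
top, share = suppliers_covering(sample, target_share=0.8)
print(top, share)
```

If a short list of suppliers covers most of the spend, mapping just those suppliers by hand already classifies most of the dataset.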

I recall a 2001 meeting in which a start-up technology group tried to sell us their “AI” techniques for spend classification. Our sourcing consultants, veterans of McKinsey, A. T. Kearney, and The Mitchell Madison Group, and thought leaders on spend analysis since the late 1980s, listened politely and showed them the door. So did everyone else in the space at the time. Ultimately, these technologists formed their own SA venture and sold it to a very large company that should have known better.

I also recall a 2002 conversation between two e-sourcing companies, sponsored by a common investor. After a rocky start to the meeting, all the puffery and pretense about “proprietary methods” and “proprietary databases” for spend analysis suddenly dissolved into smiles when both parties to the meeting realized that everyone in the room already knew the “secret sauce” behind commodity mapping: map the GL codes, map the suppliers, then map GL + supplier to catch suppliers who sell more than one commodity. See for a more complete description of this core methodology that’s been in use since 1994.
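The three-layer mapping just described can be sketched in a few lines of Python. This is an illustration under stated assumptions, not any vendor's implementation: the dictionary-based rule tables, the field names gl_code and supplier, the "Unmapped" fallback, and the sample GL codes and commodities are all hypothetical. It also assumes the most specific rule wins, so a GL + supplier rule overrides a supplier rule, which in turn overrides a GL rule.

```python
def classify(txn, gl_map, supplier_map, gl_supplier_map):
    """Return a commodity for one transaction using three rule layers.

    Assumed precedence (most specific first):
      1. GL code + supplier  (suppliers who sell more than one commodity)
      2. supplier alone
      3. GL code alone
    Anything that matches no rule is reported as "Unmapped".
    """
    key = (txn["gl_code"], txn["supplier"])
    if key in gl_supplier_map:
        return gl_supplier_map[key]
    if txn["supplier"] in supplier_map:
        return supplier_map[txn["supplier"]]
    return gl_map.get(txn["gl_code"], "Unmapped")


# Hypothetical rule tables.
gl_map = {"6100": "Office Supplies", "6200": "IT Hardware"}
supplier_map = {"IBM": "IT Hardware"}
gl_supplier_map = {("6300", "IBM"): "IT Consulting"}

transactions = [
    {"gl_code": "6100", "supplier": "Acme Paper"},  # matched by GL rule
    {"gl_code": "6200", "supplier": "IBM"},         # matched by supplier rule
    {"gl_code": "6300", "supplier": "IBM"},         # matched by GL + supplier rule
    {"gl_code": "9999", "supplier": "Unknown Co"},  # matched by no rule
]

for txn in transactions:
    print(txn["supplier"], "->", classify(txn, gl_map, supplier_map, gl_supplier_map))
```

Keeping the rules as plain lookup tables means the people who know the data can read and edit them directly, without touching the classification logic.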

Which brings us to the next guiding principle of Web 2.0 Spend Analysis:

  • Data cleansing can and should be done by end users themselves, not by spend analysis vendors; or, if insufficient resources exist to perform the initial cleansing, subsequent data maintenance should be performed by end users.

Business users know their own data and their own vendors best. Armed with appropriate and user-friendly tools, there’s no reason why they shouldn’t be empowered to create, clean, organize, and maintain their own datasets.

As Mike Smith from Hanover Insurance remarked to Debbie Wilson in a Cool Tools article earlier this year, “We know our business much better than any consultant, and that knowledge allows us to write much more granular rules. Here’s an example: everybody knows what IBM does. But does everybody understand how I use IBM?”

Tomorrow: “Change” does not equal “Refresh”
