Data: just a means to an end

Posted on:
10-22-2020

Everyone is excited about AI. And everyone is accumulating vast stores of data, in part to feed AI’s voracious appetite. But without purpose, process, and precision underpinning those efforts, you could just be drowning in the stuff.

“Data, data, everywhere, and all the servers nixed. Data, data everywhere, nor any process fixed.” – Rime of the Ancient Mariner, er, CTO.

It’s 14 years since Clive Humby coined the phrase “data is the new oil.” Amazingly, it’s still prompting headlines and articles (including this one – ironically). But if it’s really still true – and given oil’s fall from favour over that same period, we might want to rethink it – then we need to factor in how its uses have changed over the intervening decade and a half.

Humby is the British mathematician who set up the agency that created and ran the phenomenally successful Clubcard loyalty scheme for UK supermarket giant Tesco. His full quote gives us a clue about the value of data today: “Data is the new oil,” he told a marketers’ summit at the Kellogg School of Management. “It’s valuable, but if unrefined it cannot really be used. It has to be changed into gas, plastic, chemicals and so on to create a valuable entity that drives profitable activity; so data must be broken down, analyzed for it to have value.”

The emergence of practical artificial intelligence technologies over the past 10 years represents a new way of using data – imagine oil before and after the arrival of the internal combustion engine. The more data you throw at machine learning-based systems, for example, the smarter they look in terms of their predictions or other outputs.

But here’s where things get tricky for the modern IT strategist or data scientist. We’ve created massively data-hungry systems. We can gather and manipulate data in vast quantities. We’ve created some incredibly clever algorithms – and even some basic ‘intelligent’ systems that might use it. But how well are we mapping this ‘oil drilling and refining’ capacity onto real-world problems and interactions?


Ditching the metaphors

The first step is probably to get our definitions right – there’s only so far you can stretch the oil metaphor. Let’s start with what ‘data’ means to people.

One major challenge in these conversations is that ‘data’ comes with a series of biases. To some people, it’s a trigger word for risk – security and privacy, having too much or too little, letting it skew decisions. Security concerns, in particular, can lead to risk aversion. But ultimately, ‘data’ is ones and zeroes, and when it’s just sitting on a hard drive or in the cloud, it’s both neutral and worthless.

But at some point – and in the right contexts – data becomes information (when it’s consumed or processed) and knowledge (when it’s aggregated and used). Even then, there are clear distinctions between useful and worthless information: pertinent in context, or irrelevant to it.


So while all the data we capture – about people, processes and systems – and store reflexively today isn’t in itself worth much, the opportunities created by each organisation’s ‘data lakes’ are huge. But they need cleaning up. We need to understand what can be useful and where; address bad data; and put some order on the good. Developing a holistic view of what the organisation wants to do is also crucial: we can churn through so much more data if the mission is clear and we know what data affects those outcomes.
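What that clean-up looks like in practice will vary, but here is a minimal sketch of the triage step, assuming a hypothetical customers.csv extract (the file and column names are illustrative, not from any real system):

```python
# A minimal data-quality triage pass over a hypothetical extract.
import pandas as pd

df = pd.read_csv("customers.csv")

# Address bad data: drop exact duplicates and rows missing the key field.
df = df.drop_duplicates()
df = df.dropna(subset=["customer_id"])

# Put some order on the good: normalise types, and surface (rather than
# silently discard) values that don't parse.
df["signup_date"] = pd.to_datetime(df["signup_date"], errors="coerce")
suspect = df[df["signup_date"].isna()]
print(f"{len(suspect)} rows have unparseable signup dates and need review")
```

The point isn’t these specific checks – it’s that cleaning the lake is ordinary, inspectable code, driven by whichever fields the mission actually depends on.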


What we do with the data

That sense of purpose is not a given. As data storage has become exponentially cheaper, the idea of having to decide what data to generate, collect and keep has fallen out of fashion. In the earliest days of computing, engineers and programmers were incredibly adept at economising on data – storing two-digit years to save precious bytes, with the Y2K bug an obvious consequence.

Cheap hard drives, and now cloud storage, stopped that being a factor on the ‘supply’ side. On the ‘demand’ side, organisations started to realise how many new and fantastically successful businesses were emerging that were predicated on knowing everything about customers (and other stakeholders).

So a new normal emerged: keep everything. Use what we can. Sell what data brokers are willing to buy. Store the rest until it falls into one of the first two categories. The EU’s General Data Protection Regulation (GDPR) is a direct consequence of that approach to trawling data – and it specifically demands that organisations have a purpose for retaining any data.

A different way of looking at it might be to articulate the outcomes of any activity, then work back. (That’s why the GDPR allows for data retention for specific purposes related to a customer.) To return to the Humby analogy, instead of drilling for oil and spending money storing it, perhaps we should be asking what we use plastics for, say, or how we get around, and designing our approach to match.


Fuelling the process machines

Clarity of purpose is the vital first step. But then we quickly run into the challenge of how we process data to those ends.

A process-first mindset is at the heart of our approach to digital transformation – particularly diagnosing, designing and automating processes to improve efficiency and deliver optimal outcomes. You might argue that processes ought to be data agnostic. But once we move out of the philosophical and into the practical, it becomes clear very quickly that the data an organisation collects, generates, manipulates and deploys not only affects the outcomes of processes – it shapes what those processes are designed to do.

Conversely, great processes can ensure the optimal outcomes are delivered efficiently. But they can’t address a lack of data, the wrong data, or unclear outcomes. This is one way of looking at the importance of strategy and transformation programs to the kinds of IT work we undertake. If you can map your desired outcomes (the strategic intent) and redesign processes to deliver them (the transformation piece), understanding what data you need becomes a lot clearer.

In a call centre, that translates very simply into equipping agents with information about a customer’s context for an enquiry, not just raw data about the things they’re asking about. You turn a 10-minute call spent reciting data back at the customer into an efficient one-minute call that addresses their needs and improves the overall customer experience.
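As a sketch of that distinction – with hypothetical record shapes, since none are specified here – the step from raw data to agent-ready information might look like this:

```python
# Turning raw lookups (every order, every ticket) into the pertinent
# context an agent needs for this enquiry. Record shapes are hypothetical.
from dataclasses import dataclass

@dataclass
class CustomerContext:
    name: str
    open_orders: list
    last_contact_reason: str

def build_context(customer: dict, orders: list, tickets: list) -> CustomerContext:
    # Data becomes information: keep only what matters in this conversation.
    open_orders = [o for o in orders if o["status"] != "delivered"]
    last_reason = tickets[-1]["reason"] if tickets else "none on record"
    return CustomerContext(customer["name"], open_orders, last_reason)
```

The raw records still exist; the process simply stops making the agent – and the caller – wade through them.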

The Humby version? Having a refinery that churns out fuel oils is no good if your customers want thermoplastics or solvents.


Discipline in using data

Knowing what you’d like to happen (or happen better) as a result of using data, and having a clear picture of how processes would operate (or evolve) using it, are laudable objectives. But the world of data still has a problem with neatness. The news often reports AI technologies improving their interpretation of ‘unstructured’ data. But even the structured stuff is often badly sorted, labelled, stored and used.

(Score another for the oil analogy: crude oil is a mess of different hydrocarbons, and data is a mess of the useful and the distracting.)

A glance over at the UK recently highlights a case where imprecision in data caused a major problem. The late-arriving track-and-trace system for Covid-19 only lasted a few days before it was revealed that it had miscounted positive test cases. The reason? The test data was being stored in a .csv file that had been mistakenly imported into Excel – and the cases had been assigned to columns, not rows. The result? After column XFD – the 16,384th, and the last one Excel supports – the data just stopped.
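The missing safeguard is easy to sketch. Assuming a hypothetical results.csv with one record per case, the ingest step can count records at the source and fail loudly, rather than letting a downstream format truncate them silently:

```python
# Count cases in the raw file before any spreadsheet import, and refuse
# to proceed if a column-per-case layout would overflow Excel's limit.
# The file name and layout are hypothetical.
import csv

EXCEL_MAX_COLUMNS = 16_384  # column XFD, where Excel stops

with open("results.csv", newline="") as f:
    case_count = sum(1 for _ in csv.reader(f)) - 1  # minus the header row

if case_count >= EXCEL_MAX_COLUMNS:
    raise RuntimeError(
        f"{case_count} cases would silently lose everything past "
        "column XFD if transposed into Excel columns"
    )
```

A reconciliation check of this kind – records in versus records out – is the sort of precision that turns a silent data loss into a loud, fixable error.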

In more mature knowledge businesses, there tends to be a more sophisticated view of data. Where a business has always relied on it – and traded on turning it into information and knowledge – we typically see a more granular approach. That’s not to say manufacturers and logistics companies, say, are not hungry to use data better. The digitisation of everything over the past couple of decades essentially means turning physical objects into bytes. And in many cases, that’s meant they’ve been able to flow the data into existing processes more simply, partly because they’re designing new data-capture methodologies with more rigour than some legacy systems.


Getting to better

We see issues arise with data in all three areas: purpose, process and precision. And often the information and knowledge created from data by existing processes cause problems too. The data looks good; the processes work; the objective looks fine. But where the output tells a manager they’re spending too much, for example, they might end up using it to justify drastic decisions that are counter-productive.

(Generally, it’s easier to address the right and wrong use of data – and how the processes that use it might get better – than the more visceral concerns about privacy.)

Data lakes we can handle. ‘Knowledge lakes’? Those are more worrying, because they mean a series of assumptions is already baked into processes that have delivered ‘insight’ – and may have created a lot of preconceptions.

It comes back to other themes we see cropping up time and again in our work. By taking a more holistic view of the organisation, and especially how it serves its customers (whether internal or external); understanding how to design processes to meet those needs; and then deciding how to gather and sort the data to make those processes work more smoothly, we can transform organisations and help them get a grip on the great data explosion.
