In this era of “Big Data,” I often hear terms like data and information used almost interchangeably, so I thought I’d write an explanation of these terms and also cover what should be the ultimate goal of data analysis – “insight”.
Data? Big or Small.
Data is raw. It has no meaning on its own. On a website, it is the IP address of a visitor. For weather, it is the temperature, air pressure, humidity, and so on. For a class of students, it is their individual grades.
And for the grammar police out there, it is now acceptable to use the term data as either a singular or a plural noun. Yes, I know standards have fallen and the world is coming to an end, but who wants to use the word datum anyway?
One of my favourite “new words” that has emerged in recent times is “datafication.” Remember digitisation? When we converted our analog music recordings into digital versions? Or scanned our old photographs and saved them on computers as digital copies? Well, datafication is the process of storing digital copies of raw data that is currently not being recorded – because we can and because you never know what may or may not be useful. This is the original definition of the term, although more recently it has been hijacked to mean the turning of a business into a “data business” (whatever that is).
As we begin to see the rise of the “internet of things,” datafication will become increasingly commonplace and popular. It is already being used in the agricultural industry, where farmers can collect hundreds and thousands of data points about their land covering soil moisture, air temperature, nutrient levels, etc.
And these same devices are being rolled out to consumers right now. Yes, you can collect soil moisture levels of your house plants if you like, but you are also datafying your movement and sleep patterns with products like FitBit, or doing something similar for your dog with a product called FitBark.
How information is created
So data is raw, unorganised and useless on its own. It is only when we start to process and organise this data that it starts to become information. This is when we interpret the data and give it meaning.
We can analyse web visitors and turn those IP addresses into geographical locations and map where our visitors come from, or calculate the historical average temperature for a location based on past weather data, or simply report on the average grades of those students we mentioned earlier. If you use FitBit, you can see how many steps you’ve taken each day and have that converted into miles walked, or understand how frequently you awaken in the night by processing the data about your sleep.
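To make that step from data to information concrete, here is a minimal sketch in Python. The grades, the step counts and the stride length are all made-up assumptions for illustration; a real tracker would calibrate stride per user and does not necessarily work this way.

```python
from statistics import mean

# Hypothetical raw data: individual grades for a class of students.
grades = [72, 85, 90, 64, 78]

# Hypothetical raw data: daily step counts from a fitness tracker.
daily_steps = [8200, 10450, 6300]

# Assumption: an average stride of 0.762 m (roughly 2.5 feet).
STRIDE_METRES = 0.762
METRES_PER_MILE = 1609.344


def steps_to_miles(steps, stride_metres=STRIDE_METRES):
    """Convert a raw step count into an approximate distance in miles."""
    return steps * stride_metres / METRES_PER_MILE


# Processing the raw numbers is what turns them into information.
average_grade = mean(grades)
miles_per_day = [round(steps_to_miles(s), 2) for s in daily_steps]

print(f"Average grade: {average_grade}")
print(f"Miles walked per day: {miles_per_day}")
```

The raw lists on their own tell you very little; the average and the distances are the organised, interpreted version.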
But that is not insight.
All of this is information, but none of it is insight (or “insights”, as you sometimes see it).
We gain insight when we draw conclusions from data and information. And this is really the ultimate goal for data analysis and the world of “big data”.
Knowing the average temperature for your location might be helpful, but knowing that the average is increasing steadily every five years gives you a better understanding and puts the information into context. And that context is really only relevant to you or your business – information about buying habits of FMCG consumers will provide very little insight about the customers of your small B2B accountancy practice.
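As a rough illustration of the difference between an average and a trend, the sketch below fits a straight line through a handful of five-yearly averages using ordinary least squares. The years and temperatures are invented for the example, and `trend_per_year` is a hypothetical helper name, not part of any library.

```python
def trend_per_year(years, averages):
    """Least-squares slope: average change in the value per year."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(averages) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, averages))
    variance = sum((x - mean_x) ** 2 for x in years)
    return covariance / variance


# Hypothetical average temperatures (degrees C) for one location.
years = [2000, 2005, 2010, 2015, 2020]
avg_temps = [14.1, 14.3, 14.6, 14.8, 15.1]

overall_average = sum(avg_temps) / len(avg_temps)  # information
slope = trend_per_year(years, avg_temps)           # a step towards insight

print(f"Overall average: {overall_average:.2f} C")
print(f"Change per five years: {slope * 5:+.2f} C")
```

The overall average is a single number with no context; the slope tells you which way things are heading, which is what lets you draw a conclusion.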
There are two ways to go about gaining insight. First, you go looking for it. Create a hypothesis and seek to prove or disprove it – scientifically. This would start from a question like “what effect do social media mentions have on my sales orders?”. You then ensure that you are collecting all the relevant data, organise it and then look for causation and correlation.
This is the pretty standard approach to data analysis but it has some flaws. The main one is that you don’t know what you don’t know – in other words, you will only find what you are seeking. What if social media mentions didn’t have a direct impact on orders, but did have a positive impact on sentiment, affiliate activity, or even contract extensions/renewals? If this wasn’t the outcome you were seeking, you may have missed it.
Correlation is all that matters
When it comes to Big Data, there is another school of thought – one where causation is ignored or deemed irrelevant and where correlation is the only thing that matters. New data tools are being released that focus on correlation analysis, which is the process of analysing data with little regard to logical relationships and focusing purely on relationships within the data.
A classic example of this comes from Walmart. When they applied correlation analysis to their massive sets of data, they identified that sales of Pop-Tarts increased in areas that had been issued with a hurricane warning. Now who would have hypothesised that relationship? This insight was clearly “actionable”, and so they monitored the weather, moved the product to more prominent positions and increased sales further.
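To give a feel for what a hypothesis-free correlation scan looks like, here is a small sketch that computes the Pearson correlation for every pair of columns and ranks the pairs by strength. The column names and weekly figures are entirely made up, and this is not a description of how Walmart or any particular tool does it; real tools work at a vastly larger scale.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series has no defined correlation
    return cov / sqrt(var_x * var_y)


def rank_correlations(columns):
    """Score every pair of columns, strongest relationship first."""
    scored = [
        (name_a, name_b, pearson(a, b))
        for (name_a, a), (name_b, b) in combinations(columns.items(), 2)
    ]
    return sorted(scored, key=lambda item: abs(item[2]), reverse=True)


# Hypothetical weekly figures for one store (1 = storm warning issued).
weekly = {
    "storm_warning": [0, 0, 1, 0, 1, 0],
    "snack_sales": [100, 104, 180, 98, 175, 102],
    "stationery_sales": [50, 47, 52, 49, 48, 51],
}

ranked = rank_correlations(weekly)
for name_a, name_b, r in ranked:
    print(f"{name_a} vs {name_b}: r = {r:+.2f}")
```

Notice that nothing in the code asks why two columns move together. It simply surfaces the strongest relationships and leaves it to you to decide whether they are actionable.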
Umm, so what is insight again?
As we’ve discussed, data is the raw product that, when structured and organised, becomes information. This information is useful as it is – but when we are able to extract something that may lead to competitive advantage – something novel and profound – then we are gaining insight. And this should be your ultimate aim with any and all of your information systems right now: actionable insight.