Why is Big Data big? Obviously not because someone invented a new statistical method of data analysis that can explain everything. It’s because there is a lot more data spewing out of business processes where there was none before. The questions haven’t changed – as a business manager I still want to know how to forecast sales or understand which levers to apply to improve my business outcomes. But where there was once no way to answer these questions, today there actually is data that can be analyzed to provide some of these answers.
So far all the attention has been on business problems whose answers lie in the copiously flowing data from associated business processes. Take Marketing, for example: all Marketing is getting subsumed into Digital Marketing which, as anyone will tell you, is a data gusher.
But there are innumerable questions in business that need answering where the data just isn’t there to analyze. Or rather, the data is there but it is unusable, just beyond reach.
I am part of a non-profit theatre company in the Bay Area. Theatre companies live and die by their ticket sales. All costs are fixed. Once you decide to stage a play, your costs are all locked in. Your revenue, however, is completely variable: it depends on the number of tickets you sell.
In such a business, it is crucial to understand where one stands on ticket sales. In other words, you should be able to answer the following question at all times: “Based upon the ticket sales today, X days before opening night, we are on track to fill Y% of seats.”
Almost all ticket sales are online, which is a good beginning. The online ticket selling service that we use sends us a daily email with our cumulative ticket sales up to that day. But, and here’s the nub, they don’t store a time series of daily ticket sales. So if I wanted to draw a graph with the number of days to opening night on the X axis and cumulative tickets sold (by show, of course) on the Y axis, I’m out of luck.
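If each day’s cumulative figure were saved as it arrived, the graph described above could be rebuilt from those saved values. Here is a minimal sketch in plain JavaScript, under stated assumptions: the snapshot dates, ticket counts, opening night date and the `toSalesCurve` name are all invented for illustration, and nothing here reflects how the ticketing service actually formats its data.

```javascript
// Hypothetical snapshots: one cumulative total saved from each daily email.
// Dates and counts are invented for illustration.
const snapshots = [
  { date: '2024-03-01', cumulative: 12 },
  { date: '2024-03-02', cumulative: 19 },
  { date: '2024-03-05', cumulative: 40 },
];

const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Convert dated snapshots into (days to opening night, cumulative sold) points.
function toSalesCurve(snapshots, openingNight) {
  // Parse both dates as UTC midnight so time zones do not shift the day count.
  const opening = Date.parse(openingNight + 'T00:00:00Z');
  return snapshots
    .map(function (s) {
      const day = Date.parse(s.date + 'T00:00:00Z');
      return {
        daysToOpening: Math.round((opening - day) / MS_PER_DAY),
        cumulative: s.cumulative,
      };
    })
    // Sort so the curve runs from the earliest snapshot toward opening night.
    .sort(function (a, b) { return b.daysToOpening - a.daysToOpening; });
}

// Opening night date is an assumption for this example.
const curve = toSalesCurve(snapshots, '2024-03-15');
```

Each point in `curve` is one dot on the graph: `daysToOpening` for the X axis and `cumulative` for the Y axis. Days with no email simply leave a gap, which is probably better than guessing a value.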
This is not a unique situation. I’m sure you can think of many such examples in your business where you know the data is created but it isn’t kept, or it isn’t in the right form.
Then there’s a form of data that is not captured but can be, with just a little work. At Infosys, when I was trying to wrap my head around how to implement CRM in a company which hadn’t used one for years, I thought that it might be too much to expect the field force to make notes after every client meeting. But perhaps, if they could just log every client meeting, that by itself would be very useful. It would be a measure of business activity which we might be able to correlate with deal value and which could perhaps serve as a rough, early warning forecasting system. There are so many opportunities for squeezing a business process for meaningful data. Analyzing this data typically doesn’t need Hadoop clusters, but the business outcomes could be quite significant.
Think of data as fossils in sedimentary rock. The fossils in the upper layers are newer, better formed and easier to interpret. The ones in the lower layers are just the opposite. But they are just as important to understanding and improving your business.
IT Services companies will see a lot of opportunity in Big Data and Analytics. But software companies will take away most of the value in the top sedimentary layers. The lower layers will be messy. And straightening out messes is where IT Services companies thrive.
Meanwhile, I’ll be trying to sort out my ‘messy’ ticket sales data using Google Apps Script. If anybody knows of a script that will help extract from an email a number that always follows the same string of text, please send me a note. Thanks.
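For what it’s worth, the extraction step itself might be sketched as below, in plain JavaScript of the kind Google Apps Script runs. This is only a starting point under assumptions: the label “Total tickets sold:” and the sample email body are made up, since the real email’s wording isn’t shown here, and `extractNumberAfter` is a hypothetical helper name. Fetching the message from the mailbox is left out.

```javascript
// Hypothetical helper: returns the first number that appears right after
// a fixed label in the email body, or null if the label is not found.
function extractNumberAfter(body, label) {
  // Escape regex metacharacters so the label is matched literally.
  const escaped = label.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
  // Allow optional whitespace, then digits with optional thousands commas.
  const pattern = new RegExp(escaped + '\\s*([0-9][0-9,]*)');
  const match = body.match(pattern);
  if (!match) return null;
  // Drop the commas before converting the captured text to an integer.
  return parseInt(match[1].replace(/,/g, ''), 10);
}

// Made-up sample text; the real email format is an assumption.
const sampleBody = 'Daily summary\nTotal tickets sold: 1,234\nThank you.';
const sold = extractNumberAfter(sampleBody, 'Total tickets sold:');
```

Returning `null` when the label is missing makes it easy to spot the day the ticketing service changes its email wording, rather than silently recording a wrong number.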