Below is an article we published in the run-up to Jefferson’s presentation at the IRM data management event in London on 7-9 Nov 2011.  It summarises the main points from the predictive analytics series we’ve been running over the summer and we thought it might be useful to post it here as well; if you only end up reading one from the series, this is the one to read.  Please let us know if you find it useful, so we can decide whether to do similar summaries on future series.

Thanks, Jefferson and Nicky


Many companies already run well-controlled, lean processes, so they are increasingly turning to their data as a new source of competitive advantage. Predictive analytics has been around for some time, but is only now becoming mainstream. Many businesses are unsure of what it is or where it fits into an overall information strategy, so in this article we take an introductory look.

What is predictive analytics?

With predictive analytics the aim is to predict future behaviour or events using the data that you already hold about things that have previously happened. For example, based on your previous journeys from home to your office, you may know that the drive usually takes 40 minutes if you leave after 8:00am, as opposed to 20 minutes, half the time, if you leave before 7:30am.

Based on the analysis of historical data (the correlation between the time you leave and the length of the journey) you can predict that if you left tomorrow morning at 7:20 you would arrive at 7:40. If you weren’t a ‘morning person’ but you had an important business meeting at 7:45 that you couldn’t be late for, this simple predictive model would be extremely valuable to you because it would allow you to get an extra 20 minutes in bed.
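To make the journey example concrete, here is a minimal sketch in Python. The journey log is invented for illustration, and the single cut-off at 7:30 is our simplifying assumption (the example above says nothing about departures between 7:30 and 8:00). The function names are hypothetical too.

```python
from statistics import mean

# Hypothetical journey log: (departure time as "HH:MM", minutes taken).
JOURNEYS = [
    ("07:10", 19), ("07:20", 21), ("07:25", 20),
    ("08:05", 41), ("08:15", 39), ("08:30", 40),
]

CUTOFF = "07:30"  # assumed boundary between the 'early' and 'late' bands


def to_minutes(hhmm):
    """Convert "HH:MM" to minutes after midnight."""
    hours, minutes = hhmm.split(":")
    return int(hours) * 60 + int(minutes)


def predict_duration(departure, journeys=JOURNEYS):
    """Average the durations of past journeys that left in the same band."""
    is_early = to_minutes(departure) < to_minutes(CUTOFF)
    same_band = [
        duration
        for left_at, duration in journeys
        if (to_minutes(left_at) < to_minutes(CUTOFF)) == is_early
    ]
    return mean(same_band)


def predict_arrival(departure):
    """Predicted arrival time as "HH:MM"."""
    total = to_minutes(departure) + round(predict_duration(departure))
    return f"{total // 60:02d}:{total % 60:02d}"


print(predict_arrival("07:20"))  # 07:40
```

With this made-up log, leaving at 7:20 gives a predicted arrival of 7:40, which matches the reasoning in the paragraph above.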


The role of data mining

Typically the relationships in the data are not as simple as they are in the car journey example, so we need some help to identify patterns in the historical data. That help comes from data mining. Data mining means applying mathematical formulae to a set of data, with the aim of identifying relationships between variables in the data that are not obvious at first glance.

The mining is what tells us the equivalent of ‘If you leave before 7:30, it will take you 20 minutes; after 8:00, it will take you about 40 minutes’. Someone with the skill and knowledge to interpret the output from the mining can draw insights that greatly improve their ability to achieve their goals, whether that is to target new customers more effectively, identify fraudulent insurance claims or simply get home to see their family on time.
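As a rough illustration of what ‘mining’ a rule from the journey data might look like, the sketch below searches for the single departure-time threshold that best separates short journeys from long ones. This is a one-split regression tree (sometimes called a decision stump), chosen by us as one simple technique among many; the data is invented, departure times are assumed to be distinct, and all names are hypothetical.

```python
from statistics import mean

# Hypothetical journey log: (departure in minutes after midnight, minutes taken).
JOURNEYS = [(430, 19), (440, 21), (445, 20), (485, 41), (495, 39), (510, 40)]


def squared_error(values):
    """Sum of squared differences from the mean of the values."""
    centre = mean(values)
    return sum((value - centre) ** 2 for value in values)


def best_split(journeys):
    """Return (threshold, mean_before, mean_after) with the lowest total error."""
    ordered = sorted(journeys)
    best = None
    for i in range(1, len(ordered)):
        before = [duration for _, duration in ordered[:i]]
        after = [duration for _, duration in ordered[i:]]
        # Place the threshold halfway between two neighbouring departures.
        threshold = (ordered[i - 1][0] + ordered[i][0]) / 2
        error = squared_error(before) + squared_error(after)
        if best is None or error < best[0]:
            best = (error, threshold, mean(before), mean(after))
    return best[1:]


threshold, before, after = best_split(JOURNEYS)
print(threshold, before, after)
```

On this toy data the search settles on a threshold of 465 minutes after midnight (7:45), with an average of 20 minutes before it and 40 minutes after it. Real data mining tools apply the same idea across many variables at once.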


Predictive analytics business applications

Many people first heard about predictive analytics in the financial services industry, where it is used in applications like credit scoring. Information such as someone’s repayment history is used to predict whether they will keep up future payments on a loan. Predictive analytics is also used to help identify financial cybercrime and fraudulent insurance claims.

In fact, predictive analytics is used for many purposes. A major application in many industries is direct marketing and customer relationship management. For example, marketers use predictive analytics to predict who is likely to respond to particular offers. The prediction is often more accurate with existing customers because there is more historical data from which to predict, and this is one of the reasons why loyalty cards have become so popular with businesses in recent years. The personalised sales data collected via a loyalty card scheme enables a business to spend its money only on sending customers offers that the predictive model says they are likely to accept. The business reduces the money it wastes on sending offers that are not of interest, and also avoids irritating its customers with what they perceive as ‘junk mail’.

To make this real, here’s an example from a company we’ve worked with in the telecoms and media sector. They worked hard to build up what many companies call a ‘single view of customer’; in this case that means the ability to understand someone as a member of a household. One of the things they wanted to achieve was to decrease the churn of profitable customers.

When they mined their data, one of the patterns they discovered was that households containing a father and at least one child, and which had a sports TV package, had a much lower churn rate than similar households without sports TV. Thanks to that insight they began a successful campaign offering heavily discounted sports packages to households that fitted that profile, because the lower revenue achieved from the discounted TV package was more than offset by the increase in revenue and profit from retaining such customers for longer.
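The comparison behind that insight can be sketched in a few lines of Python. The household records below are entirely made up, and the field layout and function name are our own; the company’s real data and tooling were of course far richer.

```python
# Hypothetical household records:
# (has_father_and_child, has_sports_tv, churned)
HOUSEHOLDS = [
    (True, True, False), (True, True, False), (True, True, False), (True, True, True),
    (True, False, True), (True, False, True), (True, False, False), (True, False, True),
    (False, True, False), (False, False, True),
]


def churn_rate(households, has_sports_tv):
    """Churn rate among father-and-child households, with or without sports TV."""
    group = [
        churned
        for fits_profile, sports, churned in households
        if fits_profile and sports == has_sports_tv
    ]
    return sum(group) / len(group)


print(churn_rate(HOUSEHOLDS, has_sports_tv=True))   # 0.25
print(churn_rate(HOUSEHOLDS, has_sports_tv=False))  # 0.75
```

In this toy data, households matching the profile churn far less when they have sports TV. That is the kind of gap that would justify testing a discounted sports package.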


Predictive analytics techniques

Two other well-established techniques that companies apply to the data they already hold are ‘customer segmentation’ and ‘next best offer’.

Customer segmentation involves breaking up a population into ‘segments’. The segments are often based on demographic data and are derived by data mining. Each segment groups together data points that show similar behaviour. For example, in the travel industry, such a model may show a strong correlation between people aged 21 to 30, with an income above £40,000, who are married or in long-term relationships, and historical sales of long-haul package holidays to luxury resorts in exotic locations. If a marketer knows that many people with these characteristics purchase luxury holidays, they can target this segment of the population with luxury holidays and expect to sell more of them than if they simply sent brochures to the population at random.
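Here is a small sketch of how the purchase rate per segment might be compared. Note the simplification: the segment boundaries below are fixed by hand to mirror the travel example, whereas a real data mining tool would discover them from the data. The customer records and all names are hypothetical.

```python
from collections import defaultdict

# Hypothetical customer records:
# (age, income in GBP, in_relationship, bought_luxury_holiday)
CUSTOMERS = [
    (25, 45000, True, True), (28, 52000, True, True), (23, 41000, True, False),
    (27, 30000, True, False), (45, 60000, True, False), (35, 48000, False, False),
    (29, 43000, False, False), (52, 25000, False, False),
]


def segment_of(age, income, in_relationship):
    """Assign a customer to a hand-defined segment (an assumption, not mined)."""
    return (
        "21-30" if 21 <= age <= 30 else "other age",
        "over 40k" if income > 40000 else "40k or less",
        "couple" if in_relationship else "single",
    )


def purchase_rate_by_segment(customers):
    """Share of customers in each segment who bought a luxury holiday."""
    counts = defaultdict(lambda: [0, 0])  # segment -> [buyers, total]
    for age, income, in_relationship, bought in customers:
        tally = counts[segment_of(age, income, in_relationship)]
        tally[0] += bought
        tally[1] += 1
    return {segment: buyers / total for segment, (buyers, total) in counts.items()}


rates = purchase_rate_by_segment(CUSTOMERS)
best_segment = max(rates, key=rates.get)
print(best_segment)  # ('21-30', 'over 40k', 'couple')
```

The marketer would then send the luxury brochure to customers in `best_segment` rather than to everyone.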


‘Next best offer’ is applicable to any industry in which interaction with customers is via multiple channels in near real-time. Typically the channels would be a call centre and the Internet. Let’s say that someone has just phoned a call centre and booked two return flights from London to New York, flying out on Friday and back on the following Monday. Based on historical bookings of similar flights, the predictive analytical system may score the next best offer as a hotel room in New York for two from Friday until Monday. The call centre can immediately offer to book a hotel in New York for the customer but let’s say that in this case the customer declines and politely ends the call.

The customer then visits a website that the travel company sponsors and is recognised as the person who recently made a purchase on the phone. The predictive analytical system knows about the flight purchase and the declined hotel offer, and scores the next best offer as car hire from JFK airport. In real-time, it makes a discounted car hire offer in the advertising banner on the website.

Let’s say the car hire offer is also declined. The process of re-scoring the ‘propensity’ of that person to buy continues, and further personalised and relevant offers are made in real-time on the next web page. As you can imagine, this has only become possible because of recent improvements in master data management.
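The loop of offering, recording a decline and offering again can be sketched as follows. The propensity scores are invented, the third offer (an airport transfer) is our own addition for illustration, and re-scoring is reduced here to removing declined offers from the running. A production system would recalculate the scores from everything it knows about the customer.

```python
# Hypothetical propensity scores (0 to 1) for a customer who has just
# booked return flights from London to New York.
BASE_SCORES = {
    "hotel": 0.60,
    "car_hire": 0.45,
    "airport_transfer": 0.30,
    "travel_insurance": 0.25,
}


def next_best_offer(scores, declined):
    """Return the highest-scoring offer not yet declined, or None if none remain."""
    remaining = {offer: score for offer, score in scores.items() if offer not in declined}
    if not remaining:
        return None
    return max(remaining, key=remaining.get)


declined = set()
offers_made = []
for _ in range(3):  # call centre, web banner, next web page
    offer = next_best_offer(BASE_SCORES, declined)
    offers_made.append(offer)
    declined.add(offer)  # assume the customer declines each time

print(offers_made)  # ['hotel', 'car_hire', 'airport_transfer']
```

The key design point is that the list of declined offers follows the customer across channels, which is why a joined-up view of the customer matters so much.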


Using predictive analytics as part of a data strategy is an extremely effective way of using the data a company already holds to directly improve its business performance. Another great benefit is that its impact can be seen and appreciated by everyone in a business, not just the data team!

More advanced applications of data such as predictive analytics are a clear next step for organisations which already have the basics in place. The rewards are much easier to reap if the underlying data is managed in an agile and business-responsive way.