Predictive Analysis – Dataconomy

4 Ways Predictive Data Analytics Changes How Consumers Behave

Aditya Rana — Wed, 07 Dec 2016 08:00:12 +0000

Smartphones have made it possible for businesses to monitor you at all times. Take a company like Google, for example. You may look up a restaurant on Google Search, navigate to it using Google Maps and perhaps check weather and traffic updates along the way. The amount of information you provide to Google here is extensive, and it is a treasure trove in the hands of a data analyst.

Privacy advocates may have their reasons to be concerned about users giving commercial entities like Google access to so much information. This article, however, is about the different ways these tech businesses are transforming billions of data points into something extremely useful, and possibly revolutionary, in the area of predictive data analytics.

Google Play Music

Music apps have long used listening history to recommend new playlists. But Google is doing something more ambitious with its new Play Music. The company recently launched a revamped Play Music that draws on dozens of data sources, with the aim of recommending music more accurately than any other product out there. The main source of information is, of course, the music you have listened to before. But that is not all. Google now uses a host of other factors that influence your music preferences. For instance, you could pick classical music at work, peppy songs during your gym session and perhaps romantic songs while you travel. Google’s machine learning algorithm now combines your music preferences with other signals such as location (at work or at the gym, for example), weather (rainy or sunny) and even details pulled from your email or calendar to find the right playlist recommendation for you.

Uber Restaurant Guide

As a service that, among other things, transports people to and from restaurants, Uber holds valuable data on which restaurants its customers prefer to visit in any given location. Uber is now building a restaurant guide that combines this data with real-time information about the number of drop-offs, the type of vehicle used and trending locations. The number of drop-offs could indicate popularity as well as waiting times, the type of vehicle could be an indicator of how upscale the restaurant is, and trending locations could be used to recommend restaurants to users who do not have a specific destination in mind. As of now, the Uber restaurant guide is only operational in twelve cities across the US, although this number is likely to go up in the future.

Apple’s Siri Experiment

If one product has brought machine learning to the mainstream, it is perhaps Siri. The voice assistant on the iPhone uses deep learning (a branch of machine learning built on multi-layered neural networks) for speech recognition, natural language understanding, execution and voice response. Since it was first built into the iPhone, the software has changed substantially: it now relies on deep neural networks, convolutional neural networks, long short-term memory units, gated recurrent units and n-grams, which together have cut its error rate by a factor of two. Beyond Siri, Apple has also built machine learning into products across its lineup, from showing reminders for appointments you never got around to entering in your calendar, to showing map locations of hotels before you type them in, to detecting fraud on the Apple Store.

Facebook FBLearner Flow

The amount of data stored and processed on Facebook is enormous. The earliest users of Facebook today have over ten years of photos and videos stored on their timelines, all of which need to be pulled up whenever they are requested. Now take into account the more than one billion monthly active users, and the sheer scale of the challenge becomes apparent. Last year, the company made its AI backbone, FBLearner Flow, available company-wide. This platform controls every minute aspect of machine learning and AI within Facebook’s many products. Aside from obvious features like deciding the right kind of content and friends to show on the timeline, FBLearner Flow also includes models for many intricate machine learning programs. For instance, one model helps Facebook provide auto-captioning of videos to its advertisers. Studies have shown that captioned videos bring about higher engagement levels than regular videos and can boost viewing time by as much as 40%. Evidently, such machine learning models are critical in bringing in more advertising revenue. Internal models like these have also helped Facebook reduce its reliance on third-party tools to translate the nearly two billion news feed items each day (Facebook previously used Microsoft Bing’s translation tools for this).

Most of these machine learning innovations are not immediately evident to a lay user and pass as small improvements to the user experience. But in each of these instances, the companies have to deal with millions, if not billions, of data points to analyze, execute, test and relearn concepts. It will be interesting to see where these experiments lead us over the next decade.

Image: kyknoord, CC 2.0

Machine learning can predict Game of Thrones betrayals

Joe Gershenson — Fri, 15 Jan 2016 09:30:35 +0000

A few months ago, Airbnb ran a great post about how its trust and safety data scientists build machine learning models to protect users from fraud by predicting bad actors. As the piece illustrated using Game of Thrones, a highly nuanced model is required to determine something like whether someone is “good” or “evil.” But what if people aren’t just born good or evil? What if they change over time? And wouldn’t it be great if you could not only predict whether or not they would betray you, but answer the question of when they’re likely to do so?

Applying Predictive Models to Sales & Marketing

In the predictive models our team builds for sales and marketing, the challenge of prediction over a time period is especially critical. We’re looking to uncover hidden states that can identify the precise time when someone is getting ready to make a purchase. Inspired by Airbnb, we’ll tackle another machine learning model for fantasy characters, but add a degree of difficulty that’s common in the real world of sales, where you want to know precisely when to reach out to a hot prospect. If you pretend that a potential buyer is actually a citizen of Westeros, and blur the lines of “good” and “evil” in the Airbnb model, you have to consider that everyone is a potential candidate to betray you (aka buy your product) at any time.

So, how can you predict when someone is ready to make their move (or purchase)? Our first challenge is turning our training data — a list of behaviors or activities by different characters — into features that we can process into our models. We’ll start by associating these activities with the characters that are responsible for them.

Behavioral Scoring Approaches

One approach might be to count the total number of activities associated with each character, and use that to train our predictive models (this is similar to the way marketing automation systems score leads). Unfortunately, that won’t allow us to distinguish between activities that occurred in the past vs. recent developments. This is particularly important when trying to predict actions that might occur in the near future.

On the other hand, we could just look at the number of activities that have occurred in the recent past. This definitely helps us keep up-to-date, and solves the problem of ancient data biasing our evaluations. But what if a character hasn’t done anything recently? We’d still like our estimate of their trustworthiness to be influenced by their past actions. And we’d also like to keep some history around, because what seemed like a one-off event in the past may turn into a significant pattern and can shape future decision making.

We can therefore benefit from a hybrid approach. Suppose we combine features in the model that target activities from the entire past with a set of features that target recent data? In addition, we can use a series of windows to treat activities from the recent past differently. That way, we remember what happened three weeks ago, but we don’t give it the same weight as something that happened yesterday.
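The hybrid approach can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `activity_features`, the `WINDOWS` boundaries (last day, last week, last four weeks) and the feature names are hypothetical, not taken from any production model. Each character gets one all-time count plus one count per recency window, so a downstream model can learn a different weight for each.

```python
from datetime import date, timedelta

# Hypothetical recency windows in days: (start, end) covers activities
# that are at least `start` and fewer than `end` days old.
WINDOWS = [(0, 1), (1, 7), (7, 28)]


def activity_features(activity_dates, as_of):
    """Return an all-time activity count plus one count per recency window."""
    # Ignore anything dated after the scoring date to avoid leaking the future.
    ages = [(as_of - d).days for d in activity_dates if d <= as_of]

    features = {"total": len(ages)}
    for start, end in WINDOWS:
        features[f"days_{start}_to_{end}"] = sum(start <= age < end for age in ages)
    return features
```

An activity from three weeks ago still shows up, but only in `total` and in the oldest window, so it does not have to carry the same weight as something that happened yesterday.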

Tracking a Moving Prediction Target

It’s important to remember that the hidden state of a character can change over time. To see how this can impact our prediction target, let’s take a look at an imaginary character’s history:

You can see that in August, our model thinks that he is about to betray us (buy the product) based on his recent pattern of activity. But despite our expectations, he served loyally for months. Of course, he did eventually betray us. Since someone’s internal state (whether they’re ready to betray) can change over time, our model needs to predict whether someone is about to betray us so we know exactly when to reach out to them.

Model Evaluation Considerations: Scoring and Re-Scoring Over a Time Series

In order to know whether our model accurately reflects characters’ motives, each character should always have a score attached to them — our estimate of how trustworthy they are — and that score changes over time. This of course makes our evaluation very complex, since whether we are thinking of a character as “good” or “evil” will change over time, just like their own motives.

Another issue can occur when a score spikes for a while before leveling off again. To mitigate misleading forecasts that might cause us to temporarily mistrust a perfectly loyal character, we need to ensure that our model evaluation function looks at all the scores over time. We should penalize these mistaken scores when we retrain the model, and use them to judge which models are better than others.

To evaluate a model, we’ll just consider the score we assigned to a character every time we scored (every day or every week), and see how well it predicts their actions in, say, the next week. If at the beginning of the week we said a character was likely to betray us, and they betrayed us on that Thursday, that’s a true positive and a victory for our model. If they didn’t betray us until next Thursday, though, we’d consider that a false positive — our model said they would betray us this week, and they didn’t. In that case we’ll also look at the score we gave them the following week.
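As a rough sketch of this evaluation rule, the Python below labels each scoring event against what happened in the following week. The function names, the boolean `predicted_betrayal` flag (a score already thresholded into a yes/no prediction) and the seven-day default horizon are assumptions made for illustration.

```python
from collections import Counter
from datetime import date, timedelta


def label_prediction(scored_on, predicted_betrayal, betrayal_date, horizon_days=7):
    """Classify one scoring event by what happened within the horizon after it."""
    window_end = scored_on + timedelta(days=horizon_days)
    # betrayal_date is None when the character has not betrayed us at all.
    happened = betrayal_date is not None and scored_on <= betrayal_date < window_end

    if predicted_betrayal and happened:
        return "true_positive"
    if predicted_betrayal:
        return "false_positive"
    if happened:
        return "false_negative"
    return "true_negative"


def evaluate(scoring_events, horizon_days=7):
    """Tally outcomes over every (scored_on, predicted, betrayal_date) event."""
    return Counter(
        label_prediction(scored_on, predicted, betrayal_date, horizon_days)
        for scored_on, predicted, betrayal_date in scoring_events
    )
```

Because every scoring event is labeled separately, a character who betrays us next Thursday produces a false positive for this week's score and, if the model still flags them, a true positive for the following week's score.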

Conclusions: What Did We Learn from the Starks?

This fictional example gives you a glimpse into how much thought and expertise should go into evaluating behavior models and coming up with the right metrics to determine the accuracy of their resulting scores. When doing machine learning over a time series, it is especially important to monitor your models and watch for drift. Keep in mind that a model could end up having multiple “false positives” associated with the same character from week to week (i.e. if it kept incorrectly predicting betrayals that didn’t happen), and this would be a clear indication that it’s time for a model refresh.
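One simple way to watch for this signal is to track the longest run of consecutive false positives per character. The sketch below is an assumption-laden illustration: the function name, the `"false_positive"` outcome label and the `min_streak` threshold of three weeks are all hypothetical, and a real monitoring setup would likely combine this with other drift checks.

```python
from collections import defaultdict


def characters_with_repeated_false_positives(weekly_outcomes, min_streak=3):
    """Return characters whose longest run of false positives reaches min_streak.

    weekly_outcomes is an iterable of (character, outcome) pairs in
    chronological order, one pair per character per scoring week.
    """
    current = defaultdict(int)  # length of the run currently in progress
    longest = defaultdict(int)  # longest run seen so far

    for character, outcome in weekly_outcomes:
        if outcome == "false_positive":
            current[character] += 1
            longest[character] = max(longest[character], current[character])
        else:
            # Any other outcome breaks the run for that character.
            current[character] = 0

    return sorted(name for name, run in longest.items() if run >= min_streak)
```

If this list keeps growing from one evaluation to the next, that is a reasonable prompt to retrain the model.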

If you address all of the factors covered above, behavior scoring can be extremely useful for a wide variety of business needs. Knowing when people are going to do something (as opposed to just knowing that they eventually will) is a key to predictive success.
