Churn Prediction and Prevention in High Value Players

Many approaches to reducing churn are standard operating procedure in consumer-oriented industries such as mobile telephony, credit cards and retail. These include customer lifecycle management coupled with churn management techniques such as loyalty points and free gifts.

Retention is critical in many free-to-play games; put another way, developers need to reduce player churn. So it may seem odd that few churn reduction approaches have been documented within the games industry. This blog post reviews a paper presented by the team at Wooga on one of their attempts at churn reduction. The authors tried to reduce churn by selectively offering certain players free in-game currency to entice them to continue playing. This work was done in collaboration with the Artificial Intelligence Lab at École Polytechnique Fédérale de Lausanne.

Setup
The paper details the authors' attempt to predict and reduce churn for the games "Diamond Dash" and "Monster World". They focused on active high value players, defined as players in the top 10% by revenue contribution within the past 30 days who had played the game at least once in the last 14 days. Churn is defined as not logging in for the next 14 days.
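
The definitions above can be written down as a small labelling sketch. This is only an illustration of the stated rules, not the authors' code: the function names, the data shapes (a revenue dictionary, a list of login dates) and the handling of ties at the revenue cutoff are all assumptions.

```python
from datetime import date, timedelta

def high_value_cutoff(revenue_by_player, top_fraction=0.10):
    """Smallest 30-day revenue that still falls in the top `top_fraction` of players."""
    ranked = sorted(revenue_by_player.values(), reverse=True)
    n_top = max(1, int(len(ranked) * top_fraction))
    return ranked[n_top - 1]

def is_active_high_value(revenue_30d, cutoff, last_play, today, active_days=14):
    """Top-10% spender who played within the last `active_days` days.

    Assumption: players tied exactly at the cutoff are all counted as high value.
    """
    recently_active = (today - last_play).days <= active_days
    return revenue_30d >= cutoff and recently_active

def churned(login_dates, today, window_days=14):
    """True if the player has no login in the `window_days` days after `today`."""
    window_end = today + timedelta(days=window_days)
    return not any(today < d <= window_end for d in login_dates)
```

With these helpers, a training label for each active high value player is simply the result of `churned` evaluated on that player's subsequent logins.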

Predicting churn
The Diamond Dash dataset included 10736 active high value players of which 1821 churned, giving a churn rate of 17%. For Monster World it was 7709 active high value players of which 352 churned, giving a churn rate of about 4.6%. The authors tried different prediction techniques including Support Vector Machines, logistic regression, decision trees and neural networks.

Taking action on the predictions
Giving away free currency was the intervention chosen to reduce churn among the active high value players. The aim was to use a one-time gift to entice players to continue playing. Active players were randomly assigned to three groups: A (40%), B (40%) and C (20%). The actions on the three groups were as follows:
  - Group A high value players were sent Facebook emails and notifications about the free currency AFTER 14 days of inactivity had already occurred.  This is essentially what's commonly known in CRM loyalty circles as a "win-back" approach.
  - Group B high value players who were predicted to churn were sent Facebook emails and notifications about the free currency BEFORE they actually churned.  We can call this the predictive churn management approach.
  - Group C was the control group with no action taken.  
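
The assignment and the three treatment rules can be sketched as follows. This is a hypothetical illustration: the paper does not describe how the randomisation was implemented, and the names `assign_groups`, `action_for` and the action labels are made up for this example.

```python
import random

GROUP_WEIGHTS = {"A": 0.4, "B": 0.4, "C": 0.2}

def assign_groups(player_ids, seed=0):
    """Randomly assign each player to A, B or C with 40/40/20 probabilities."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    names = list(GROUP_WEIGHTS)
    weights = list(GROUP_WEIGHTS.values())
    return {pid: rng.choices(names, weights=weights)[0] for pid in player_ids}

def action_for(group, predicted_to_churn, days_inactive, churn_days=14):
    """Return the intervention for a high value player, or None for no action."""
    if group == "A" and days_inactive >= churn_days:
        return "win_back_gift"       # gift sent after the player has churned
    if group == "B" and predicted_to_churn and days_inactive < churn_days:
        return "proactive_gift"      # gift sent before the predicted churn
    return None                      # group C, or no trigger yet
```

Because each player is drawn independently, the realised group sizes only approximate 40/40/20; an exact split would need a shuffle-and-slice instead.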

Results
The prediction exercise yielded good results. While there were minor differences in performance, all the methods produced good AUC values ranging from 0.76 to 0.93, i.e. they were all good enough to take action on. The final variables used in prediction, known in the machine learning world as "features", varied between the two games: Diamond Dash used the time series of rounds played, accuracy, invites sent, days in game, last purchase and days since last purchase. Monster World used the time series of logins, level, in-game currency 1 balance (WooGoo) and currency 2 balance (Magic Wands).
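
For readers unfamiliar with AUC: it can be read as the probability that a randomly chosen churner receives a higher churn score than a randomly chosen non-churner. The sketch below computes it directly from that definition on toy data. It is a generic illustration, not the evaluation code from the paper, and the pairwise loop is only practical for small samples.

```python
def auc(labels, scores):
    """AUC as the fraction of (churner, non-churner) pairs ranked correctly.

    labels: 1 for churned, 0 for retained; scores: predicted churn scores.
    Ties between a churner and a non-churner count as half a correct pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one churner and one non-churner")
    correct = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(pos) * len(neg))

# Toy example: three of the four churner/non-churner pairs are ordered correctly.
print(auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))  # 0.75
```

An AUC of 0.5 corresponds to random ranking and 1.0 to a perfect ranking, which puts the reported 0.76 to 0.93 range in context.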

To compare the impact of the actions on groups A and B with the control group C, the authors measured the churn rates, click rates and daily revenues of the three groups. The win-back group A saw churn rates increase from 7.8% to 11.06%, an absolute increase of 3.26 percentage points or a relative increase of 42% (3.26 / 7.8). The predictive churn management group B saw churn rates increase from 8.4% to 11.39%, an absolute increase of about 3 percentage points or a relative increase of 36% (3 / 8.4). The control group C saw churn rates increase from 6.77% to 10.29%, an absolute increase of about 3.5 percentage points or a relative increase of 52% (3.5 / 6.77). The final churn rates of the groups were compared using chi-squared tests. The p-values for the A-C and B-C comparisons were 0.57 and 0.41, and the corresponding chi-squared statistics were 0.3196 and 0.667.
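
The arithmetic in this paragraph is easy to reproduce. The sketch below recomputes the relative increases and recovers the p-values from the reported chi-squared statistics. It assumes one degree of freedom, which is what a 2x2 table (group by churned or not) would give; the paper's exact test setup is not restated here, so treat that as an assumption.

```python
import math

def relative_increase(before, after):
    """Relative change from `before` to `after`, e.g. 0.42 for a 42% increase."""
    return (after - before) / before

def chi2_p_value_1df(statistic):
    """Upper-tail p-value of a chi-squared statistic with 1 degree of freedom.

    With 1 d.f. the statistic is the square of a standard normal variable,
    so P(X > x) = erfc(sqrt(x / 2)).
    """
    return math.erfc(math.sqrt(statistic / 2.0))

churn_rates = {"A": (7.8, 11.06), "B": (8.4, 11.39), "C": (6.77, 10.29)}
for group, (before, after) in churn_rates.items():
    print(group, f"{relative_increase(before, after):.0%}")

print(f"A vs C: p = {chi2_p_value_1df(0.3196):.2f}")
print(f"B vs C: p = {chi2_p_value_1df(0.667):.2f}")
```

Under the one-degree-of-freedom assumption the recovered p-values agree with the reported 0.57 and 0.41 to two decimal places.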

The click rates were markedly higher for the predictive churn management group B than for the win-back group A: 10.95% versus 2.16% for email and 17.86% versus 4.06% for Facebook notifications.

Daily revenues for the groups were not disclosed due to security concerns; however, the authors performed t-tests comparing mean daily revenue for win-back versus control and for predictive churn management versus control. The t-statistics and p-values were 1.0067 and 0.29 for the former, and 0.7767 and 0.44 for the latter.

Discussion

One key takeaway here is that churn is a user behavior that can be predicted fairly accurately. Interestingly, the final predictive features differed between the games. One of the issues in applying predictive analysis to games is the time-consuming work of determining which features to use. In machine learning circles this is known as feature engineering.

While the final churn rates of the different groups were not very different from one another, as evidenced by the high p-values of the chi-squared tests, what's interesting to consider is whether the predictive churn management actually mitigated the "natural" churn that would have taken place. One needs to look at how the churn rates changed for each group. Recall that the control group saw churn rates increase by 52%, the win-back group by 42% and the predictive churn group by 36%. One way to interpret this is that the predictive churn group saw a roughly 30% reduction in "new" churn compared to the control group ((52% - 36%) / 52%).

How much is this 30% reduction worth? One way to answer this question is to extrapolate the effect to the entire population of active high value players. In other words, how many active high value players would be "saved" by this predictive churn management? To do this, we need the size of the population of active high value players. The authors were careful not to mention the group sizes as that information is proprietary. Since the churn prediction training data for Monster World had different churn rates from the actual A-B test, we cannot use the 7709 active high value players in the modeling phase as the population. However, based on the chi-squared statistics and assuming the high value players were distributed across groups A, B and C with 40%, 40% and 20% weights, we can estimate that there were approximately 6500 active high value players. If we assume the base churn rate before intervention was 7.8% (taken from group A), then a 30% reduction in "new" churn would save approximately 81 active high value players from churning. For the curious, the formula is 7.8% * (0.52 - 0.36) * 6500.
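
The back-of-the-envelope calculation can be spelled out in a few lines. The inputs are the figures quoted in this post (the estimated population of 6500, the 7.8% base churn rate and the 52% and 36% relative increases); the function names are made up for illustration.

```python
def relative_reduction(control_increase, treated_increase):
    """Share of the control group's "new" churn avoided in the treated group."""
    return (control_increase - treated_increase) / control_increase

def players_saved(population, base_churn_rate, control_increase, treated_increase):
    """Extrapolated number of players kept from churning across the population."""
    return population * base_churn_rate * (control_increase - treated_increase)

print(f"{relative_reduction(0.52, 0.36):.1%}")            # 30.8%
print(round(players_saved(6500, 0.078, 0.52, 0.36), 2))   # 81.12
```

The estimate scales linearly with the assumed population size, so any error in the 6500 figure carries straight through to the number of players saved.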

Since the A-B test involved giving away free currency, we would expect the players in group B to have lower immediate revenue, as they can forgo their usual purchases for a short time. Unfortunately, most revenue-per-user distributions are not normal and so defy many hypothesis tests. As such, the authors had to compare the daily revenues instead. It would have been useful to compare the revenues from the high value players in each group, as the interventions targeted only the high value players.

Higher "action response" rates for the predictive churn management group B than for the win-back group A show that it is easier to reach the user while she is still engaged in the game. While the message may not produce a large immediate impact on daily revenue, it improves the user's perception of the game in that she is being proactively rewarded by the game developer. It would be interesting to compare the overall 30-day revenues of groups A and B after the test period.

The method of action here is to send emails and notifications telling players that they have received free in-game currency. Since these are already payers who are the top 10% of spenders in the game, giving them USD10 worth of free in-game currency might represent only a small "saving" to them. Also, since these players were already willing to spend a lot of money, this free in-game currency did not have contextual meaning to them. In the language of the loyalty marketing crowd, the USD10 of free in-game currency did not engender loyalty because the gift was not viewed as meaningful. As an example, after using Pampers diapers for the first 6 months of my son's life, we received a Pampers reward coupon for a free 20-page photo book. Besides the high value of the loyalty gift (USD29), the photo book is also very relevant to my current circumstances. A possible improvement to the intervention is scarcity: giving players items that are rare and relevant to their particular stage in the game.

In the loyalty CRM world, programs are put in place that emphasize an ongoing relationship with the user. The results of a single A-B test are directionally interesting, but cannot really approximate a comprehensive longer-term communications strategy that minimizes unwanted messages while reducing churn.

This paper focused extensively on high value players, i.e. players who spent a lot of money. Preventing churn in this group is commendable, but there is greater opportunity in the remaining 90% of players, who spend little or nothing. Making a dent in retaining non-paying users, which might lead to more paying users, would provide a bigger win.

There are many other interventions that can be employed once player-level predictions are available. Migrating likely churners to a different game in the same family was suggested by the authors. Others include targeting the non-churning high value players with more promotions, changing the game dynamics for likely churners, etc.

References

Churn Prediction for High-Value Players in Casual Social Games.  Julian Runge, Peng Gao, Florent Garcin, Boi Faltings. IEEE Conference on Computational Intelligence and Games 2014.

Previously published at Gamasutra Blog Site