Data Science Summer School 2016 – Airbnb: Predicting Loyalty


August 4, 2016


Jake Hofman, Louise Lai, Kaciny Calixte, Jacqueline Curran, Erica Ram


Microsoft Research in New York City, SUNY Old Westbury, Manhattan College, Adelphi University


The advent of the sharing economy has redefined the way firms do business, and Airbnb has led this revolution. With a valuation of $25 billion, it has become the world’s third most valuable startup and has more rooms than the world’s largest hotel chain. Historically, customer loyalty was based on experience with a particular firm; in the sharing economy, it is based on experiences with many individuals. With this in mind, we used the Inside Airbnb dataset to investigate this evolving idea of loyalty. Airbnb has both hosts and guests as customers. We define host loyalty as a host renting out their place consistently, and guest loyalty as a guest returning frequently to stay in places listed on Airbnb. We used decision trees to model the loyalty of both hosts and guests. Across industries, market experts rely on measures of recency and frequency to predict loyalty. Our model improves on this baseline by adding features such as review text and amenities. The end result is a model that predicts the return rate of hosts and guests to Airbnb with a high level of accuracy.
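As a rough illustration of the recency and frequency measures mentioned above, the sketch below computes both for a guest from a list of visit dates. The function name, the guest identifiers, and the dates are hypothetical and made up for illustration; they are not taken from the Inside Airbnb dataset, and this is not the project's actual feature pipeline.

```python
from datetime import date


def recency_frequency(visit_dates, as_of):
    """Return (recency_days, frequency) for one customer.

    recency_days: days between the most recent visit and `as_of`
                  (None if there are no visits on or before `as_of`).
    frequency:    number of visits on or before `as_of`.
    """
    past = [d for d in visit_dates if d <= as_of]
    if not past:
        return None, 0
    return (as_of - max(past)).days, len(past)


# Hypothetical visit dates for two guests (made-up data).
visits = {
    "guest_a": [date(2015, 6, 1), date(2016, 1, 15), date(2016, 5, 20)],
    "guest_b": [date(2014, 8, 3)],
}
as_of = date(2016, 6, 1)

# One (recency, frequency) pair per guest; pairs like these could serve
# as baseline input features for a classifier such as a decision tree.
features = {guest: recency_frequency(dates, as_of) for guest, dates in visits.items()}
print(features)
```

Under this baseline, a guest with a low recency value and a high frequency would be the most likely to return; additional features such as review text or amenities would be appended to these two columns.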