Enabling bookings cancellation prediction with data science



Booking cancellations severely impact demand-management decisions by limiting the production of accurate forecasts, a critical tool for revenue management performance. To mitigate this limitation, hotels implement rigid cancellation policies and overbooking strategies (Smith, Parsa, Bujisic, & van der Rest, 2015; Talluri & Van Ryzin, 2005), which can in turn hurt revenue, damage social reputation, and degrade the hotel's business performance. Most studies on the prediction of booking cancellations treat it as a regression problem (forecasting the total number of cancellations) rather than as a classification problem (predicting which bookings are likely to be cancelled) (Antonio, Almeida, & Nunes, 2017, 2016). Although Morales & Wang (2010) stated that “it is hard to imagine that one can predict whether a booking will be cancelled or not with high accuracy” (p. 556), the application of data science tools such as machine learning, statistics, data mining, and data visualization now shows that this assertion is no longer valid. Using data from four hotels’ Property Management Systems (PMS), this study shows that it is possible, in a research environment, to build models that predict with high accuracy which bookings are likely to be cancelled and, from those predictions, to calculate the net demand for each future date. It also shows that such models can be implemented in a production environment. After a working prototype was deployed in two hotels, preliminary results demonstrate its viability for real work environments and its value as a tool for room pricing and inventory allocation optimization decisions.
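
To illustrate how booking-level classification can feed a net demand figure, the following is a minimal sketch and not the study's implementation. It assumes each booking holds one room and already carries a cancellation probability produced by some classifier; the function name `net_demand`, the 0.5 threshold, and the sample data are all hypothetical.

```python
from collections import defaultdict
from datetime import date


def net_demand(bookings, threshold=0.5):
    """Return {arrival_date: bookings on the books minus likely cancellations}.

    bookings: iterable of (arrival_date, cancel_probability) pairs.
    threshold: assumed cut-off above which a booking is classified
               as likely to cancel (an illustrative choice).
    """
    on_books = defaultdict(int)
    likely_cancelled = defaultdict(int)
    for arrival, p_cancel in bookings:
        on_books[arrival] += 1
        if p_cancel >= threshold:
            likely_cancelled[arrival] += 1
    return {d: on_books[d] - likely_cancelled[d] for d in sorted(on_books)}


# Hypothetical sample: (arrival date, predicted cancellation probability).
sample = [
    (date(2017, 12, 1), 0.10),
    (date(2017, 12, 1), 0.80),
    (date(2017, 12, 1), 0.30),
    (date(2017, 12, 2), 0.60),
    (date(2017, 12, 2), 0.20),
]
result = net_demand(sample)
```

Under these assumptions, each future date's net demand is simply the count of bookings on the books minus those classified as likely to cancel; a regression approach would instead forecast only the total number of cancellations per date.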


Data science, Hospitality industry, Feature Engineering, Machine Learning Prototype


Data Science


4th World Research Summit for Tourism and Hospitality, December 2017
