Forecasting occupancy in the hospitality business with autoregressive time-series models does not capture the occasional impact of public events. Our goal was to find appropriate datasets and enrich existing predictive models to account for rare and explicable demand surges. The paper proposes a processing framework: data source types and formats, and forecast algorithms based on natural language processing. The study shows that classical models using word collocations outperform state-of-the-art deep neural networks. Moreover, the collocations that turn out to be important occupy certain locations in a graph that represents the natural language. The findings may lead to further improved forecasts, smarter offer pricing and, ultimately, increased competitiveness in the hospitality business. They may also serve the public interest in areas such as parking management or public transport planning.

INDEX TERMS Time series analysis, recurrent neural networks, regression analysis, natural language processing, predictive models.