Popularity Prediction for Social Media over Arbitrary Time Horizons

International Conference on Very Large Data Bases (VLDB)


Predicting the popularity of social media content in real time requires approaches that operate efficiently at global scale. Popularity prediction matters for many applications, including detecting harmful viral content early enough for timely content moderation. The task is difficult because views result from interactions among user interests, content features, resharing, feed ranking, and network structure. We consider the problem of accurately predicting popularity at any prediction time after a content item’s creation and for arbitrary time horizons into the future. To achieve high accuracy across prediction time horizons, models must use static features (of content and user) as well as the popularity growth observed up to prediction time.

We propose a feature-based approach built on a self-exciting Hawkes point process model. It predicts the content’s popularity at one or more reference horizons in tandem with a point predictor of an effective growth parameter that reflects the timescale of popularity growth. The result is a highly scalable method for popularity prediction over arbitrary prediction time horizons that also achieves high accuracy, compared to several leading baselines, on a dataset of public page content on Facebook over a two-month period, covering billions of content views and hundreds of thousands of distinct content items. The model shows competitive prediction accuracy against a strong baseline that consists of separately trained models for specific prediction time horizons.
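
To make the idea of combining a reference-horizon prediction with a growth parameter concrete, here is a minimal sketch. It is not the paper's stated formula: the abstract does not specify the kernel or the interpolation rule. The sketch assumes a Hawkes process with an exponential kernel and no background rate, for which the expected intensity decays exponentially, so the expected number of additional views after a horizon h is proportional to 1 - exp(-alpha * h). The function name `predict_views` and all of its parameters are hypothetical.

```python
import math


def predict_views(n_observed, n_ref, h_ref, alpha, h):
    """Extrapolate a view count to an arbitrary horizon h (illustrative only).

    Assumption (not from the paper): expected additional views after a
    horizon h scale as 1 - exp(-alpha * h), as for an exponential-kernel
    Hawkes process with zero background rate.

    n_observed: views observed up to prediction time
    n_ref:      predicted total views at the reference horizon h_ref
    h_ref:      reference horizon, in the same time unit as h
    alpha:      effective growth parameter (inverse timescale), > 0
    h:          target horizon, >= 0
    """
    if h < 0 or h_ref <= 0 or alpha <= 0:
        raise ValueError("require h >= 0, h_ref > 0, alpha > 0")
    # Fraction of the reference-horizon increment reached at horizon h.
    # expm1(-x) = exp(-x) - 1, so the ratio equals
    # (1 - exp(-alpha*h)) / (1 - exp(-alpha*h_ref)).
    fraction = math.expm1(-alpha * h) / math.expm1(-alpha * h_ref)
    return n_observed + (n_ref - n_observed) * fraction


if __name__ == "__main__":
    # Hypothetical numbers: 1,000 views so far, 5,000 predicted at 24 hours.
    for horizon in (1.0, 6.0, 24.0, 72.0):
        print(horizon, predict_views(1000.0, 5000.0, 24.0, 0.1, horizon))
```

Under this assumption, a single model output at the reference horizon plus one scalar `alpha` is enough to produce predictions at any horizon, which is the property that makes separately trained per-horizon models unnecessary. By construction the sketch reproduces the reference prediction at `h == h_ref` and the observed count at `h == 0`.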
