Learning Efficient Interpretable Policies on Experimental Data

Conference on Digital Experimentation (CODE)

Abstract

In conventional A/B testing, users are randomly assigned to receive a treatment. Internet companies are increasingly using machine learning models for personalized experimentation instead, creating personalized policies that assign each user the treatment predicted to be best for that user.

A popular approach for personalized experimentation is to train heterogeneous treatment effect (HTE) models and then assign each user the treatment group with the highest predicted treatment effect for that user. However, there are several challenges with this approach. Firstly, when there are multiple (not just binary) outcomes, creating a policy is no longer as trivial as assigning the treatment group that maximizes a single treatment effect. Secondly, as the dataset size and the number of features used for HTE modeling increase, so does the cost of maintaining such models in a production environment. Finally, popular HTE models are black boxes: they not only consist of uninterpretable base learners such as gradient boosted decision trees, but also combine those learners in different ways to generate the treatment effect prediction. This may deter uptake of such models, especially for critical applications.
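The single-outcome assignment rule described above can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function name, the group labels, and the convention that the control group has an effect of zero (so it is chosen when no treatment is predicted to help) are all assumptions.

```python
from typing import Dict


def assign_treatment(predicted_effects: Dict[str, float], control: str = "control") -> str:
    """Pick the group with the highest predicted treatment effect for one user.

    `predicted_effects` maps each treatment group to the effect an HTE model
    predicts for this user, measured relative to control. Assumption: control
    has effect 0.0, so it wins when every predicted effect is non-positive.
    """
    best_group, best_effect = control, 0.0
    for group, effect in predicted_effects.items():
        if effect > best_effect:
            best_group, best_effect = group, effect
    return best_group


# Hypothetical predictions for two users.
user_1 = assign_treatment({"treatment_A": 0.10, "treatment_B": 0.30})
user_2 = assign_treatment({"treatment_A": -0.10, "treatment_B": -0.20})
```

With more than one outcome there is no single effect to maximize, which is the first challenge noted above.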

Here we focus on learning interpretable policies based on if-else rules over the user features. Figure 1 provides an example of an interpretable policy with three treatment groups, using only two features: age and gender. Such a policy is easy to implement in a production environment and avoids the need to maintain an HTE model.
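To show why such a policy is easy to deploy, here is a sketch of an if-else policy over age and gender with three treatment groups. The threshold, the group names, and the ordering of the rules are hypothetical and are not taken from Figure 1.

```python
def interpretable_policy(age: int, gender: str) -> str:
    """Assign one of three treatment groups using two user features.

    All thresholds and labels are made up for illustration only.
    """
    if age < 30:
        return "treatment_A"
    if gender == "female":
        return "treatment_B"
    return "treatment_C"


# Example assignments for three hypothetical users.
assignments = [
    interpretable_policy(24, "male"),
    interpretable_policy(41, "female"),
    interpretable_policy(41, "male"),
]
```

The whole policy is a few comparisons, so serving it requires no model artifact, feature pipeline, or retraining schedule.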

There are different ways to generate interpretable policies. We briefly summarize some of these approaches and highlight how the methods we propose contribute. Existing works construct tree policies based on counterfactual estimation. However, they focus on constructing exactly optimal trees, which is prohibitively slow at large scale. We consider methods that distill black-box heterogeneous treatment effect models, and methods that modify interpretable models such as decision trees to find segments with elevated treatment effects. We also propose two new approaches for ensembling multiple interpretable policies such that the result remains interpretable. For all the approaches we propose, we work in the general setting of more than two treatment groups and more than a single outcome, which is needed for real-world settings.
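The idea of distilling a black-box model into an interpretable rule can be illustrated with a deliberately small sketch. It is not the method proposed in the paper: it assumes a single numeric feature, takes the black-box policy's assignments as given "teacher" labels, and fits only a one-split rule by exhaustive search over candidate thresholds. The function name and the data are hypothetical.

```python
from collections import Counter
from typing import Sequence, Tuple


def distill_to_stump(
    feature: Sequence[float], teacher_labels: Sequence[str]
) -> Tuple[float, str, str]:
    """Fit the rule `left if x < threshold else right` to mimic teacher labels.

    Returns (threshold, left_label, right_label) for the split whose
    majority labels agree with the teacher on the most users.
    """
    pairs = list(zip(feature, teacher_labels))
    values = sorted(set(feature))
    best = None  # (agreement, threshold, left_label, right_label)
    # Candidate thresholds are midpoints between consecutive distinct values.
    for lo, hi in zip(values, values[1:]):
        threshold = (lo + hi) / 2
        left = Counter(label for x, label in pairs if x < threshold)
        right = Counter(label for x, label in pairs if x >= threshold)
        left_label, left_hits = left.most_common(1)[0]
        right_label, right_hits = right.most_common(1)[0]
        agreement = left_hits + right_hits
        if best is None or agreement > best[0]:
            best = (agreement, threshold, left_label, right_label)
    if best is None:
        raise ValueError("need at least two distinct feature values")
    return best[1], best[2], best[3]


# Hypothetical ages and the treatments a black-box policy assigned to them.
ages = [18, 22, 25, 40, 52, 60]
teacher = ["A", "A", "A", "B", "B", "B"]
rule = distill_to_stump(ages, teacher)
```

A greedy search like this trades exact optimality for speed, which is the kind of trade-off that matters when exactly optimal trees are too slow at scale.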
