Leveraging Test Plan Quality to Improve Code Review Efficacy

ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE)


In modern code reviews, many artifacts play roles in knowledge-sharing and documentation, including summaries, test plans, and comments. Improving developer tools and facilitating better code reviews require an understanding of the quality of pull requests and their artifacts. That quality is difficult to measure, however, because the artifacts are often free-form natural language and unstructured text data. In this paper, we focus on measuring the quality of test plans at Meta. Test plans are used as a communication mechanism between the author of a pull request and its reviewers, serving as walkthroughs to help confirm that the changed code is behaving as expected. We collected developer opinions on over 650 test plans from more than 500 Meta developers, then introduced a transformer-based model to bring the success of natural language processing (NLP) techniques to the code review domain. In our study, we show that the learned model is able to capture the sentiment of developers and reflect a correlation of test plan quality with review engagement and reversions. Compared to a decision tree model, our proposed transformer-based model achieves a 7% higher F1-score. Finally, we present a case study of how such a metric may be useful in experiments to inform improvements in developer tools and experiences.
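For readers less familiar with the F1-score used in the model comparison above, the following is a minimal sketch of how it is computed for a binary "good test plan" label. The label vectors and the names `baseline_pred` and `candidate_pred` are hypothetical and chosen only for illustration; they are not data or results from the paper.

```python
def f1_score(y_true, y_pred, positive=1):
    """Return the F1-score (harmonic mean of precision and recall)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are both zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical developer ratings (1 = good test plan, 0 = poor test plan).
y_true = [1, 1, 1, 0, 0, 1, 0, 1]
# Hypothetical predictions from two classifiers being compared.
baseline_pred = [1, 0, 1, 1, 0, 0, 0, 1]
candidate_pred = [1, 1, 1, 1, 0, 0, 0, 1]

baseline_f1 = f1_score(y_true, baseline_pred)
candidate_f1 = f1_score(y_true, candidate_pred)
print(f"baseline F1:  {baseline_f1:.3f}")
print(f"candidate F1: {candidate_f1:.3f}")
```

Because F1 balances precision against recall, it is a common choice when the two classes are not equally frequent, which is plausible for human-rated quality labels.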
