Code Quality Prediction Under Super Extreme Class Imbalance

IEEE International Symposium on Software Reliability Engineering (ISSRE)

Abstract

Predicting software quality in the early phases of the development life cycle benefits an organization’s bottom line and applies widely across industry and government. Yet developing robust software quality prediction models in practice is challenging because of “super” extreme class imbalance. In this paper, we present a code quality prediction framework, which we call Automated Incremental Effort Investments (AIEI), that speeds up the process of going from data to a performant model under super extreme class imbalance. Experiments on a large-scale real-world dataset from Meta Platforms show that the proposed approach competes with or outperforms state-of-the-art shallow and deep learning approaches. We evaluate the practical significance of the model predictions on test case prioritization efficiency, where AIEI ranks first, reducing code review time by 2.5% and test case resource utilization by 9.3%.
