Explore the latest research from Meta

All Publications

31 March 2021
Jie M. Zhang, Mark Harman

Machine learning software can be unfair when making human-related decisions, exhibiting prejudice against certain groups of people. Existing work primarily focuses on proposing fairness metrics and presenting fairness improvement approaches. It remains unclear how key aspects of any machine learning system, such as the feature set and training data, affect fairness. This paper presents results from a comprehensive study that addresses this problem.

31 March 2021
Wei Ma, Thierry Titcheu Chekam, Mike Papadakis, Mark Harman

We introduce MuDelta, an approach that identifies commit-relevant mutants: mutants that affect and are affected by the changed program behaviors. Our approach applies machine learning to a combined scheme of graph and vector-based representations of static code features. Our results, from 50 commits in 21 Coreutils programs, demonstrate the strong predictive ability of our approach, yielding AUC values of 0.80 (ROC) and 0.50 (PR curve), with precision of 0.63 and recall of 0.32.

31 March 2021
Giovani Guizzo, Justyna Petke, Federica Sarro, Mark Harman

Genetic improvement uses artificial intelligence to automatically improve software with respect to non-functional properties (AI for SE). In this paper, we propose the use of existing software engineering best practice to enhance Genetic Improvement (SE for AI). We conjecture that existing Regression Test Selection (RTS) techniques (which have been proven to be efficient and effective) can and should be used as a core component of the GI search process for maximizing its effectiveness.

6 December 2020
Theofanis Karaletsos, Thang D. Bui

This paper introduces two innovations: (i) a Gaussian process-based hierarchical model for network weights based on unit priors that can flexibly encode correlated weight structures, and (ii) input-dependent versions of these weight priors that can provide convenient ways to regularize the function space through the use of kernels defined on contextual inputs.

23 November 2020
José Cambronero, Hongyu Li, Seohyun Kim, Koushik Sen, Satish Chandra

The goal of this supervision is to produce embeddings that are more similar for a query and the corresponding desired code snippet. Clearly, there are choices in whether to use supervised techniques at all, and if one does, what sort of network and training to use for supervision. This paper is the first to evaluate these choices systematically.
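As a rough illustration of what "more similar" embeddings mean for code search, the sketch below ranks code snippets by cosine similarity to a query embedding. The vectors, the snippet names, and the helpers `cosine_similarity` and `rank_snippets` are hypothetical and not taken from the paper; a real system would obtain the embeddings from a trained model.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def rank_snippets(query_vec, snippet_vecs):
    """Return snippet names ordered from most to least similar to the query."""
    scored = [(cosine_similarity(query_vec, vec), name)
              for name, vec in snippet_vecs.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [name for _, name in scored]


# Toy embeddings, made up for illustration only.
query = [1.0, 0.0, 1.0]
snippets = {
    "read_file": [0.9, 0.1, 0.8],
    "sort_list": [0.0, 1.0, 0.1],
}
ranking = rank_snippets(query, snippets)
print(ranking)
```

Under this view, supervision would aim to move the embedding of the desired snippet closer to the query embedding, so that it rises in the ranking.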

23 November 2020
Mateusz Machalica, Alex Samylkin, Meredith Porth, Satish Chandra

Change-based testing is a key component of continuous integration at Facebook. However, a large number of tests coupled with a high rate of changes committed to our monolithic repository make it infeasible to run all potentially impacted tests on each change. We propose a new predictive test selection strategy which selects a subset of tests to exercise for each change submitted to the continuous integration system.
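The excerpt does not say how the subset is chosen. One plausible reading, sketched below purely as an assumption, is that a model assigns each test a predicted probability of failing on the change, and only tests above a threshold are run. The function `select_tests`, the threshold value, and the test names are all hypothetical.

```python
def select_tests(failure_probs, threshold=0.1, max_tests=None):
    """Pick tests whose predicted failure probability meets the threshold.

    failure_probs maps a test name to a predicted probability (0..1) that
    the test fails on the change. How these scores are produced is outside
    this sketch; they are assumed to come from some trained model.
    """
    chosen = [name for name, prob in failure_probs.items() if prob >= threshold]
    # Order by predicted probability so likely failures run first.
    chosen.sort(key=lambda name: failure_probs[name], reverse=True)
    if max_tests is not None:
        chosen = chosen[:max_tests]
    return chosen


# Made-up scores for a single change.
scores = {"test_login": 0.42, "test_feed": 0.03, "test_upload": 0.17}
selected = select_tests(scores, threshold=0.1)
print(selected)
```

Raising the threshold trades missed failures for fewer test runs, which is the basic tension any such selection strategy has to manage.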

23 November 2020
Jie Zhang, Lingming Zhang, Mark Harman, Dan Hao, Yue Jia, Lu Zhang

In mutation testing, a large number of mutants may be generated, and each must be executed against the test suite under evaluation to determine how many mutants the suite detects and which kinds of mutants it fails to detect. Consequently, although highly effective, mutation testing is also widely recognized to be computationally expensive, which inhibits wider uptake. To alleviate this efficiency concern, we propose Predictive Mutation Testing (PMT): the first approach to predicting mutation testing results without executing mutants.
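For context, the quantity that mutation testing reports can be sketched as follows: given a record of which mutants the test suite detected, compute the fraction detected and list the survivors. The helpers `mutation_score` and `surviving_mutants` and the mutant identifiers are hypothetical. In conventional mutation testing the `True`/`False` values come from actually running the tests against each mutant, which is the expensive step that a predictive approach aims to avoid.

```python
def mutation_score(kill_results):
    """Fraction of mutants detected (killed) by the test suite."""
    if not kill_results:
        return 0.0
    killed = sum(1 for was_killed in kill_results.values() if was_killed)
    return killed / len(kill_results)


def surviving_mutants(kill_results):
    """Mutants the test suite failed to detect, in sorted order."""
    return sorted(m for m, was_killed in kill_results.items() if not was_killed)


# Made-up outcome: True means at least one test failed on that mutant.
results = {"m1": True, "m2": False, "m3": True, "m4": True}
score = mutation_score(results)
survivors = surviving_mutants(results)
print(score, survivors)
```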

17 October 2020
Sourabh Kulkarni, Kinjal Divesh Shah, Nimar Arora, Xiaoyan Wang, Yucen Lily Li, Nazanin Khosravani Tehrani, Michael Tingley, David Noursi, Narjes Torabi, Sepehr Akhavan Masouleh, Eric Lippert, Erik Meijer

We introduce PPL Bench, a new benchmark for evaluating Probabilistic Programming Languages (PPLs) on a variety of statistical models. The benchmark includes data generation and evaluation code for a number of models as well as implementations in some common PPLs.

17 July 2020
Jessica Ai, Beliz Gokkaya, Ilknur Kaynar Kabul, Audrey Flower, Ehsan Emamjomeh-Zadeh, Hannah Li, Li Chen, Neamah Hussein, Ousmane Dia, Sevi Baltaoglu, Erik Meijer

Characterizing the confidence of machine learning predictions unlocks models that know when they do not know. In this study, we propose a framework for assessing the quality of predictive distributions obtained using deep learning models.