Publications - Meta Research

All Publications

March 31, 2021

Giovani Guizzo, Justyna Petke, Federica Sarro, Mark Harman

Enhancing Genetic Improvement of Software with Regression Test Selection

Genetic improvement (GI) uses artificial intelligence to automatically improve software with respect to non-functional properties (AI for SE). In this paper, we propose the use of existing software engineering best practices to enhance GI (SE for AI). We conjecture that existing Regression Test Selection (RTS) techniques (which have been proven to be efficient and effective) can and should be used as a core component of the GI search process to maximize its effectiveness.

Areas

Machine Learning, Systems & Infrastructure

March 31, 2021

Wei Ma, Thierry Titcheu Chekam, Mike Papadakis, Mark Harman

MuDelta: Delta-Oriented Mutation Testing at Commit Time

We introduce MuDelta, an approach that identifies commit-relevant mutants: mutants that affect and are affected by the changed program behaviors. Our approach uses machine learning applied to a combined scheme of graph- and vector-based representations of static code features. Our results, from 50 commits in 21 Coreutils programs, demonstrate the strong predictive ability of our approach, yielding AUC values of 0.80 (ROC) and 0.50 (PR curve), with precision and recall of 0.63 and 0.32, respectively.
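
For readers less familiar with the metrics reported here, the following is a minimal sketch of how precision and recall are computed for a binary predictor such as a commit-relevant mutant classifier. The helper `precision_recall` and the label lists are hypothetical, made up for illustration; they are not data or code from the paper.

```python
def precision_recall(y_true, y_pred):
    """Return (precision, recall) for binary labels, where 1 = commit-relevant."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # correctly flagged
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # flagged but irrelevant
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # relevant but missed
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Invented example: 4 relevant mutants, 3 predicted relevant, 2 of them correct.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0, 0, 0]
precision, recall = precision_recall(y_true, y_pred)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Precision is the fraction of predicted-relevant mutants that really are relevant; recall is the fraction of truly relevant mutants that the predictor finds.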

Areas

Machine Learning, Systems & Infrastructure

March 31, 2021

Jie M. Zhang, Mark Harman

“Ignorance and Prejudice” in Software Fairness

It remains unclear how key aspects of any machine learning system, such as the feature set and training data, affect fairness. This paper presents results from a comprehensive study that...

Areas

Machine Learning, Systems & Infrastructure

February 17, 2021

Jie M. Zhang, Mark Harman, Lei Ma, Yang Liu

Machine Learning Testing: Survey, Landscapes and Horizons

This paper provides a comprehensive survey of techniques for testing machine learning systems, that is, of Machine Learning Testing (ML testing) research.

Areas

Artificial Intelligence, Machine Learning, Systems & Infrastructure

December 6, 2020

Thang D. Bui

Hierarchical Gaussian Process Priors for Bayesian Neural Network Weights

This paper introduces two innovations: (i) a Gaussian process-based hierarchical model for network weights based on unit priors that can flexibly encode correlated weight structures, and (ii) input-dependent versions of these weight priors that can provide convenient ways to regularize the function space through the use of kernels defined on contextual inputs.

Areas

Artificial Intelligence, Machine Learning

November 23, 2020

José Cambronero, Hongyu Li, Seohyun Kim, Koushik Sen, Satish Chandra

When Deep Learning Met Code Search

The goal of this supervision is to produce embeddings that are more similar for a query and the corresponding desired code snippet. Clearly, there are choices in whether to use supervised techniques at all, and if one does, what sort of network and training to use for supervision. This paper is the first to evaluate these choices systematically.

November 23, 2020

Jie Zhang, Lingming Zhang, Mark Harman, Dan Hao, Yue Jia, Lu Zhang

Predictive Mutation Testing

In mutation testing, a large number of mutants may be generated and need to be executed against the test suite under evaluation to check how many mutants the test suite is able to detect, as well as the kinds of mutants that the current test suite fails to detect. Consequently, although highly effective, mutation testing is also widely recognized to be computationally expensive, inhibiting wider uptake. To alleviate this efficiency concern, we propose Predictive Mutation Testing (PMT): the first approach to predicting mutation testing results without executing mutants.

November 23, 2020

Mateusz Machalica, Alex Samylkin, Meredith Porth, Satish Chandra

Predictive Test Selection

Change-based testing is a key component of continuous integration at Facebook. However, a large number of tests coupled with a high rate of changes committed to our monolithic repository make it infeasible to run all potentially impacted tests on each change. We propose a new predictive test selection strategy which selects a subset of tests to exercise for each change submitted to the continuous integration system.
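
The idea of selecting a subset of tests per change can be sketched roughly as below. This is an illustrative simplification and not the system described in the paper: it assumes some model has already produced a failure probability for each candidate test, and the function name `select_tests`, the threshold value, and the test names are all hypothetical.

```python
def select_tests(failure_probability, threshold=0.1):
    """Pick the tests whose predicted failure probability meets the threshold.

    failure_probability: dict mapping test name -> probability in [0, 1],
    assumed to come from a previously trained model (not shown here).
    """
    return sorted(
        name for name, prob in failure_probability.items() if prob >= threshold
    )


# Hypothetical scores for one submitted change.
scores = {
    "test_login": 0.42,
    "test_feed_render": 0.03,
    "test_payments": 0.15,
    "test_search": 0.01,
}
selected = select_tests(scores)
print(selected)
```

Raising the threshold runs fewer tests at the risk of missing failures; lowering it does the opposite.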

October 17, 2020

Sourabh Kulkarni, Kinjal Divesh Shah, Nimar Arora, Xiaoyan Wang, Yucen Lily Li, Nazanin Khosravani Tehrani, Michael Tingley, David Noursi, Narjes Torabi, Sepehr Akhavan Masouleh, Erik Meijer

PPL Bench: Evaluation Framework For Probabilistic Programming Languages

We introduce PPL Bench, a new benchmark for evaluating Probabilistic Programming Languages (PPLs) on a variety of statistical models. The benchmark includes data generation and evaluation code for a number of models as well as implementations in some common PPLs.

Areas

Artificial Intelligence, Machine Learning

July 17, 2020

Jessica Ai, Beliz Gokkaya, Ilknur Kaynar Kabul, Audrey Flower, Ehsan Emamjomeh-Zadeh, Hannah Li, Li Chen, Neamah Hussein, Ousmane Dia, Sevi Baltaoglu, Erik Meijer

A Simulation-based Framework for Characterizing Predictive Distributions for Deep Learning

Characterizing the confidence of machine learning predictions unlocks models that know when they do not know. In this study, we propose a framework for assessing the quality of predictive distributions obtained using deep learning models.

Areas

Machine Learning