September 24, 2021

Announcing the winners of the 2021 Next-generation Data Infrastructure request for proposals

By: Meta Research

In April 2021, Facebook launched the Next-generation Data Infrastructure request for proposals (RFP). Today, we’re announcing the winners of this award.


The Facebook Core Data and Data Infra teams were interested in proposals that sought innovative solutions to open challenges in the data management community. Areas of interest included, but were not limited to, the following topics:

  • Large-scale query processing
  • Physical layout and IO optimizations
  • Data management and processing at a global scale
  • Converged architectures for data wrangling, machine learning, and analytics
  • Advances in testing and verification for storage and processing systems

Read our Q&A with database researchers Stavros Harizopoulos and Shrikanth Shankar to learn more about database research at Facebook and the goals and inspiration behind this RFP.

The team reviewed 109 high-quality proposals, and we are pleased to announce the 10 winning proposals and six finalists. Thank you to everyone who took the time to submit a proposal, and congratulations to the winners.

Research award recipients

Holistic optimization for parallel query processing

Paraschos Koutris (University of Wisconsin–Madison)

SCALER – SCalAbLe vEctor pRocessing of SPJG-Queries

Wolfgang Lehner, Dirk Habich (Technische Universität Dresden)

AnyScale transactions in the cloud

Natacha Crooks, Joe Hellerstein (University of California, Berkeley)

Proudi: Predictability on unpredictable data infrastructure

Haryadi S. Gunawi (University of Chicago)

Making irregular partitioning practical

Spyros Blanas (The Ohio State University)

Dynamic join processing pushdown in Presto

Daniel Abadi, Chujun Song (University of Maryland, College Park)

A learned persistent key-value store

Tim Kraska (Massachusetts Institute of Technology)

Building global-scale systems using a flexible consensus substrate

Faisal Nawab (University of California, Irvine)

Runtime-optimized analytics using compilation hints

Anastasia Ailamaki (Swiss Federal Institute of Technology Lausanne)

Flexible scheduling for machine learning data processing close to storage

Ana Klimovic, Damien Aymon (ETH Zurich)

Finalists

Next generation data provenance/data governance

Tim Kraska, Michael Cafarella, Michael Stonebraker (Massachusetts Institute of Technology)

Optimizing commitment latency for geo-distributed transactions

Xiangyao Yu (University of Wisconsin–Madison)

Semantic optimization of recursive queries

Dan Suciu (University of Washington)

Towards a disaggregated database for future data centers

Jianguo Wang (Purdue University)

Unified data systems for structured and unstructured data

Matei Zaharia, Christos Kozyrakis (Stanford University)

Unifying machine learning and analytics under a single data engine

Stratos Idreos (Harvard University)