On April 19, 2021, Facebook launched a request for proposals (RFP) on next-generation data infrastructure. With this RFP, which closes on June 2, 2021, the Facebook Core Data and Data Infra teams hope to deepen their ties to the academic research community by seeking out innovative solutions to the open challenges in data management. To provide an inside look from the teams behind the RFP, we reached out to Stavros Harizopoulos and Shrikanth Shankar, who are leading the effort within their respective teams.
Shankar is a Director of Engineering on the Core Data team, which builds and supports the online data serving stack for Facebook, providing the databases, caches, and worldwide distribution that power Facebook, Instagram, WhatsApp, and more. Harizopoulos is a Software Engineer within Data Infrastructure, which delivers efficient platforms and end-user tools for the collection, management, and analysis of Facebook data. In this Q&A, Shankar and Harizopoulos contextualize the RFP by providing more background on database research at Facebook. They also discuss what inspired this RFP and where people can stay updated about what their teams are up to.
Q: What does database research look like at Facebook, and how has it evolved over the years?
A: Facebook has a long history of contributing to the database space: Hive, Presto, RocksDB, and MyRocks are all examples of innovative work that started within the company. The scale we run at and the unique constraints of our workloads make many existing solutions infeasible and provide a perspective that leads to new ideas. This has become increasingly true over the years as the company has grown and new challenges associated with this scale have shown up. We aspire to continue our tradition of building new, innovative database technologies.
Q: What’s the goal of this RFP?
A: As businesses and organizations become increasingly data-driven, and as products and services are increasingly built around intelligence derived from data, the need for highly reliable, flexible, and efficient data infrastructure grows. Modern data infrastructure architectures inherit from decades of database research. However, recent trends such as the decoupling of compute and storage and the need to operate efficiently at global scale, together with emerging use cases such as data science and machine learning workloads, pose new challenges and opportunities.
With this RFP, we seek out innovative approaches to a number of problems that have the potential to set the defining characteristics of next-generation data infrastructure. Many of these problems are not unique to Facebook, and we are keen to learn about the great research done in this area as well as to strengthen our relations with academia.
Q: How does this RFP fit into the bigger picture for database research at Facebook?
A: Defining the underpinnings of data infrastructure that is reliable, resilient, flexible, efficient, and performant at global scale is at the core of database research at Facebook. Our research efforts, however, extend in several directions across modeling, managing, and visualizing different types of data, ranging from structured data to machine-generated logs and time-series data. We innovate in areas such as data storage and indexing, query processing, data modeling, transaction processing, and distributed systems, as well as novel approaches to privacy and security in data management.
Q: What inspired this RFP?
A: We share our experiences by publishing papers and, in turn, benefit from all the innovation in the database space, but we’ve seen a couple of ways this exchange could be better. Concretely, certain areas may not be perceived externally as impactful or important even when they are critical for us. On our side, we recognize that the solutions we have in place or are considering may be limited by our specific systems and the history behind them. We began this RFP process as a way to collaborate with academia by highlighting specific problems and looking for innovative approaches that tackle them.
Q: Where can people stay updated and learn more?
A: We actively participate each year in major database conferences, such as ICDE, SIGMOD, VLDB, and CIDR. This is where the academic community can reach out to us with questions and ideas. We also contribute a lot of our work through open source. Here are some examples:
—
Applications for the Next-Generation Data Infrastructure RFP close on June 2, 2021, and winners will be announced the following month. To receive updates about new research award opportunities and deadline notifications, subscribe to our RFP email list.