All around the world, businesses and organizations are becoming increasingly data-driven, products and services are built more and more around intelligence derived from data, and the need for reliable and efficient data storage and processing at a global scale is becoming even more critical. Modern data infrastructure architectures have emerged from years of evolution in analytical and transactional data systems, along with a continuous infusion of capabilities stemming from new use cases and new data processing paradigms. Tightly coupled data warehouses are being replaced by more flexible ecosystems built around low-cost globally available storage and open file formats; data science and machine learning workloads are increasingly sharing the same infrastructure as analytical workloads; transactional systems and key-value stores are exploring ways to preserve consistency, reliability, and performance while operating efficiently at global scale. Yet, despite all these efforts and progress, many challenges remain as the data management community seeks out the defining characteristics of next-generation data infrastructure.
Facebook has a long history of contributing to the data management space: Hive, Presto, RocksDB, and MyRocks are all examples of innovative work that started within the company. The scale at which we run and the unique constraints of our workloads make many existing solutions infeasible and provide a perspective that leads to new ideas. As we continue to build and evolve our data infrastructure, we are focused on a number of problems. These range from techniques to optimize CPU usage (and thus power consumption) during large-scale query processing to strategies to optimize physical layouts and data transfer bandwidth, and from techniques to address the challenges arising from data storage and processing across widely separated data centers to novel approaches for converging data wrangling, machine learning, and analytics. Since guaranteeing correctness is a key requirement for our data storage and processing systems, we also remain focused on systems for testing and verification. Despite the unique constraints of our workloads, many of these problems are common across the industry, and we believe there is a lot to be gained by collaborating with academia in this area.
To foster further innovation in this area, and to deepen our collaboration with academia, Facebook is pleased to invite faculty to respond to this call for research proposals pertaining to the aforementioned topics. We anticipate awarding a total of 10 awards, each in the $50,000 range. Payment will be made to the proposer’s host university as an unrestricted gift. In addition, PIs and Co-PIs on the winning proposals will be automatically granted access to CrowdTangle, a public insights tool from Facebook that makes it easy to follow, analyze, and report on what’s happening with public content on social media. Learn more about CrowdTangle here.
University of Maryland, College Park
Swiss Federal Institute of Technology Lausanne
The Ohio State University
University of California, Berkeley
University of Chicago
ETH Zurich
University of Wisconsin–Madison
Massachusetts Institute of Technology
Technische Universität Dresden
University of California, Irvine
Applications Are Currently Closed
Areas of interest include, but are not limited to, the following:
Data processing at scale imposes substantial CPU and power challenges on Facebook’s data centers. We are interested in techniques that can optimize CPU usage during common data processing pipelines, including, but not limited to, the following:
Large-scale decoupled data systems make heavy use of IO when transferring data from storage to compute nodes, and from permanent media to main memory. We are looking for innovative strategies and techniques that can reduce the amount of data transferred during data processing pipelines, including, but not limited to, the following:
Data storage and processing across widely separated data centers presents a different set of challenges. We are interested in techniques that address problems caused by increased latency, resource constraints such as network bottlenecks, and heterogeneous hardware. Areas include, but are not limited to, the following:
Decoupling compute from storage, and using low-cost storage based on open file formats to hold everything from raw to fully curated data across a wide variety of use cases, has led to the need to rethink many areas of data management, including, but not limited to, the following:
Guaranteeing correctness is a key requirement for our data storage and processing systems. We are looking for advances in systems to test and verify that these systems perform correctly and within spec when change (e.g., new code, faults, new hardware) is introduced. Areas include, but are not limited to, the following:
Most RFP awards are unrestricted gifts. Because of this, salary/headcount can be included in the budget presented for the RFP. Since the award/gift is paid to the university, the university can allocate the funds to the winning project and is free to use them as needed. All Facebook teams are different and have different expectations concerning deliverables, timing, etc. Long story short: yes, money for salary/headcount can be included. It’s up to the reviewing team to determine whether the percentage spend is reasonable and how that affects the decision on whether the project wins.
We are flexible, but ideally, submitted proposals are single-spaced and set in 12 pt Times New Roman.
Research awards are given year-round and funding years/duration can vary by proposal.
Yes, award funds can be used to cover a researcher’s salary.
Budgets can vary by institution and geography, but overall research funds ideally cover the following: graduate or post-graduate students’ employment/tuition; other research costs (e.g., equipment, laptops, incidental costs); and travel associated with the research (conferences, workshops, summits, etc.). Overhead for research gifts is limited to 5%.
One person will need to be the primary PI (i.e., the submitter who will receive all email notifications); however, you’ll be given the opportunity to list collaborators/co-PIs in the submission form. Please note in your budget breakdown how the funds should be distributed among the PIs.
Facebook’s decisions will be final in all matters relating to Facebook RFP solicitations, including whether or not to grant an award and the interpretation of Facebook RFP Terms and Conditions. By submitting a proposal, applicants affirm that they have read and agree to these Terms and Conditions.