Publications - Meta Research

December 16, 2013

Léon Bottou, Jonas Peters, Joaquin Quiñonero Candela, Denis Charles, Max Chickering, Elon Portugaly, Dipankar Ray, Patrice Simard, Ed Snelson

Paper

Counterfactual Reasoning and Learning Systems: The Example of Computational Advertising

This work shows how to leverage causal inference to understand the behavior of complex learning systems interacting with their environment and predict the consequences of changes to the system. Such p...

Areas

Data Science, Machine Learning

December 8, 2013

Ankur Gandhe, Long Qin, Florian Metze, Alexander Rudnicky, Ian Lane, Matthias Eck

Paper

Using Web Text to Improve Keyword Spotting in Speech

For low resource languages, collecting sufficient training data to build acoustic and language models is time consuming and often expensive. In this paper, we investigate the use of online text resour...

Areas

Machine Learning, Natural Language Processing & Speech

November 4, 2013

Qi Huang, Ken Birman, Robbert van Renesse, Wyatt Lloyd, Sanjeev Kumar, Harry Li

Paper

An Analysis of Facebook Photo Caching

This paper examines the workload of Facebook’s photo-serving stack and the effectiveness of the many layers of caching it employs. Facebook’s image-management infrastructure is complex and geographically distributed. It includes browser caches on end-user systems, Edge Caches at ~20 PoPs, an Origin Cache, and for some kinds of images, additional caching via Akamai. The underlying image storage layer is widely distributed, and includes multiple data centers.

Areas

Systems & Infrastructure

October 1, 2013

Wenfei Wu, Guohui Wang, Aditya Akella, Anees Shaikh

Paper

Virtual Network Diagnosis as a Service

Today’s cloud network platforms allow tenants to construct sophisticated virtual network topologies among their VMs on a shared physical network infrastructure. However, these platforms provide little...

Areas

Systems & Infrastructure

August 27, 2013

Lior Abraham, John Allen, Oleksandr Barykin, Vinayak Borkar, Bhuwan Chopra, Ciprian Gerea, Dan Merl, Josh Metzler, David Reiss, Subbu Subramanian, Janet Wiener, Okay Zed

Paper

Scuba: Diving into Data at Facebook

Facebook takes performance monitoring seriously. Performance issues can impact over one billion users so we track thousands of servers, hundreds of PB of daily network traffic, hundreds of daily code...

Areas

Systems & Infrastructure

August 26, 2013

Mike Curtiss, Iain Becker, Tudor Bosman, Sergey Doroshenko, Lucian Adrian Grijincu, Tom Jackson, Sandhya Kunnatur, Soren Lassen, Philip Pronin, Sriram Sankar, Guanghao Shen, Gintaras Woss, Chao Yang, Ning Zhang

Paper

Unicorn: A System for Searching the Social Graph

Unicorn is an online, in-memory social graph-aware indexing system designed to search trillions of edges between tens of billions of users and entities on thousands of...

Areas

Systems & Infrastructure

August 26, 2013

Maheshwaran Sathiamoorthy, Megasthenis Asteris, Dimitris Papailiopoulos, Alexandros G. Dimakis, Ramkumar Vadali, Scott Chen, Dhruba Borthakur

Paper

XORing Elephants: Novel Erasure Codes for Big Data

Distributed storage systems for large clusters typically use replication to provide reliability. Recently, erasure codes have been used to reduce the large storage overhead of three-replicated systems. Reed-Solomon codes are the standard design choice and their high repair cost is often considered an unavoidable price to pay for high storage efficiency and high reliability.
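As a rough illustration of the trade-off this abstract describes, the sketch below compares three-way replication with a single XOR parity block, the simplest possible erasure code. It is not the Reed-Solomon code or the new codes from the paper; the block contents and the helper names `xor_blocks` and `repair` are hypothetical.

```python
from functools import reduce

def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

def repair(surviving_blocks, parity):
    """Rebuild the single missing data block from the survivors and the parity."""
    return xor_blocks(surviving_blocks + [parity])

# Hypothetical data: k = 4 data blocks protected by 1 XOR parity block.
data = [b"abcd", b"efgh", b"ijkl", b"mnop"]
parity = xor_blocks(data)

# Storage overhead: extra blocks stored per block of user data.
replication_overhead = (3 * len(data) - len(data)) / len(data)  # three copies
parity_overhead = 1 / len(data)                                 # one parity per 4 blocks

# Lose one block, then repair it. The repair has to read every surviving
# block plus the parity, whereas replication would read a single copy.
lost = 2
survivors = data[:lost] + data[lost + 1:]
rebuilt = repair(survivors, parity)
```

In this toy setting the parity scheme stores far less redundant data than replication, but a repair touches all remaining blocks. That is a small-scale analogue of the storage-versus-repair-cost tension the abstract attributes to Reed-Solomon codes.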

Areas

Systems & Infrastructure

August 22, 2013

Qifan Wang, Dan Zhang, Luo Si

Paper

Weighted Hashing for Fast Large Scale Similarity Search

Similarity search, or finding approximate nearest neighbors, is an important technique for many applications. Much recent research demonstrates that hashing methods can achieve promising results for large scale similarity search due to their computational and memory efficiency.

Areas

Machine Learning

August 22, 2013

Xianglong Liu, Junfeng He, Bo Lang

Paper

Reciprocal Hash Tables for Nearest Neighbor Search

Recent years have witnessed the success of hashing techniques in approximate nearest neighbor search. In practice, multiple hash tables are usually employed to retrieve more of the desired results from the hit buckets of each table. However, few works study a unified approach to constructing multiple informative hash tables, apart from the widely used random construction.
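For readers unfamiliar with the baseline this abstract mentions, the sketch below shows one common form of the random construction: several independent tables, each built from random hyperplanes, with a query returning the union of its hit buckets. It is an assumption-laden illustration, not the reciprocal hash tables proposed in the paper; the function names, the choice of random-hyperplane hashing, and the sample points are all hypothetical.

```python
import random

def make_planes(dim, n_bits, rng):
    """One table is defined by n_bits random hyperplanes (Gaussian normals)."""
    return [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_bits)]

def hash_key(planes, x):
    """Sign pattern of x against each hyperplane, as a tuple of 0/1 bits."""
    return tuple(
        int(sum(p_i * x_i for p_i, x_i in zip(p, x)) >= 0) for p in planes
    )

def build_index(points, n_tables, n_bits, seed=0):
    """Build n_tables independent hash tables over the given points."""
    rng = random.Random(seed)
    dim = len(points[0])
    index = []
    for _ in range(n_tables):
        planes = make_planes(dim, n_bits, rng)
        buckets = {}
        for i, x in enumerate(points):
            buckets.setdefault(hash_key(planes, x), []).append(i)
        index.append((planes, buckets))
    return index

def query(index, q):
    """Candidate set: union of the buckets that q hits in each table."""
    hits = set()
    for planes, buckets in index:
        hits.update(buckets.get(hash_key(planes, q), []))
    return hits

# Hypothetical 2-D points; a real system would rank the candidates afterwards.
points = [[1.0, 0.0], [0.9, 0.1], [-1.0, 0.0], [0.0, 1.0]]
index = build_index(points, n_tables=4, n_bits=3)
candidates = query(index, points[0])
```

Because each table is drawn independently at random, nothing prevents two tables from carrying largely redundant information, which is the gap the abstract points to.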

Areas

Machine Learning

August 11, 2013

Johan Ugander, Brian Karrer, Lars Backstrom, Jon Kleinberg

Paper

Graph Cluster Randomization: Network Exposure to Multiple Universes

A drawback of A/B testing is that it is poorly suited for experiments involving social interference, in which the treatment of individuals spills over to neighboring individuals along an underlying social network. In this work, we propose a novel methodology using graph clustering to analyze average treatment effects under social interference.

Areas

Data Science
