Experts in the research fields of data science, data mining, knowledge discovery, large-scale data analytics, and big data are gathering in Anchorage, Alaska, this week for the 25th annual conference on Knowledge Discovery and Data Mining (KDD) to present the latest interdisciplinary advances in these fields.
Research from Facebook will be presented in oral talks and poster sessions. Facebook researchers and engineers will also be organizing and participating in workshops throughout the week.
Yina Tang, Fedor Borisyuk, Siddarth Malreddy, Yixuan Li, Yiqun Liu, and Sergey Kirshner
In this paper we present MSURU, a deployed image recognition system used in a large-scale commerce search engine. It is designed to process product images uploaded daily to Facebook Marketplace. Social commerce is a growing area within Facebook, and understanding visual representations of product content is important for search and recommendation applications on Marketplace. We present the techniques we used to develop efficient large-scale image classifiers using weakly supervised search log data. We extensively evaluate these techniques, share practical experience from developing large-scale classification systems, and discuss the challenges we faced. MSURU outperformed the current state-of-the-art system developed at Facebook by 16% in the e-commerce domain. MSURU is deployed to production with significant improvements in search success rate and active interactions on Facebook Marketplace.
Drew Dimmery, Eytan Bakshy, and Jasjeet Sekhon
We develop and analyze empirical Bayes Stein-type estimators for use in the estimation of causal effects in large-scale online experiments. While online experiments are generally thought to be distinguished by their large sample size, we focus on the multiplicity of treatment groups. The typical analysis practice is to use simple differences-in-means (perhaps with covariate adjustment) as if all treatment arms were independent. In this work we develop consistent, small-bias shrinkage estimators for this setting. In addition to achieving lower mean squared error, these estimators retain important frequentist properties such as coverage under most reasonable scenarios. Modern sequential methods of experimentation and optimization, such as multi-armed bandit optimization (where treatment allocations adapt over time to prior responses), benefit from the use of our shrinkage estimators. Exploration under empirical Bayes focuses more efficiently on near-optimal arms, improving the resulting decisions made under uncertainty. We demonstrate these properties by examining seventeen large-scale experiments conducted on Facebook from April to June 2017.
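For readers unfamiliar with Stein-type shrinkage, the general idea can be sketched with the textbook positive-part James-Stein estimator, which pulls each arm's mean toward the grand mean. This is only an illustration, not the estimator developed in the paper: the function name `james_stein_shrink`, the assumption of a single known sampling variance shared by all arms, and the example numbers are all hypothetical.

```python
from statistics import fmean


def james_stein_shrink(means, sampling_var):
    """Shrink per-arm means toward the grand mean (positive-part James-Stein).

    Assumes every arm mean has the same, known sampling variance; this is a
    simplification for illustration, not the paper's setting.
    """
    k = len(means)
    if k < 4:
        raise ValueError("shrinking toward the grand mean needs at least 4 arms")

    grand = fmean(means)
    # Total squared deviation of the arm means around the grand mean.
    spread = sum((m - grand) ** 2 for m in means)
    if spread == 0:
        return [grand] * k

    # Fraction of each arm's deviation that is kept; clipped at zero so the
    # estimate never overshoots past the grand mean.
    keep = max(0.0, 1.0 - (k - 3) * sampling_var / spread)
    return [grand + keep * (m - grand) for m in means]


if __name__ == "__main__":
    # Hypothetical treatment-arm means, each measured with variance 1.0.
    arm_means = [1.0, 2.0, 3.0, 4.0, 5.0]
    print(james_stein_shrink(arm_means, sampling_var=1.0))
```

Noisier arms (a larger `sampling_var` relative to the spread between arms) are pulled more strongly toward the grand mean, which is the behavior that lowers mean squared error when many treatment groups are estimated together.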
Organizers: Shobeir Fakhraei, Aude Hofleitner, Danai Koutra, Julian McAuley, Bryan Perozzi, and Tim Weninger
Invited talk: Reflections of Social Networks
Lada Adamic, invited speaker