The rise of modern Web applications has brought a surge in the quantity of digital content, such as photos and videos, that is stored, accessed, and transmitted by Internet portals. Service providers such as Facebook and Netflix have deployed large caching tiers within their content-serving stacks to lessen the load on their backend systems and to decrease content-fetching latency for users. These caching infrastructures often face non-traditional workloads that are tied to user activity and evolve along with the applications, and they make extensive use of modern flash devices because of their cost advantage over DRAM (in dollars per gigabyte) and their significantly higher I/O performance than magnetic disks. Given these challenges, it is time to revisit the design of content caching solutions for the modern Web, and my research is driven in this direction.
Through a collaboration with Facebook, we conducted an analysis that explores the dynamics of the full Facebook photo-serving stack, from the client browser to Facebook’s photo storage servers, looking both at the performance of each caching tier and at the interactions between system layers. This work gave us a great deal of insight into the Facebook photo workload and into potential ways to build a better caching stack, such as adopting more advanced caching algorithms. Implementing complicated algorithms on flash is non-trivial because of its unique performance characteristics, so we further designed and implemented a framework that helps realize many advanced caching algorithms on modern flash devices. This design provides high algorithm fidelity, high throughput, and low device overhead at the same time.
Our analysis of the Facebook photo-serving stack was published at SOSP, a top conference in the computer systems community, and our findings were well received there. On the industry side, a short version featured on the Facebook Engineering Blog drew a lot of attention, but a more exciting impact is that this work has inspired Facebook to optimize its photo-serving stack based on our findings. The design of our flash-oriented caching framework is also being tested at Facebook.
Tremendously! At Cornell, we have had fruitful collaborations with Facebook thanks to the help offered by its engineers, managers, and academic experts. It also creates a unique platform for me to connect with top scientists and engineers who share Facebook ties. As a concrete example, two of my extremely smart co-authors were from Princeton, and we had never met until we shared the same internship experience at Facebook. Additionally, the generous stipend and travel budget give me the freedom to pursue riskier research topics and to advocate for my work without worrying about the expense. Last winter, I traveled internationally for a whole month to present my work at 8 different institutions and companies, which was an exciting experience without a thrilling bill.
The stipend and travel budget supported my work financially, the internship experience advanced my research by exposing me to real problems and top minds, and the recognition as a Facebook Fellow drew a great deal of attention to my work both inside Facebook and from academia at large. This is the best package I could ever have imagined, and the only question left in my mind at the end of my fellowship is: can this be longer than a year?
Well, the way I figured out to extend my ‘fellowship’ is to join Facebook as a Research Scientist this fall, after finishing my graduate studies! Research-wise, there are more insights from the earlier analysis study that we have not yet investigated, and we would like to keep pursuing those directions, for the good of content caching at Facebook as well as at other modern Web services. As a distributed systems researcher, I am also interested in other types of large-scale systems, and luckily there are many of them at Facebook.