### Popularity Prediction for Social Media over Arbitrary Time Horizons

Daniel Haimovich, Dima Karamshuk, Thomas Leeper, Evgeniy Riabenko, Milan Vojnovic

Mathematical Programming

We consider the problem of minimizing composite functions of the form *f *( *g *(*x*) ) +* h *( *x *), where *f* and *h* are convex functions (which can be nonsmooth) and *g* is a smooth vector mapping. In addition, we assume that *g* is the average of finite number of component mappings or the expectation over a family of random component mappings. We propose a class of stochastic variance-reduced prox-linear algorithms for solving such problems and bound their sample complexities for finding an e-stationary point in terms of the total number of evaluations of the component mappings and their Jacobians. When *g* is a finite average of *N* components, we obtain sample complexity *O *(*N* + *N*^{4/5} ε^{-1} ) for both mapping and Jacobian evaluations. When *g* is a general expectation, we obtain sample complexities of *O *(e^{-5/2} ) and *O *(e^{-3/2 }) for component mappings and their Jacobians respectively. If in addition *f* is smooth, then improved sample complexities of *O *( *N* + *N *^{1/2} ε^{-1} ) and *O *( ε^{-3/2} ) are derived for *g* being a finite average and a general expectation respectively, for both component mapping and Jacobian evaluations.

Daniel Haimovich, Dima Karamshuk, Thomas Leeper, Evgeniy Riabenko, Milan Vojnovic

Liqi Yan, Qifan Wang, Yiming Cu, Fuli Feng, Xiaojun Quan, Xiangyu Zhang, Dongfang Liu

Barlas Oğuz, Kushal Lakhotia, Anchit Gupta, Patrick Lewis, Vladimir Karpukhin, Aleksandra Piktus, Xilun Chen, Sebastian Riedel, Wen-tau Yih, Sonal Gupta, Yashar Mehdad