A Method for Animating Children’s Drawings of the Human Figure
Harrison Jesse Smith, Qingyuan Zheng, Yifei Li, Somya Jain, Jessica K. Hodgins
International Joint Conference on Neural Networks (IJCNN)
Large corpora with strongly labelled sound events are expensive and difficult to obtain in engineering applications. Many researchers have therefore turned to the problem of detecting both the types and the timestamps of sound events from weak labels that specify only the types. This task can be treated as a multiple instance learning (MIL) problem, and in sound event detection (SED) a key design choice is the pooling function. The linear softmax pooling function achieves state-of-the-art performance because it can vary both the signs and the magnitudes of gradients. However, linear softmax pooling cannot flexibly handle sound events of different time scales. In this paper, we propose a power pooling function that automatically adapts to various sound events. By adding a trainable parameter for each event, power pooling can provide more accurate gradients for the frames in a clip than other pooling functions. On both weakly supervised and semi-supervised SED datasets, the proposed power pooling function outperforms linear softmax pooling on both coarse-grained and fine-grained metrics. Specifically, it yields relative improvements in event-based F1 score of 11.4% and 10.2% on the two datasets. While this paper focuses on SED applications, the proposed method can be applied to MIL tasks in other domains.
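The abstract gives no formulas, so the following is only a minimal sketch of the two pooling functions it compares. It assumes the commonly used form of linear softmax pooling, sum(y^2) / sum(y) over the frame-level probabilities y of one event, and it assumes that power pooling generalizes this to sum(y * y^n) / sum(y^n) with one exponent n per event. The function names are hypothetical, and n is passed in as a fixed non-negative float rather than learned during training.

```python
from typing import Sequence


def linear_softmax_pool(frame_probs: Sequence[float], eps: float = 1e-8) -> float:
    """Clip-level probability as a self-weighted average: sum(p^2) / sum(p)."""
    numerator = sum(p * p for p in frame_probs)
    denominator = sum(frame_probs)
    return numerator / (denominator + eps)


def power_pool(frame_probs: Sequence[float], n: float, eps: float = 1e-8) -> float:
    """Assumed power pooling form: sum(p * p^n) / sum(p^n), with n >= 0.

    Each frame is weighted by its own probability raised to the power n.
    In a trained model n would be a learnable parameter, one per event class;
    here it is a plain argument so the behaviour can be inspected directly.
    """
    if n < 0:
        raise ValueError("this sketch assumes a non-negative exponent n")
    weights = [p ** n for p in frame_probs]
    numerator = sum(p * w for p, w in zip(frame_probs, weights))
    denominator = sum(weights)
    return numerator / (denominator + eps)


if __name__ == "__main__":
    # Hypothetical frame-level probabilities for one event in one clip.
    probs = [0.1, 0.2, 0.9]
    for exponent in (0.0, 1.0, 20.0):
        print(exponent, power_pool(probs, exponent))
```

Under these assumptions, n = 0 reduces to average pooling, n = 1 reproduces linear softmax pooling, and a large n approaches max pooling. That is one way to read the claim that a per-event parameter lets the pooling adapt to events of different time scales.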
Yunbo Zhang, Deepak Gopinath, Yuting Ye, Jessica Hodgins, Greg Turk, Jungdam Won
Simran Arora, Patrick Lewis, Angela Fan, Jacob Kahn, Christopher Ré