Learning from Weak Labels
One of the key bottlenecks in training diverse, accurate audio classifiers is the need for “strongly labelled” training data that provide precisely demarcated instances of the audio events to be recognized. Such data are, however, difficult to obtain, particularly in bulk. The alternative, more popular approach is to train models using “weakly labelled” data, comprising recordings in which only the presence or absence of sound classes is tagged, with no information about how many times the sounds occur or where they are located in the recordings. Weakly labelled data are much easier to obtain than strongly labelled data; however, training with such data comes with many challenges. In this tutorial we will discuss the problem of training audio (and other) classifiers from weakly labelled data, including several state-of-the-art formalisms, their restrictions and limitations, and areas for future research.