In this paper, we focus on the problem of content-based retrieval for audio, which aims to retrieve all semantically similar audio recordings for a given audio clip query. We propose a novel approach that encodes audio into a vector representation using Siamese neural networks. The goal is to obtain encodings that are similar for files belonging to the same audio class, thus allowing retrieval of semantically similar audio. We use two similarity measures, cosine similarity and Euclidean distance, to show that our method is effective in retrieving files with similar audio content. Our results indicate that our neural network-based approach is able to retrieve files similar in content and semantics.
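The retrieval step described above, ranking stored encodings against a query encoding by cosine similarity or Euclidean distance, can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`cosine_similarity`, `euclidean_distance`, `retrieve`), the file names, and the three-dimensional toy vectors are all hypothetical, and the vectors are hand-written stand-ins rather than outputs of a trained Siamese network.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (higher = more similar)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Assumption for this sketch: treat a zero vector as dissimilar to everything.
        return 0.0
    return dot / (norm_a * norm_b)


def euclidean_distance(a, b):
    """Straight-line distance between two equal-length vectors (lower = more similar)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def retrieve(query, database, k=2, measure="cosine"):
    """Return the names of the k stored encodings closest to the query encoding."""
    if measure == "cosine":
        scored = [(name, cosine_similarity(query, emb)) for name, emb in database.items()]
        scored.sort(key=lambda pair: pair[1], reverse=True)  # most similar first
    elif measure == "euclidean":
        scored = [(name, euclidean_distance(query, emb)) for name, emb in database.items()]
        scored.sort(key=lambda pair: pair[1])  # smallest distance first
    else:
        raise ValueError(f"unknown measure: {measure}")
    return [name for name, _ in scored[:k]]


# Hypothetical toy encodings; a real system would obtain these from the trained encoder.
database = {
    "dog_bark_1": [0.9, 0.1, 0.0],
    "dog_bark_2": [0.8, 0.2, 0.1],
    "siren_1": [0.0, 0.1, 0.9],
}
query = [0.9, 0.1, 0.05]

top_cosine = retrieve(query, database, k=2, measure="cosine")
top_euclidean = retrieve(query, database, k=2, measure="euclidean")
print(top_cosine, top_euclidean)
```

In this toy setup both measures rank the two same-class files ahead of the file from a different class, which is the behavior the encoding is meant to produce. The two measures need not agree in general, since cosine similarity ignores vector magnitude while Euclidean distance does not.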
Citation: Manocha, Pranay, Rohan Badlani, Anurag Kumar, Ankit Shah, Benjamin Elizalde, and Bhiksha Raj. “Content-based Representations of audio using Siamese neural networks.” arXiv preprint arXiv:1710.10974 (2017).