
An improved detection and classification method for mouse ultrasonic vocalizations

Reyhaneh Abbasi (a, b, c), Peter Balazs (a), Maria Adelaide Marconi (b), Doris Nicolakis (b), Sarah M. Zala (b), and Dustin J. Penn (b)

a Acoustic Research Institute, Austrian Academy of Sciences, Vienna, Austria
b Konrad Lorenz Institute of Ethology, Department of Interdisciplinary Life Sciences, University of Veterinary Medicine, Vienna, Austria
c Vienna Doctoral School of Cognition, Behaviour and Neuroscience, University of Vienna, Vienna, Austria

House mice and other rodents emit complex ultrasonic vocalizations (USVs) to communicate in various contexts, including social and sexual interactions. These vocalizations are increasingly investigated in research on animal communication and as a phenotype for studying the genetic basis of autism and speech disorders. Rodents emit USVs in discrete units called syllables or calls. USV syllables are separated by gaps of silence, and researchers have classified them into several categories by visually inspecting spectrograms. Because manual methods for analyzing USVs are extremely time-consuming, several methods have recently been developed for automatically detecting and classifying USVs. Here we evaluate their advantages and disadvantages in a systematic comparison, while also presenting a new approach. This study aims to 1) determine the most efficient USV detection tool among the existing methods, and 2) develop a classification model that is more generalizable than existing methods. We compared the out-of-the-box performance of four detection methods: the pretrained DeepSqueak detector, MUPET, USVSEG, and the Automatic Mouse Ultrasound Detector (A-MUD). A-MUD outperformed the other methods in terms of both true positive rate and false detection rate. To automate the classification of USVs, we developed BootSnap, a supervised classification method that combines bootstrapping on Gammatone spectrograms and convolutional neural networks with Snapshot ensemble learning. It successfully classified calls into 12 types, including a new class of false positives that is useful for detection refinement. BootSnap outperformed the pretrained and retrained state-of-the-art tool, and thus it is more generalizable [1].
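As an illustration of the two detection metrics named above, the following minimal Python sketch scores a detector's output against manually annotated syllables. It is not taken from the study: the function names, the example intervals, and the matching rule (a detection counts as correct if it overlaps an annotated syllable in time) are all assumptions, as are the definitions used here (true positive rate as the fraction of annotated syllables that were found, false detection rate as the fraction of detections that match no annotated syllable).

```python
from typing import List, Tuple

# (onset, offset) of a syllable in seconds; hypothetical representation
Interval = Tuple[float, float]


def overlaps(a: Interval, b: Interval) -> bool:
    """Return True if the two time intervals share any duration."""
    return a[0] < b[1] and b[0] < a[1]


def detection_rates(truth: List[Interval],
                    detected: List[Interval]) -> Tuple[float, float]:
    """Compute (true positive rate, false detection rate).

    Assumed definitions, not necessarily those of the study:
      TPR = annotated syllables overlapped by a detection / all annotated
      FDR = detections overlapping no annotated syllable / all detections
    """
    hits = sum(any(overlaps(t, d) for d in detected) for t in truth)
    false_alarms = sum(not any(overlaps(d, t) for t in truth)
                       for d in detected)
    tpr = hits / len(truth) if truth else 0.0
    fdr = false_alarms / len(detected) if detected else 0.0
    return tpr, fdr


# Made-up example: four annotated syllables, five detections
truth = [(0.10, 0.15), (0.30, 0.36), (0.50, 0.55), (0.80, 0.84)]
detected = [(0.11, 0.14), (0.31, 0.35), (0.60, 0.62),
            (0.79, 0.85), (0.95, 0.97)]

tpr, fdr = detection_rates(truth, detected)
print(f"TPR = {tpr:.2f}, FDR = {fdr:.2f}")
```

In this made-up example, three of the four annotated syllables are found and two of the five detections match nothing. A stricter matching rule, such as a minimum overlap fraction, would change both numbers.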

[1] Abbasi R, Balazs P, Marconi MA, Nicolakis D, Zala SM, Penn DJ. Capturing the songs of mice with an improved detection and classification method for ultrasonic vocalizations (BootSnap). PLoS Computational Biology. 2022 May 12;18(5):e1010049