Murdoch University Research Repository

Sound event detection using multiple optimized kernels

Xia, X., Togneri, R., Sohel, F., Zhao, Y. and Huang, D. (2020) Sound event detection using multiple optimized kernels. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 28, pp. 1745-1754.

Sound event detection (SED) is widely used in real-world applications, and approaches based on convolutional recurrent neural networks (CRNNs) have achieved state-of-the-art performance. However, the convolution is typically performed with a fixed-size kernel, which degrades detection accuracy, especially when the acoustic features of different event classes vary widely. To address this, this article proposes an SED technique that uses a CRNN framework with multiple convolutional kernels of different sizes. The top-performing kernels are selected from a kernel pool based on unsupervised clustering errors and the accuracies of temporarily trained models. The selected kernels are then fed to multiple convolution layers to handle the variations in acoustic features. Experimental results on two subsets of AudioSet, the DCASE Challenge 2017 Task 4 and DCASE Challenge 2018 Task 4 datasets, demonstrate the performance of the proposed approach compared with state-of-the-art systems.
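
As a rough illustration of convolving one input with kernels of several sizes, the sketch below applies three kernels to a time-frequency feature map and stacks the resulting maps. It is only one possible reading of the idea, not the authors' architecture: the kernel sizes, the 100 x 64 input shape, the random kernel weights (standing in for learned ones), and the names `conv2d_same` and `multi_kernel_features` are all assumptions made for this example. The kernel selection step (clustering errors and temporary model accuracies) is not shown, because the abstract does not say how those two criteria are combined.

```python
import numpy as np


def conv2d_same(x, kernel):
    """Cross-correlate a 2-D map with an odd-sized kernel.

    Zero padding keeps the output the same shape as the input.
    """
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(x, ((ph, ph), (pw, pw)))  # zero padding by default
    out = np.zeros_like(x, dtype=float)
    # Accumulate one shifted, weighted copy of the input per kernel tap.
    for i in range(kh):
        for j in range(kw):
            out += kernel[i, j] * padded[i:i + x.shape[0], j:j + x.shape[1]]
    return out


def multi_kernel_features(x, kernel_sizes, rng):
    """Apply one randomly initialised kernel per size and stack the maps."""
    maps = []
    for kh, kw in kernel_sizes:
        # Hypothetical weights; a trained network would learn these.
        kernel = rng.standard_normal((kh, kw)) / (kh * kw)
        maps.append(conv2d_same(x, kernel))
    return np.stack(maps, axis=0)  # shape: (num_kernels, time, freq)


rng = np.random.default_rng(0)
log_mel = rng.standard_normal((100, 64))  # assumed: 100 frames x 64 mel bands
kernel_sizes = [(3, 3), (5, 5), (3, 7)]   # assumed sizes, not from the paper
features = multi_kernel_features(log_mel, kernel_sizes, rng)
print(features.shape)  # (3, 100, 64)
```

Because each kernel covers a different span of time and frequency, short events and long or broadband events can each be matched by a suitably sized receptive field; in a CRNN, the stacked maps would then pass to the recurrent layers.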

Item Type: Journal Article
Murdoch Affiliation(s): Information Technology, Mathematics and Statistics
Publisher: IEEE
Copyright: © 2020 IEEE