Spectro-Temporal analysis using local binary pattern variants for acoustic scene classification
Abidin, S., Togneri, R. and Sohel, F. (2018) Spectro-Temporal analysis using local binary pattern variants for acoustic scene classification. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 26 (11). pp. 2112-2121.
Abstract
In this paper, we present an approach for acoustic scene classification that aggregates spectral and temporal features. We propose the first use of the variable-Q transform (VQT) to generate the time-frequency representation for acoustic scene classification. The VQT provides finer control over the resolution than the constant-Q transform (CQT) or the short-time Fourier transform (STFT) and can be tuned to better capture acoustic scene information. We then adopt a variant of the local binary pattern (LBP), the Adjacent Evaluation Completed LBP (AECLBP), which is better suited to extracting features from acoustic time-frequency images. Our results yield a 5.2% improvement on the DCASE 2016 dataset compared to the application of standard CQT with LBP. Fusing our proposed AECLBP with histogram of oriented gradients (HOG) features, we achieve a classification accuracy of 85.5%, which outperforms one of the top-performing systems.
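
The LBP step named in the abstract can be illustrated with a minimal sketch. It implements only the basic 8-neighbour LBP, not the AECLBP variant the paper adopts, and it runs on a small toy matrix standing in for a VQT time-frequency image. The function names `lbp_codes` and `lbp_histogram`, the neighbour ordering, and the toy values are illustrative assumptions, not taken from the paper.

```python
def lbp_codes(image):
    """Basic 8-neighbour LBP code for every interior cell of a 2-D list.

    Assumption: this is plain LBP, not the AECLBP variant from the paper.
    """
    # Neighbour offsets (row, col), clockwise starting at the top-left corner.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    rows, cols = len(image), len(image[0])
    codes = []
    for r in range(1, rows - 1):
        row_codes = []
        for c in range(1, cols - 1):
            centre = image[r][c]
            code = 0
            for bit, (dr, dc) in enumerate(offsets):
                # Set the bit when the neighbour is at least as large as the centre.
                if image[r + dr][c + dc] >= centre:
                    code |= 1 << bit
            row_codes.append(code)
        codes.append(row_codes)
    return codes


def lbp_histogram(image):
    """Normalised 256-bin histogram of LBP codes, usable as a feature vector."""
    hist = [0] * 256
    for row in lbp_codes(image):
        for code in row:
            hist[code] += 1
    total = sum(hist)
    return [h / total for h in hist] if total else hist


# Toy "time-frequency image": rows are frequency bins, columns are time frames.
toy = [
    [1, 2, 3, 4],
    [2, 5, 1, 3],
    [0, 7, 6, 2],
]
codes = lbp_codes(toy)
features = lbp_histogram(toy)
```

In a real pipeline the input would be a log-magnitude time-frequency image rather than a hand-written matrix, and the resulting histogram (possibly computed per frequency zone) would be passed to a classifier.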
Item Type: | Journal Article |
---|---|
Murdoch Affiliation(s): | School of Engineering and Information Technology |
Publisher: | IEEE |
Copyright: | © 2018 IEEE |
URI: | http://researchrepository.murdoch.edu.au/id/eprint/41629 |