Murdoch University Research Repository

Deep Boltzmann machines for i-Vector based audio-visual person identification

Alam, M., Bennamoun, M., Togneri, R. and Sohel, F. (2015) Deep Boltzmann machines for i-Vector based audio-visual person identification. Lecture Notes in Computer Science, 9431 . pp. 631-641.

We propose an approach that uses DBM-DNNs for i-vector based audio-visual person identification. Two Deep Boltzmann Machines, DBMspeech and DBMface, are trained without supervision on unlabeled audio and visual data from a set of background subjects. The DBMs are then used to initialize two corresponding DNNs for classification, referred to in this paper as DBM-DNNspeech and DBM-DNNface. The DBM-DNNs are discriminatively fine-tuned using back-propagation on a set of training data and evaluated on a set of test data from the target subjects. We compared their performance with the cosine distance (cosDist) and the state-of-the-art DBN-DNN classifier. We also tested three different configurations of the DBM-DNNs. We show that DBM-DNNs with two hidden layers and 800 units in each hidden layer achieved the best identification performance with 400-dimensional i-vectors as input. Our experiments were carried out on the challenging MOBIO dataset.
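
The abstract names cosine distance (cosDist) as a baseline but does not describe how it is applied. As a rough illustration only, the sketch below assumes the simplest setup: one enrolled i-vector per target subject, with a test i-vector assigned to the subject whose enrolled vector has the highest cosine similarity. The function names, subject labels, and 3-dimensional toy vectors are hypothetical; the paper works with 400-dimensional i-vectors and may score differently.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # Assumption: treat a zero vector as having no similarity to anything.
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def identify(test_ivector, enrolled):
    """Return the subject id whose enrolled i-vector is closest in angle.

    `enrolled` maps a subject id to a single i-vector (an assumption made
    for this sketch, not a detail taken from the paper).
    """
    return max(
        enrolled,
        key=lambda subject: cosine_similarity(test_ivector, enrolled[subject]),
    )


# Toy 3-dimensional stand-ins for 400-dimensional i-vectors.
enrolled = {
    "subject_a": [1.0, 0.0, 0.2],
    "subject_b": [0.0, 1.0, 0.1],
}
probe = [0.9, 0.1, 0.3]
best = identify(probe, enrolled)
```

Because cosine scoring ignores vector length and has no trainable parameters, it serves as a natural reference point for the fine-tuned DBM-DNN classifiers.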

Item Type: Journal Article
Murdoch Affiliation(s): School of Engineering and Information Technology
Publisher: Springer Verlag
Copyright: 2016 Springer International Publishing Switzerland
Other Information: Book Title: Image and Video Technology: 7th Pacific Rim Symposium on Image and Video Technology (PSIVT) 2015, Auckland, New Zealand, 23-27 November 2015, Revised Selected Papers