Optimization of multilayer neural network parameters for speaker recognition

Jaromir Tovarek; Pavol Partila; Jan Rozhon; Miroslav Voznak; Jan Skapa; Dominik Uhrin; Zdenka Chmelikova

doi:10.1117/12.2223545

12 May 2016

Proceedings Volume 9850, Machine Intelligence and Bio-inspired Computation: Theory and Applications X; 98500C (2016) https://doi.org/10.1117/12.2223545
Event: SPIE Defense + Security, 2016, Baltimore, MD, United States

Abstract

This article discusses the impact of multilayer neural network parameters on speaker identification. The main task of speaker identification is to find a specific person within a known set of speakers; that is, the voice of an unknown speaker (the wanted person) is assumed to belong to one of the reference speakers in the voice database. One of the requirements was to develop a text-independent system, meaning that the wanted person is classified regardless of the content and language of the speech. A multilayer neural network was used for speaker identification in this research. An artificial neural network (ANN) requires several parameters to be set, such as the activation function of the neurons, the steepness of the activation functions, the learning rate, the maximum number of iterations, and the number of neurons in the hidden and output layers. ANN accuracy and validation time are directly influenced by these parameter settings, and different tasks require different settings. Identification accuracy and ANN validation time were evaluated with the same input data but different parameter settings. The goal was to find the parameters that give the neural network the highest precision and the shortest validation time. The input data of the neural networks are Mel-frequency cepstral coefficients (MFCC), which describe the properties of the vocal tract. Audio samples were recorded for all speakers in a laboratory environment. The data were split into training, testing and validation sets of 70 %, 15 % and 15 %, respectively. The result of the research described in this article is a set of different parameter settings for the multilayer neural network for four speakers.
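The ideas named in the abstract (a 70/15/15 split, an activation function with adjustable steepness, and evaluating the same data under different parameter settings) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the logistic form of the activation, and every value in `PARAM_GRID` are assumptions chosen for illustration, and the integer `frames` merely stand in for MFCC feature vectors.

```python
import itertools
import math
import random


def sigmoid(x, steepness=1.0):
    """Logistic activation; a larger steepness gives a sharper transition around x = 0."""
    return 1.0 / (1.0 + math.exp(-steepness * x))


def split_70_15_15(samples, seed=0):
    """Shuffle the samples and split them into training, testing and validation parts."""
    items = list(samples)
    random.Random(seed).shuffle(items)
    n_train = int(round(0.70 * len(items)))
    n_test = int(round(0.15 * len(items)))
    train = items[:n_train]
    test = items[n_train:n_train + n_test]
    validation = items[n_train + n_test:]  # whatever remains, about 15 %
    return train, test, validation


# Placeholder values only; none of them are taken from the paper.
PARAM_GRID = {
    "steepness": [0.25, 0.5, 1.0],
    "learning_rate": [0.1, 0.7],
    "hidden_neurons": [8, 16],
}


def parameter_settings(grid):
    """Yield every combination of the grid values as a dict (one candidate setting each)."""
    keys = list(grid)
    for values in itertools.product(*(grid[k] for k in keys)):
        yield dict(zip(keys, values))


# Hypothetical data: 200 integers standing in for MFCC feature vectors.
frames = range(200)
train, test, validation = split_70_15_15(frames)
print(len(train), len(test), len(validation))

# The same split would be reused for every candidate setting, so that
# differences in accuracy or validation time come from the parameters alone.
for setting in parameter_settings(PARAM_GRID):
    print(setting)
```

Fixing the random seed keeps the split identical across runs, which matches the abstract's requirement that all parameter settings be compared on the same input data.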

Citation

Jaromir Tovarek, Pavol Partila, Jan Rozhon, Miroslav Voznak, Jan Skapa, Dominik Uhrin, and Zdenka Chmelikova "Optimization of multilayer neural network parameters for speaker recognition", Proc. SPIE 9850, Machine Intelligence and Bio-inspired Computation: Theory and Applications X, 98500C (12 May 2016); https://doi.org/10.1117/12.2223545

PROCEEDINGS
6 PAGES

KEYWORDS

Neural networks

Neurons

Speaker recognition

Biometrics

Signal processing

Databases

Forensic science
