Voice authentication module using mel-cepstral coefficients
https://doi.org/10.21822/2073-6185-2024-51-2-77-82
Abstract
Objective. The aim of the study is to develop and apply a method for extracting speaker-identifying information from voice recordings by computing mel-frequency cepstral coefficients (MFCC).
Method. The study examines methods for extracting informative features from a voice recording that allow the speaker to be identified, and presents an authentication scheme based on mel-cepstral coefficients.
Result. Based on this method, an authentication module that works with audio recordings of users' voices was implemented using the simplest form of MFCC. The module was developed in Python.
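The abstract does not give the module's code, so the following is only a minimal sketch of what a basic MFCC-based check might look like in Python with NumPy. The frame sizes, the number of filters and coefficients, the mean-vector "voiceprint", the cosine-similarity comparison, and the threshold value are all assumptions made for illustration, as are the function names; they are not taken from the article.

```python
import numpy as np

def hz_to_mel(f):
    # Common mel-scale formula (one of several variants in use).
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    # Triangular filters spaced evenly on the mel scale from 0 Hz to Nyquist.
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, center):
            fbank[i - 1, k] = (k - left) / (center - left)
        for k in range(center, right):
            fbank[i - 1, k] = (right - k) / (right - center)
    return fbank

def dct_ii(x, n_out):
    # Unnormalized DCT-II along the last axis, keeping the first n_out terms.
    n = x.shape[-1]
    k = np.arange(n_out)[:, None]
    basis = np.cos(np.pi * k * (2 * np.arange(n) + 1) / (2 * n))
    return x @ basis.T

def mfcc(signal, sample_rate=16000, frame_len=400, hop=160,
         n_fft=512, n_filters=26, n_coeffs=13):
    # Assumes len(signal) >= frame_len. Returns an array (n_frames, n_coeffs).
    signal = np.asarray(signal, dtype=float)
    n_frames = 1 + (len(signal) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hamming(frame_len)           # windowed frames
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2 / n_fft  # power spectrum
    energies = power @ mel_filterbank(n_filters, n_fft, sample_rate).T
    return dct_ii(np.log(energies + 1e-10), n_coeffs)

def voiceprint(signal, sample_rate=16000):
    # Hypothetical template: the mean MFCC vector over all frames.
    return mfcc(signal, sample_rate).mean(axis=0)

def cosine_similarity(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def authenticate(enrolled, attempt, threshold=0.9):
    # The threshold is an illustrative value, not a tuned one.
    return cosine_similarity(enrolled, attempt) >= threshold
```

In such a sketch, a user would be enrolled by storing `voiceprint(recording)` and later verified with `authenticate(stored, voiceprint(new_recording))`. A comparison of mean vectors is deliberately simple; a practical system would need a threshold tuned on real recordings.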
Conclusion. Biometric authentication is an inexpensive and relatively simple way to verify the authenticity of users. Despite the obvious advantages of mel-cepstral coefficients, the method has certain disadvantages. To mitigate these shortcomings, various frequency filters can be applied, as well as third-party algorithms for analyzing audio recordings.
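The abstract does not say which frequency filters are meant. As one hedged illustration, a first-order pre-emphasis filter is a common high-pass step applied to speech before MFCC extraction. The function name and the coefficient 0.97 below are assumptions chosen for the example, not values from the article.

```python
import numpy as np

def pre_emphasis(signal, alpha=0.97):
    # y[0] = x[0]; y[n] = x[n] - alpha * x[n-1] for n >= 1.
    # Attenuates slowly varying (low-frequency) content.
    # Assumes a non-empty one-dimensional signal.
    signal = np.asarray(signal, dtype=float)
    return np.append(signal[0], signal[1:] - alpha * signal[:-1])
```

Applied to a constant signal, every output sample after the first shrinks to `(1 - alpha)` times the input, which shows the low-frequency attenuation directly.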
About the Authors
D. A. Elizarov
Russian Federation
Dmitriy A. Elizarov, Cand. Sci. (Eng.), Assoc. Prof., Department of Information Security
35 Marx Ave., Omsk 644046
P. A. Ashaeva
Russian Federation
Polina A. Ashaeva, Postgraduate Student, Department of Information Security
35 Marx Ave., Omsk 644046
E. A. Stepanova
Russian Federation
Elizaveta A. Stepanova, Cand. Sci. (Eng.), Assoc. Prof., Department of Information Security
35 Marx Ave., Omsk 644046
For citations:
Elizarov D.A., Ashaeva P.A., Stepanova E.A. Voice authentication module using mel-cepstral coefficients. Herald of Dagestan State Technical University. Technical Sciences. 2024;51(2):77-82. (In Russ.) https://doi.org/10.21822/2073-6185-2024-51-2-77-82