Project: №AP05131207 Development of multilingual automatic speech recognition technology using deep neural networks (2018-2020) – Institute of Information and Computational Technologies

The aim:

To improve the accuracy of multilingual speech recognition by using artificial neural networks at the acoustic and language modeling stages.

Research objectives:

Analytical review of developments in the field of speech recognition;
Development and design of speech and text corpora for the Russian and Kazakh languages;
Development of acoustic and language models using artificial neural networks;
Development and testing of a multilingual automatic speech recognition system.

The development of multilingual automatic speech recognition technologies and the use of artificial neural networks for deep learning are relevant for solving various production and economic tasks.

The scientific significance of the planned research lies in the development of recognition and deep learning methods. The project plans a comprehensive study of existing recognition methods and artificial neural networks, followed by selection of the methods most effective for multilingual automatic speech recognition. The research will focus on making the recognition process as a whole more intelligent, using deep neural network algorithms, hidden Markov models, and speech recognition algorithms.

The expected socio-economic effect of the project is an improvement in the quality of modern speech technologies and in their adaptation to national languages. As a result, speech technologies will be more widely used in people’s daily lives, which in turn should improve their quality of life (this is especially important for people with disabilities in developing countries).

The ultimate goal is to create a multilingual automatic speech recognition system based on deep neural networks that can recognize speech in an acoustic signal at least as effectively as a human can. As science and technology have advanced, significant progress has been made in the development of multilingual automatic speech recognition systems. The corpus size has grown to 2000 hours.

Novelty:

Analysis of existing speech recognition systems, and development of mathematical models and algorithms for multilingual automatic speech recognition technology.

Scope of application:

Government agencies responsible for expanding the use of national languages through information technology; mobile phone manufacturers (increasing the number of potential buyers through the introduction of speech technologies in national languages); mobile operators and banks (call centers with voice support, voice authentication); manufacturers of devices with voice support functions (talking books, talking toys, smart home devices).

Implementation:

The results of the project were implemented at National Innovation Center LLP.

Publications:

Kalimoldayev M., Mamyrbayev O., Mekebayev N., Kydyrbekova A. Algorithms for detection gender using neural networks // International Journal of Circuits, Systems and Signal Processing. – 2020. – № 14. – P. 154 – 159 (Scopus).
Orken Mamyrbayev, Keylan Alimhan, Bagashar Zhumazhanov, Tolganay Turdalykyzy, Farida Gusmanova End-to-End Speech Recognition in Agglutinative Languages // ACIIDS. – 2020. – P. 391-402 // https://doi.org/10.1007/978-3-030-42058-1_33 (Scopus and Web of Science IF – 0.3, Q4).
Keylan Alimhan, Orken Mamyrbayev, Aigerim Erdenova, Almira Akmetkalyeva Global output tracking by state feedback for high-order nonlinear systems with time-varying delays // Cogent Engineering. – 2020. – № 7 (1711676). – P. 1 – 13 // https://doi.org/10.1080/23311916.2020.1711676 (Scopus, percentile – 76).
Orken Mamyrbayev, Alymzhan Toleu, Gulmira Tolegen, Nurbapa Mekebayev Neural architectures for gender detection and speaker identification // Cogent Engineering. – 2020. – № 7 (1727168). – P. 1 – 13 // https://doi.org/10.1080/23311916.2020.1727168 (Scopus, percentile – 76).
Kydyrbekova Aizat, Othman Mohamed, Mamyrbayev Orken, Akhmediyarova Ainur, Bagashar Zhumazhanov Identification and authentication of user voice using DNN features and i-vector // Cogent Engineering. – 2020. – № 7 (1751557). – P. 1–21 // https://doi.org/10.1080/23311916.2020.1751557 (Scopus, percentile – 76).
Orken Zh. Mamyrbayev, Keylan Alimhan, Beibut Amirgaliyev, Bagashar Zhumazhanov, Dinara Mussayeva, Farida Gusmanova Multimodal systems for speech recognition // Int. J. Mobile Communications. – 2020. – Vol. 18, № 3. – P. 314 – 326 (Web of Science IF – 1.3, Q3).
М.Н. Қалимолдаев, О.Ж. Мамырбаев, Н.О. Мекебаев, М. Тұрдалыұлы Машиналық оқытуды қолдануда дауыстың гендерлік жіктелінуі // ҚазҰТЗУ хабаршысы. -2019. – № 6 (136). – Б. 229 – 233.
О. Мамырбаев, А. Шаяхметова, А. Кыдырбекова, М. Турдалыулы Интегральный подход распознавания речи для агглютинативных языков // Вестник Алматинского университета энергетики и связи. – 2020. – № 1 (48). – С. 93 – 102.
Мамырбаев О.Ж., Othman M., Ахмедиярова А.Т., Кыдырбекова А.С. Конфиденциальность и безопасность организации от инсайдерских атак, с использованием голосовой биометрики // Научные тенденции: Вопросы точных и технических наук. Сб. научных трудов по матер. XXVIII междунар. науч. конф. – СПб, 2020. – С. 15 – 22.
Mamyrbayev O., Akhmediyarova A., Kydyrbekova A., Short-term voice verification of the i-vector // Матер. науч. конф. ИИВТ МОН РК «Современные проблемы информатики и вычислительных технологий». – Алматы, 2020. – С. 9-13.
Mamyrbayev O., Oralbekova D. Modern trends in the development of speech recognition systems // News of the National academy of sciences of the republic of Kazakhstan. – 2020. – Vol. 4, № 332. – P. 42 – 51 // doi.org/10.32014/2020.2518-1726.64
Mamyrbayev O., Turdalyuly M., Mekebayev N., Alimhan K., Kydyrbekova A., Turdalykyzy T. Automatic Recognition of Kazakh Speech Using Deep Neural Networks // ACIIDS. – 2019. – P. 465-474. https://doi.org/10.1007/978-3-030-14802-7_40 (Scopus).
Alimhan K., Kalimoldayev M.N., Adamov A.A., Mamyrbayev O., Tasbolatuly N., Smolarz A. Further Results on Output Tracking for a Class of Uncertain High-Order Nonlinear Time-Delay Systems // Przegląd Elektrotechniczny. – 2019. – № 9 (5). – P. 88 – 91.
Mamyrbayev O., Turdalyuly M., Mekebayev N., Mukhsina K., Keylan A., BabaAli B., Nabieva G., Duisenbayeva A., Akhmetov B. Continuous Speech Recognition of Kazakh Language // International Conference on Applied Mathematics, Computational Science and Systems Engineering. – Italy, 2019. – V. 24.
Mamyrbayev O., Mekebayev N., Turdalyuly M., Oshanova N., Medeni T.I., Yessentay A. Voice Identification Using Classification Algorithms // Intelligent System and Computing. IntechOpen, DOI: 10.5772/intechopen.88239.– 2019.
Мамырбаев О.Ж., Мекебаев Н.О., Тұрдалыұлы М., Ахметов И. MFCC негізіндегі дикторды анықтау жүйесі // ҚазҰТЗУ хабаршысы. – 2019. – № 2. – Б. 155-160.
Мамырбаев О.Ж., Кыдырбекова А.С., Тұрдалыұлы М., Мекебаев Н.О. Обзор методов идентификации и аутентификации пользователей по голосу // Матер. науч. конф. ИИВТ КН МОН РК «Инновационные IT и Smart-технологии», посв. 70-летнему юбилею проф. Утепбергенова И.Т. – Алматы, 2019. – Б. 315-321.
Қалимолдаев М.Н., Мамырбаев О.Ж., Мекебаев Н.О., Тұрдалыұлы М. Машиналық оқуды қолдануда дауыстың гендерлік жіктелінуі // Матер. науч. конф. ИИВТ МОН РК «Современные проблемы информатики и вычислительных технологий». – Алматы, 2019. – С. 51-57.
Мамырбаев О.Ж., Тұрдалыұлы М., Мекебаев Н.О., Тұрдалықызы Т., Шаяхметова А.С. Автоматическое распознавание казахской речи с использованием DNN // Вестник КБТУ. – 2019. – № 2 (49). – С. 134-142.
Мамырбаев О.Ж., Кыдырбекова А.С., Ахмедиярова А.Т., Тұрдалыұлы М., Мекебаев Н.О. Систематический обзор и анализ особенностей идентификации по голосу // Вестник КБТУ. – 2019. – № 2 (49). – С. 120-133.
Мамырбаев О.Ж., Тұрдалықызы Т., Тұрдалыұлы М. Сөйлеуді танудың әлі шешілмеген мәселелері // Матер. IV междунар. науч.-практ. конф. «Информатика и прикладная математика», посв. 70-летнему юбилею проф. Биярова Т.Н., В. Вуйцика и 60-летию проф. Амиргалиева Е.Н. – Алматы, 2019. – Б. 91 – 94.
Мамырбаев О.Ж., Тұрдалықызы Т., Тұрдалыұлы М. Идентификация диктора используя MFFC // Матер. IV междунар. науч.-практ. конф. «Информатика и прикладная математика», посв. 70-летнему юбилею проф. Биярова Т.Н., В.Вуйцика и 60-летию проф. Амиргалиева Е.Н. – Алматы, 2019. – Б. 384 – 392.
Bagher BabaAli, Waldemar Wojcik, Orken Mamyrbayev, Mussa Turdalyuly, Nurbapa Mekebayev. Speech Recognizer-Based Non-Uniform Spectral Compression for Robust MFCC Feature Extraction // Przeglad Elektrotechniczny. – 2018. – № 6 (94). – P. 90-93.
Мамырбаев О.Ж., Мекебаев Н.О., Тұрдалыұлы М. Сөйлеулерді тану үрдісінде MFCC алгоритмін қолдану // ҚазҰТЗУ хабаршысы. – 2018. – № 2 (126). – Б. 389-392.
Мамырбаев О.Ж., Мекебаев Н.О., Тұрдалыұлы М. Генетикалық алгоритм көмегімен сөйлеуді автоматты танудағы гендерлік сәйкестендіру // Алматы энергетика және байланыс университетінің хабаршысы. – 2018. – спец. вып. – Б. 120-129.
Мамырбаев О.Ж., Тұрдалыұлы М., Мекебаев Н.О. Система распознавания слитной казахской речи на основе глубоких нейронных сетей // Вестник Алматинского университета энергетики и связи. – 2018. – спец. вып. – С. 130-135.
Мамырбаев О.Ж., Турдалыулы М., Мекебаев Н.О., Алимхан К., Набиева Г.С., Мамырбаев Б.Ж. Фонетически представительный текст для создания систем автоматического распознавания казахской речи // Наука и Мир. – 2018. – Т. 2, № 6 (58). – С. 49-52.

Copyright certificates:

Copyright certificate № 1425. System of automatic creation vocabulary for ASR / Mamyrbayev O.Zh., Turdalyuly M., Mekebayev N.O., Seitkali B.N., Duysenbayeva A.Zh. 22.01.2019.
Copyright certificate № 7844. MultiSpeech recognition / Mamyrbayev O. Zh., Turdalyuly M., Turdalyevna T., Kydyrbekova A. S., Mekebayev N. O., Seitkali B., Akhmetov B. S. 2020.

Monographs:

Мамырбаев Ө.Ж. Қазақ ауызекі сөйлеуін автоматты өңдеу: Монография. – ҚР БҒМ ҒК Ақпараттық және есептеуіш технологиялар институты. – 2020. – 142 б.

Books:

1 Мамырбаев О.Ж., Кыдырбекова А.С., Тұрдалыұлы М., Мекебаев Н.О. Методы и модели автоматического распознавания речи. – Институт информационных и вычислительных технологий КН МОН РК. – 2020. – 210 с.

2 Мамырбаев О.Ж., Кыдырбекова А.С., Тұрдалыұлы М., Жумажанов Б.Ж., Мекебаев Н.О. Автоматическое распознавание речи. – Институт информационных и вычислительных технологий КН МОН РК. – 2020. – 104 с.

The results obtained:

– multilingual corpus of Kazakh and Russian languages;

– methods of preprocessing speech signals, acoustic and language models, automatic transcriber;

– multilingual automatic speech recognition system.