Robust Speech Recognition Using Meta-Learning for Low-Resource Accents

[Conference] National Conference on Communications (NCC 2024), February 2024

Authors:

Dhanya Eledath, Arun Baby, Shatrughan Singh

Abstract:

Robust recognition of accented speech remains a challenging task in automatic speech recognition (ASR). Accurate recognition of low-resource accents can significantly improve speech-based systems in applications such as virtual assistants, communication devices, and language learning tools. However, ASR models often struggle with these accents because of their variability in pronunciation and language use. In this work, a state-of-the-art conformer transducer ASR model is trained with model-agnostic meta-learning (MAML) to improve performance across different accents of English. On a publicly available dataset, this yields a relative word error rate (WER) improvement of about 12% for most of the low-resource accents.
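For readers unfamiliar with the metric, a relative word error rate improvement is usually taken to mean the reduction in WER expressed as a fraction of the baseline WER. The sketch below assumes that definition. The function name and the WER values are hypothetical and chosen only for illustration; they are not results from the paper.

```python
def relative_wer_improvement(baseline_wer: float, new_wer: float) -> float:
    """Return the WER reduction as a fraction of the baseline WER."""
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return (baseline_wer - new_wer) / baseline_wer


# Hypothetical numbers for illustration only; not taken from the paper.
baseline = 20.0  # WER (%) of an assumed baseline model on one accent
adapted = 17.6   # WER (%) of an assumed meta-learned model on the same accent

print(f"Relative improvement: {relative_wer_improvement(baseline, adapted):.1%}")
```

Under this definition, a drop from 20.0% to 17.6% WER is a 2.4-point absolute reduction but roughly a 12% relative one, so the two figures should not be read interchangeably.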

Cite:

@INPROCEEDINGS{10485786,
  author={Eledath, Dhanya and Baby, Arun and Singh, Shatrughan},
  booktitle={2024 National Conference on Communications (NCC)}, 
  title={Robust Speech Recognition Using Meta-Learning for Low-Resource Accents}, 
  year={2024},
  volume={},
  number={},
  pages={1-6},
  keywords={Metalearning;Performance evaluation;Transducers;Error analysis;Virtual assistants;Training data;Speech recognition;speech recognition;accented speech recognition;low-resource accents;on-device speech recognition},
  doi={10.1109/NCC60321.2024.10485786}}

Code:

NA