Enhancing Automatic Speech Recognition Quality with a Second-Stage Speech Enhancement Generative Adversarial Network
Nossier, S. A., Wall, J., Moniri, M., Glackin, C. and Cannings, N. 2023. Enhancing Automatic Speech Recognition Quality with a Second-Stage Speech Enhancement Generative Adversarial Network. The 35th IEEE International Conference on Tools with Artificial Intelligence (ICTAI). Atlanta, Georgia (USA) 06 - 08 Nov 2023 IEEE Computer Society. https://doi.org/10.1109/ICTAI59109.2023.00087
|Nossier, S. A., Wall, J., Moniri, M., Glackin, C. and Cannings, N.
Speech enhancement is an essential preprocessing stage for automatic speech recognition in noisy conditions; however, the distortion caused by the denoising process may lead to degradation in automatic speech recognition performance. This paper presents a deep learning-based speech enhancement architecture to overcome this issue by applying a second stage network that deals with distortion noise. Moreover, a signal-to-noise ratio binary classifier is implemented to activate the speech enhancement network for intrusive noise environments only, which improves the overall performance. The proposed architecture outperforms powerful models in the literature, as it improves a challenging noisy speech test set by 0.8 and 5.9% improvement in the quality and intelligibility scores, respectively. Furthermore, the architecture improves the performance of automatic speech recognition with a 13.8% reduction in the word error rate at 0 dB signal-to-noise ratio. Finally, the second-stage network was proven to improve the performance of first-stage speech enhancement models, not previously seen in the training process.
|automatic speech recognition; deep learning; generative adversarial network; speech distortion; speech enhancement
|The 35th IEEE International Conference on Tools with Artificial Intelligence (ICTAI)
|IEEE Computer Society
|Accepted author manuscript
File Access Level
|Publication process dates
|04 Sep 2023
|18 Sep 2023
|2023 IEEE 35th International Conference on Tools with Artificial Intelligence Proceedings
|Digital Object Identifier (DOI)
|Web address (URL) of conference proceedings
|© 2023, IEEE
|Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
7views this month
0downloads this month