Enhancing Automatic Speech Recognition Quality with a Second-Stage Speech Enhancement Generative Adversarial Network
Conference paper
Nossier, S. A., Wall, J., Moniri, M., Glackin, C. and Cannings, N. 2023. Enhancing Automatic Speech Recognition Quality with a Second-Stage Speech Enhancement Generative Adversarial Network. The 35th IEEE International Conference on Tools with Artificial Intelligence (ICTAI). Atlanta, Georgia (USA), 06-08 Nov 2023. IEEE Computer Society. https://doi.org/10.1109/ICTAI59109.2023.00087
Authors | Nossier, S. A., Wall, J., Moniri, M., Glackin, C. and Cannings, N. |
---|---|
Type | Conference paper |
Abstract | Speech enhancement is an essential preprocessing stage for automatic speech recognition in noisy conditions; however, the distortion introduced by the denoising process may degrade automatic speech recognition performance. This paper presents a deep learning-based speech enhancement architecture that addresses this issue by applying a second-stage network to deal with distortion noise. Moreover, a signal-to-noise ratio binary classifier is implemented to activate the speech enhancement network for intrusive noise environments only, which improves the overall performance. The proposed architecture outperforms powerful models in the literature, improving the quality and intelligibility scores on a challenging noisy speech test set by 0.8 and 5.9%, respectively. Furthermore, the architecture improves automatic speech recognition performance, with a 13.8% reduction in the word error rate at 0 dB signal-to-noise ratio. Finally, the second-stage network is shown to improve the performance of first-stage speech enhancement models that were not seen during training. |
Keywords | automatic speech recognition; deep learning; generative adversarial network; speech distortion; speech enhancement |
Year | 2023 |
Conference | The 35th IEEE International Conference on Tools with Artificial Intelligence (ICTAI) |
Publisher | IEEE Computer Society |
Accepted author manuscript | License: All rights reserved; File access level: Anyone |
Publication dates | |
2023 | |
Publication process dates | |
Accepted | 04 Sep 2023 |
Deposited | 18 Sep 2023 |
Journal citation | pp. 546-552 |
ISSN | 2375-0197 |
Book title | 2023 IEEE 35th International Conference on Tools with Artificial Intelligence Proceedings |
ISBN | 9798350342734 |
Digital Object Identifier (DOI) | https://doi.org/10.1109/ICTAI59109.2023.00087 |
Web address (URL) of conference proceedings | https://www.computer.org/csdl/proceedings/ictai/2023/1T3d5DsZCfe |
Copyright holder | © 2023, IEEE |
Copyright information | Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. |
https://repository.uel.ac.uk/item/8w8z5
Download files
Accepted author manuscript
Nossier_ICTAI_317.pdf
License: All rights reserved
File access level: Anyone