Deep Learning-based Speech Enhancement for Real-life Applications

PhD Thesis

Abdallah Abdelhafiz Nossier, S. 2023. Deep Learning-based Speech Enhancement for Real-life Applications. PhD Thesis University of East London School of Architecture, Computing and Engineering https://doi.org/10.15123/uel.8wv3q

Publication dates
Authors	Abdallah Abdelhafiz Nossier, S.
Type	PhD Thesis
Abstract	Speech enhancement is the process of improving speech quality and intelligibility by suppressing noise. Inspired by the outstanding performance of the deep learning approach for speech enhancement, this thesis aims to add to this research area through the following contributions. The thesis presents an experimental analysis of different deep neural networks for speech enhancement, to compare their performance and investigate factors and approaches that improve the performance. The outcomes of this analysis facilitate the development of better speech enhancement networks in this work. Moreover, this thesis proposes a new deep convolutional denoising autoencoderbased speech enhancement architecture, in which strided and dilated convolutions were applied to improve the performance while keeping network complexity to a minimum. Furthermore, a two-stage speech enhancement approach is proposed that reduces distortion, by performing a speech denoising first stage in the frequency domain, followed by a second speech reconstruction stage in the time domain. This approach was proven to reduce speech distortion, leading to better overall quality of the processed speech in comparison to state-of-the-art speech enhancement models. Finally, the work presents two deep neural network speech enhancement architectures for hearing aids and automatic speech recognition, as two real-world speech enhancement applications. A smart speech enhancement architecture was proposed for hearing aids, which is an integrated hearing aid and alert system. This architecture enhances both speech and important emergency noise, and only eliminates undesired noise. The results show that this idea is applicable to improve the performance of hearing aids. On the other hand, the architecture proposed for automatic speech recognition solves the mismatch issue between speech enhancement automatic speech recognition systems, leading to significant reduction in the word error rate of a baseline automatic speech recognition system, provided by Intelligent Voice for research purposes. In conclusion, the results presented in this thesis show promising performance for the proposed architectures for real time speech enhancement applications.
Keywords	automatic speech recognition; deep learning; hearing aids; speech distortion; speech enhancement
Year	2023
Publisher	University of East London
Digital Object Identifier (DOI)	https://doi.org/10.15123/uel.8wv3q
File	2023_PhD_Abdallah Abdelhafiz Nossier.pdf License CC BY-NC-ND 4.0 File Access Level Anyone
Online	05 Jul 2024
Publication process dates
Completed	04 Oct 2023
Deposited	05 Jul 2024
Copyright holder	© 2023, The Author

Permalink -

https://repository.uel.ac.uk/item/8wv3q

Download files

File

	2023_PhD_Abdallah Abdelhafiz Nossier.pdf
License: CC BY-NC-ND 4.0
File access level: Anyone

423
total views
1458
total downloads
11
views this month
17
downloads this month

Export as

Related outputs

Enhancing Automatic Speech Recognition Quality with a Second-Stage Speech Enhancement Generative Adversarial Network

Nossier, S. A., Wall, J., Moniri, M., Glackin, C. and Cannings, N. 2023. Enhancing Automatic Speech Recognition Quality with a Second-Stage Speech Enhancement Generative Adversarial Network. The 35th IEEE International Conference on Tools with Artificial Intelligence (ICTAI). Atlanta, Georgia (USA) 06 - 08 Nov 2023 IEEE Computer Society. https://doi.org/10.1109/ICTAI59109.2023.00087

A Deep Learning Speech Enhancement Architecture Optimised for Speech Recognition and Hearing Aids

Nossier, S. A., Wall, J., Moniri, M., Glackin, C. and Cannings, N. 2023. A Deep Learning Speech Enhancement Architecture Optimised for Speech Recognition and Hearing Aids. The 35th IEEE International Conference on Tools with Artificial Intelligence (ICTAI). Atlanta, Georgia (USA) 06 - 08 Nov 2023 IEEE Computer Society. https://doi.org/10.1109/ICTAI59109.2023.00088

Convolutional Recurrent Smart Speech Enhancement Architecture for Hearing Aids

Abdallah Abdelhafiz Nossier, S., Wall, J., Moniri, M., Glackin, C. and Cannings, N. 2022. Convolutional Recurrent Smart Speech Enhancement Architecture for Hearing Aids. INTERSPEECH 2022. Incheon, Korea 18 - 22 Sep 2022

Two-Stage Deep Learning Approach for Speech Enhancement and Reconstruction in The Frequency and Time Domains

Nossier, S. A., Wall, J., Moniri, M., Glackin, C. and Cannings, N. 2022. Two-Stage Deep Learning Approach for Speech Enhancement and Reconstruction in The Frequency and Time Domains. WCCI 2022: IEEE World Congress on Computational Intelligence. Padua, Italy 23 May - 18 Jul 2022 IEEE. https://doi.org/10.1109/IJCNN55064.2022.9892355

An Experimental Analysis of Deep Learning Architectures for Supervised Speech Enhancement

Nossier, S. A., Wall, J., Moniri, M., Glackin, C. and Cannings, N. 2020. An Experimental Analysis of Deep Learning Architectures for Supervised Speech Enhancement. Electronics. 10 (Art. 17). https://doi.org/10.3390/electronics10010017

Mapping and Masking Targets Comparison using Different Deep Learning based Speech Enhancement Architectures

Abdallah Abdelhafiz Nossier, S., Wall, J., Moniri, M., Glackin, C. and Cannings, N. 2020. Mapping and Masking Targets Comparison using Different Deep Learning based Speech Enhancement Architectures. 2020 International Joint Conference on Neural Networks (IJCNN). Glasgow, UK 19 - 24 Jul 2020 IEEE. https://doi.org/10.1109/IJCNN48605.2020.9206623

A Comparative Study of Time and Frequency Domain Approaches to Deep Learning based Speech Enhancement

Abdallah Abdelhafiz Nossier, S., Wall, J., Moniri, M., Glackin, C. and Cannings, N. 2020. A Comparative Study of Time and Frequency Domain Approaches to Deep Learning based Speech Enhancement. 2020 International Joint Conference on Neural Networks (IJCNN). Glasgow, UK 19 - 24 Jul 2020 IEEE. https://doi.org/10.1109/IJCNN48605.2020.9206928