Two-Stage Deep Learning Approach for Speech Enhancement and Reconstruction in The Frequency and Time Domains
Conference paper
Nossier, S. A., Wall, J., Moniri, M., Glackin, C. and Cannings, N. 2022. Two-Stage Deep Learning Approach for Speech Enhancement and Reconstruction in The Frequency and Time Domains. WCCI 2022: IEEE World Congress on Computational Intelligence. Padua, Italy, 23 May - 18 Jul 2022. IEEE. https://doi.org/10.1109/IJCNN55064.2022.9892355
Authors | Nossier, S. A., Wall, J., Moniri, M., Glackin, C. and Cannings, N. |
---|---|
Type | Conference paper |
Abstract | Deep learning has recently shown promising improvements in the speech enhancement field, owing to its effectiveness in eliminating noise. However, a drawback of the denoising process is the introduction of speech distortion, which degrades speech quality and intelligibility. In this work, we propose a deep convolutional denoising autoencoder-based speech enhancement network whose encoder is deeper than its decoder, to improve performance and decrease complexity. Furthermore, we present a two-stage learning approach: in the first stage, denoising is performed in the frequency domain using the magnitude spectrum as the training target; in the second stage, further denoising and speech reconstruction are performed in the time domain. Results show that our architecture achieves a 0.22 improvement in the overall predicted mean opinion score (Covl) over state-of-the-art speech enhancement architectures on the Valentini dataset benchmark. Moreover, when trained on a larger dataset and tested on a mismatched test corpus, the architecture achieves improvements of 0.7 and 6.35% in Perceptual Evaluation of Speech Quality (PESQ) and Short-Time Objective Intelligibility (STOI) scores, respectively, compared to the noisy speech. |
Keywords | Deep learning; denoising autoencoders; speech enhancement; speech features; speech reconstruction |
Year | 2022 |
Conference | WCCI 2022: IEEE World Congress on Computational Intelligence |
Publisher | IEEE |
Accepted author manuscript | License: All rights reserved; File access level: Anyone |
Publication dates | |
Online | 30 Sep 2022 |
Publication process dates | |
Accepted | 26 Apr 2022 |
Deposited | 17 May 2022 |
ISSN | 2161-4407 |
Book title | 2022 International Joint Conference on Neural Networks (IJCNN) |
ISBN | 978-1-7281-8671-9 |
Digital Object Identifier (DOI) | https://doi.org/10.1109/IJCNN55064.2022.9892355 |
Copyright holder | © 2022 IEEE |
Copyright information | Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. |
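
The two-stage data flow described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names (`stft`, `istft`, `enhance_two_stage`), the frame size and hop, and the choice to reuse the noisy phase when returning to the time domain are all assumptions made for illustration, and the two stages are passed in as plain callables standing in for trained networks.

```python
import numpy as np

def stft(x, n_fft=512, hop=128):
    """Hann-windowed short-time Fourier transform (frames x frequency bins)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[i * hop:i * hop + n_fft] * window for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)

def istft(spec, n_fft=512, hop=128):
    """Weighted overlap-add inverse of stft (exact away from the signal edges)."""
    window = np.hanning(n_fft)
    n_frames = spec.shape[0]
    length = n_fft + hop * (n_frames - 1)
    out = np.zeros(length)
    norm = np.zeros(length)
    frames = np.fft.irfft(spec, n=n_fft, axis=1)
    for i in range(n_frames):
        out[i * hop:i * hop + n_fft] += frames[i] * window
        norm[i * hop:i * hop + n_fft] += window ** 2
    # Guard against division by ~0 at the edges, where the window tapers to zero.
    return out / np.maximum(norm, 1e-8)

def enhance_two_stage(noisy, stage1_magnitude_model, stage2_waveform_model,
                      n_fft=512, hop=128):
    """Stage 1 denoises the magnitude spectrum; stage 2 refines the waveform."""
    spec = stft(noisy, n_fft, hop)
    magnitude, phase = np.abs(spec), np.angle(spec)
    # Stage 1 (frequency domain): hypothetical model maps noisy -> enhanced magnitude.
    enhanced_magnitude = stage1_magnitude_model(magnitude)
    # Assumption: the noisy phase is reused to get back to the time domain.
    intermediate = istft(enhanced_magnitude * np.exp(1j * phase), n_fft, hop)
    # Stage 2 (time domain): hypothetical model further denoises and reconstructs.
    return stage2_waveform_model(intermediate)

# Usage with identity stand-ins for the two trained networks.
rng = np.random.default_rng(0)
clean = np.sin(2 * np.pi * 440 * np.arange(16000) / 16000)
noisy = clean + 0.1 * rng.standard_normal(clean.shape)
enhanced = enhance_two_stage(noisy, lambda mag: mag, lambda wav: wav)
```

With identity stand-ins the pipeline simply returns the noisy input (apart from the tapered edges), which makes it a convenient sanity check before substituting real models for the two callables.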
https://repository.uel.ac.uk/item/8qq14
Download files
Accepted author manuscript
SANossier - accepted.pdf
License: All rights reserved
File access level: Anyone