Mapping and Masking Targets Comparison using Different Deep Learning based Speech Enhancement Architectures
Conference paper
Abdallah Abdelhafiz Nossier, S., Wall, J., Moniri, M., Glackin, C. and Cannings, N. 2020. Mapping and Masking Targets Comparison using Different Deep Learning based Speech Enhancement Architectures. 2020 International Joint Conference on Neural Networks (IJCNN). Glasgow, UK 19 - 24 Jul 2020 IEEE. https://doi.org/10.1109/IJCNN48605.2020.9206623
Authors | Abdallah Abdelhafiz Nossier, S., Wall, J., Moniri, M., Glackin, C. and Cannings, N. |
---|---|
Type | Conference paper |
Abstract | Mapping and masking targets are both widely used in recent Deep Neural Network (DNN) based supervised speech enhancement. Masking targets have been shown to have a positive impact on the intelligibility of the output speech, while mapping targets have been found, in other studies, to generate speech with better quality. However, most of these studies compare the two approaches using the Multilayer Perceptron (MLP) architecture only. With the emergence of new architectures that outperform the MLP, a more generalized comparison between mapping and masking approaches is needed. In this paper, a complete comparison is conducted between mapping and masking targets using four different DNN based speech enhancement architectures, to work out how the performance of the networks changes with the chosen training target. The results show that there is no perfect training target with respect to all the different speech quality evaluation metrics, and that there is a tradeoff between the denoising process and the intelligibility of the output speech. Furthermore, the generalization ability of the networks was evaluated, and it is concluded that the design of the architecture restricts the choice of the training target, because masking targets result in significant performance degradation for the deep convolutional autoencoder architecture. |
Keywords | Deep Learning; Speech Enhancement; Training Targets; Time-Frequency Mapping; Time-Frequency Masking |
Year | 2020 |
Conference | 2020 International Joint Conference on Neural Networks (IJCNN) |
Publisher | IEEE |
Accepted author manuscript | License: All rights reserved; File access level: Anyone |
Publication dates | |
Online | 28 Sep 2020 |
Publication process dates | |
Accepted | 20 Mar 2020 |
Deposited | 12 Jan 2021 |
ISSN | 2161-4407 |
Book title | 2020 International Joint Conference on Neural Networks (IJCNN) |
ISBN | 978-1-7281-6926-2 |
Digital Object Identifier (DOI) | https://doi.org/10.1109/IJCNN48605.2020.9206623 |
Copyright holder | © 2020 IEEE |
Copyright information | Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. |
https://repository.uel.ac.uk/item/88x24
Download files
Accepted author manuscript
Mapping vs Masking_Final Version-1.pdf
License: All rights reserved
File access level: Anyone