Robust Deepfake Speech Algorithm Recognition: Classifying Generative Algorithms via Speaker X-Vectors and Deep Learning
Conference paper
Maltby, H., Wall, J., Glackin, C., Moniri, M., Shrestha, R., Cannings, N. and Salami, I. 2025. Robust Deepfake Speech Algorithm Recognition: Classifying Generative Algorithms via Speaker X-Vectors and Deep Learning. IEEE International Joint Conference on Neural Networks (IJCNN). Rome, Italy 30 Jun - 05 Jul 2025. IEEE.
Authors | Maltby, H., Wall, J., Glackin, C., Moniri, M., Shrestha, R., Cannings, N. and Salami, I. |
---|---|
Type | Conference paper |
Abstract | The rapid advancement of deepfake voice technologies has resulted in alarming cases of impersonation and deception, highlighting the urgent need for robust tools that can not only distinguish real audio from fake but also recognise the generative algorithms responsible. This capability is essential for forensic investigations, legal proceedings, and regulatory enforcement. Without robust and explainable detection frameworks, legal professionals and investigators lack the tools needed to effectively monitor, investigate, and prosecute cases involving deepfake misuse. In this work, we take a voice biometrics approach, shifting the focus from identifying who is speaking to identifying which algorithm is speaking. Doing so allows our approach to inherently handle unseen classes while achieving competitive performance for deepfake speech algorithm recognition. Our system leverages a voice-focused ResNet101-based x-vector extraction model and combines diverse audio features with our experimental novel feature, LFCC-HF, enhanced with Linear Discriminant Analysis and cosine similarity clustering. This approach allows for a more transparent and interpretable decision-making process by using a single voice similarity decision boundary, in contrast to the ensemble-based methods commonly used in the literature. Unlike previous works that rely on an ensemble of models, which complicates the decision-making process, our method achieves comparable results while using a significantly lighter-weight architecture, with our model having 14.84 M parameters compared to 95 M and 317 M parameters for Wav2Vec2 base and large, respectively.
Furthermore, we demonstrate the benefits of targeted data augmentation, which, combined with feature fusion and our novel feature, improves system robustness and adaptability, increasing our F1 score from 0.624 to 0.763, a 22.275% increase over our best single feature, and a 40.775% increase over the best ADD 2023 Track 3 baseline. Importantly, the system achieves interpretability through its back-end classification process, where decisions are based on a transparent, learned threshold for voice similarity to known voiceprints. This work offers a foundation for advancing more robust and interpretable solutions in the field of deepfake speech detection. |
Year | 2025 |
Conference | IEEE International Joint Conference on Neural Networks (IJCNN) |
Publisher | IEEE |
Accepted author manuscript | License: All rights reserved; File access level: Anyone |
Publication dates | |
Online | 30 Jun 2025 |
Publication process dates | |
Submitted | 30 Jan 2025 |
Accepted | 01 Apr 2025 |
Deposited | 12 Jun 2025 |
Journal citation | p. In press |
ISSN | 2161-4407, 2161-4393 |
Web address (URL) of conference proceedings | https://ieeexplore.ieee.org/xpl/conhome/1000500/all-proceedings |
Copyright holder | © 2025 IEEE |
Additional information | Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. |
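
The abstract describes a back end that compares an utterance embedding against known voiceprints and applies a single learned similarity threshold, which is what lets it reject unseen generative algorithms. The following is a minimal sketch of that idea only, not the authors' implementation: the x-vector extractor, feature fusion, and Linear Discriminant Analysis stages are omitted, the function names (`enroll`, `classify`), the toy 3-dimensional embeddings, the labels `algo_A`/`algo_B`, and the threshold value `0.7` are all hypothetical, and averaging embeddings into one voiceprint per algorithm is an assumption made for illustration.

```python
import numpy as np


def l2_normalize(x):
    """Scale vectors to unit length along the last axis."""
    x = np.asarray(x, dtype=float)
    return x / np.linalg.norm(x, axis=-1, keepdims=True)


def enroll(embeddings, labels):
    """Build one unit-length voiceprint per algorithm label.

    Assumption: a voiceprint is the mean of that label's normalized embeddings.
    """
    embeddings = np.asarray(embeddings, dtype=float)
    labels = np.asarray(labels)
    names = sorted(set(labels.tolist()))
    prints = np.stack(
        [l2_normalize(embeddings[labels == name]).mean(axis=0) for name in names]
    )
    return names, l2_normalize(prints)


def classify(embedding, names, voiceprints, threshold):
    """Return (label, score); label is "unknown" if no voiceprint is similar enough."""
    # Dot product of unit vectors equals cosine similarity.
    scores = voiceprints @ l2_normalize(embedding)
    best = int(np.argmax(scores))
    if scores[best] < threshold:
        return "unknown", float(scores[best])
    return names[best], float(scores[best])


# Toy enrollment data (hypothetical values, stand-ins for real x-vectors).
train_embeddings = np.array(
    [[1.0, 0.1, 0.0], [0.9, 0.0, 0.1], [0.0, 1.0, 0.1], [0.1, 0.9, 0.0]]
)
train_labels = ["algo_A", "algo_A", "algo_B", "algo_B"]
names, voiceprints = enroll(train_embeddings, train_labels)

THRESHOLD = 0.7  # hypothetical; in practice this would be tuned on held-out data
seen_result = classify(np.array([0.95, 0.05, 0.0]), names, voiceprints, THRESHOLD)
unseen_result = classify(np.array([0.0, 0.0, 1.0]), names, voiceprints, THRESHOLD)
print(seen_result, unseen_result)
```

Because every decision reduces to one cosine score compared against one threshold, the reason for a given label (or for an "unknown" rejection) can be read directly from the score, which is the interpretability property the abstract emphasises.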
https://repository.uel.ac.uk/item/8zq1v
Download files
Accepted author manuscript
ijcnn_latex_template_2025_final_camera_ready.pdf |
License: All rights reserved |
File access level: Anyone |