Deception Detection in Conversations using the Proximity of Linguistic Markers
Article
Bajaj, N., Rajwadi, M., Goodluck Constance, T., Wall, J., Moniri, M., Laird, T., Woodruff, C., Laird, J., Glackin, C. and Cannings, N. 2023. Deception Detection in Conversations using the Proximity of Linguistic Markers. Knowledge-Based Systems. 23 (Art. 110422). https://doi.org/10.1016/j.knosys.2023.110422
Authors | Bajaj, N., Rajwadi, M., Goodluck Constance, T., Wall, J., Moniri, M., Laird, T., Woodruff, C., Laird, J., Glackin, C. and Cannings, N. |
Abstract | Detecting the elements of deception in a conversation takes years of study and experience, and it is a skill set primarily used in law-enforcement agencies. As business opportunities grow, organisations employ teleoperators to provide support and services to their large customer bases, which creates a potential platform for fraud. With technological advancements, it is desirable to have an automated system that spots the deceptive elements in a conversation and provides this information to teleoperators to better support them in their interactions. We propose the Decision Engine, which detects deceptive conversations based on the proximity of the linguistic markers present, producing a deception score for a conversation and highlighting its potentially deceptive elements. In collaboration with behavioural experts, we have selected ten linguistic markers that potentially indicate deception. We have built a variety of models to detect, without ambiguity, the trigger terms for the selected linguistic markers, using either regular expressions or the BERT model. The BERT model has been trained on a conversational dataset that we collated and that was labelled by our behavioural experts. The proposed Decision Engine employs the BERT model and regular expressions to detect the linguistic markers and computes proximity features to estimate the deception score. We evaluated the proposed approach on the Columbia-SRI-Colorado (CSC) dataset and a real-world Financial Services dataset. In addition to accuracy, we also employ the True Positive Rate measured at a threshold high enough to avoid any false-positive cases, which we denote TPRF0. The Decision Engine achieves 69% accuracy and 46% TPRF0 for the CSC dataset and 72% accuracy and 60% TPRF0 for the Financial Services dataset. In contrast, a baseline model, which uses non-proximity features, achieves 67% accuracy and 32% TPRF0 for the CSC dataset and 67% accuracy and 10% TPRF0 for the Financial Services dataset. Furthermore, using the Decision Engine, the impact of the proximity of markers on the deception score has been analysed by our behavioural experts to provide insight into linguistic behaviour in relation to deception. |
Keywords | Deception; Conversational Speech; Fraud; Linguistic Markers; Proximity Features; Proximity Model; Decision Engine; BERT; Linguistic Analysis |
Journal | Knowledge-Based Systems |
Journal citation | 23 (Art. 110422) |
ISSN | 0950-7051 |
Year | 2023 |
Publisher | Elsevier |
Accepted author manuscript | License: CC BY-NC-ND 4.0; File access level: Anyone |
Digital Object Identifier (DOI) | https://doi.org/10.1016/j.knosys.2023.110422 |
Publication dates | |
Online | 03 Mar 2023 |
In print | 12 May 2023 |
Publication process dates | |
Accepted | 23 Feb 2023 |
Deposited | 06 Mar 2023 |
Funder | Innovate UK |
Copyright holder | © 2023 The Author(s) |
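For clarity, the TPRF0 metric referenced in the abstract (the True Positive Rate at a decision threshold high enough to avoid any false positives) can be sketched as below. This is a minimal illustration under that reading of the metric, not the authors' implementation; the function name, variable names, and toy scores are assumptions for the example.

```python
# Minimal sketch of TPRF0 as described in the abstract: the True Positive Rate
# measured at the lowest score threshold that still yields zero false positives.
# The function name `tpr_at_zero_fp` and the toy data below are illustrative only.
import numpy as np

def tpr_at_zero_fp(scores: np.ndarray, labels: np.ndarray) -> float:
    """Return the TPR at the smallest threshold producing no false positives.

    scores -- deception scores, higher meaning more likely deceptive
    labels -- 1 for deceptive conversations, 0 for truthful ones
    """
    pos_scores = scores[labels == 1]
    neg_scores = scores[labels == 0]
    if len(pos_scores) == 0:
        return 0.0
    # Set the threshold at the highest-scoring truthful conversation, so that
    # only scores strictly above it are flagged (zero false positives).
    threshold = neg_scores.max() if len(neg_scores) else -np.inf
    return float(np.sum(pos_scores > threshold)) / len(pos_scores)

# Toy example with six conversations: two of the three deceptive ones score
# above the highest truthful score (0.75), giving TPRF0 = 2/3.
scores = np.array([0.9, 0.8, 0.75, 0.6, 0.4, 0.2])
labels = np.array([1, 1, 0, 1, 0, 0])
print(f"TPRF0 = {tpr_at_zero_fp(scores, labels):.2f}")  # TPRF0 = 0.67
```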
https://repository.uel.ac.uk/item/8vq9z
Download files
Accepted author manuscript
Deception Detection in Conversations using the Proximity of Linguistic Markers (2023).pdf
License: CC BY-NC-ND 4.0
File access level: Anyone
174 total views
168 total downloads
3 views this month
2 downloads this month