Smart Transcription

Conference paper


Wall, J., Glackin, C., Dugan, N. and Cannings, N. 2019. Smart Transcription. 31st European Conference on Cognitive Ergonomics. Belfast, UK 10 - 13 Sep 2019 Association for Computing Machinery (ACM). https://doi.org/10.1145/3335082.3335114
Authors: Wall, J., Glackin, C., Dugan, N. and Cannings, N.
Type: Conference paper
Abstract

The Intelligent Voice SmartTranscript is an interactive HTML5 document that contains the audio, a speech transcription and the key topics from an audio recording. It is designed to enable quick and efficient review of audio communications by encapsulating the recording, the speech transcript and the topics within a single HTML5 file. This paper outlines the rationale for the design of the SmartTranscript user experience. It discusses the difficulties of audio review, the large potential for misinterpretation when transcripts are reviewed in isolation, and how additional diarization and topic tagging components augment the audio review process.

Keywords: User interface design; speech recognition; speech and audio search; topic modelling
Year: 2019
Conference: 31st European Conference on Cognitive Ergonomics
Publisher: Association for Computing Machinery (ACM)
Publication dates
Print: 10 Sep 2019
Publication process dates
Deposited: 27 Jan 2020
Journal citation: pp. 134-137
Book title: ECCE 2019: Proceedings of the 31st European Conference on Cognitive Ergonomics
Book editors: Mulvenna, M. and Bond, R.
ISBN: 978-1-4503-7166-7
Digital Object Identifier (DOI): https://doi.org/10.1145/3335082.3335114
Web address (URL) of conference proceedings: https://doi.org/10.1145/3335082.3335114
Copyright holder: © ACM 2019
Copyright information: This is the author's version of the work. It is posted here for your personal use. Not for redistribution. The definitive Version of Record was published in ECCE 2019: Proceedings of the 31st European Conference on Cognitive Ergonomics, https://doi.org/10.1145/3335082.3335114.
Permalink: https://repository.uel.ac.uk/item/87522

Download files


Accepted author manuscript
ecce2019_32_preprint_nopub.pdf
License: All rights reserved
File access level: Anyone
