Static Sign Language Recognition Using Segmented Images and HOG on Cluttered Backgrounds
Book chapter
Sadeghzadeh, A., Islam, B. and Ahad, M. A. R. 2024. Static Sign Language Recognition Using Segmented Images and HOG on Cluttered Backgrounds. In: Ahad, M. A. R., Inoue, S., Lopez, G. and Hossain, T. (ed.) Human Activity and Behavior Analysis: Advances in Computer Vision and Sensors: Volume 2. Boca Raton, Florida: CRC Press, Taylor & Francis Group. pp. 23-45.
Authors | Sadeghzadeh, A., Islam, B. and Ahad, M. A. R.
Editors | Ahad, M. A. R., Inoue, S., Lopez, G. and Hossain, T.
Abstract | Sign language (SL) is of great importance to the hearing-impaired and deaf community as their primary means of communication. The large variation among the SLs used around the world creates an inevitable need for automatic SL interpretation systems to reduce the communication barrier between the deaf and the general public. Despite numerous innovative studies in this domain, providing an efficient, highly accurate system for real-world applications remains challenging, especially in the presence of complex backgrounds, low inter-class and large intra-class variations, and changing illumination conditions. To address these issues, a novel Convolutional Neural Network (CNN)-based static sign language recognition (SLR) system is proposed that draws maximum benefit from segmented hand images and Histogram of Oriented Gradients (HOG) handcrafted features. To this end, a U-Net architecture is trained on a small-scale annotated SL dataset for hand segmentation and then successfully applied to other, non-annotated datasets to mitigate the detrimental effects of complex backgrounds. The robustness of the system against environmental and user-dependent variations is further improved by exploiting HOG handcrafted features extracted from the segmented images in the form of 2D images. These generated images are fed into our proposed CNN model, whose number of layers and filters, kernel sizes, activation functions, optimization method, learning rate, and regularization techniques are selected so that accuracy is maximized. Extensive experiments on three American Sign Language (ASL) datasets with variations in background and lighting, i.e., MU HandImages ASL (Massey), NUS-II, and Static Hand Gesture ASL, yielding accuracies of 99.71%, 99.50%, and 100%, respectively, demonstrate the robustness, superiority, and high capability of our proposed system over existing approaches.
Book title | Human Activity and Behavior Analysis: Advances in Computer Vision and Sensors: Volume 2
Page range | 23-45 |
Year | 2024 |
Publisher | CRC Press: Taylor & Francis Group |
File access level | Repository staff only
Publication dates | Online: 29 Apr 2024
Publication process dates | Deposited: 15 Aug 2024
Place of publication | Boca Raton, Florida |
Series | Ubiquitous Computing, Healthcare and Well-being |
ISBN | 9781032636054; 9781032598765
Digital Object Identifier (DOI) | https://doi.org/10.1201/9781032636054-3
Web address (URL) | https://www.routledge.com/9781032598765
Repository URL | https://repository.uel.ac.uk/item/8y080
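
For orientation, the pipeline described in the abstract (U-Net hand segmentation, HOG features rendered as 2D images, CNN classification) can be sketched in code. The following is a minimal illustrative sketch only, not the chapter's implementation: the HOG parameters, layer counts, kernel sizes, and training settings are assumptions, and random arrays stand in for the U-Net-segmented hand images.

```python
# Minimal sketch (assumptions throughout) of the abstract's pipeline:
# segmented hand image -> HOG rendered as a 2D image -> small CNN classifier.
# The chapter's tuned architecture and hyperparameters are not reproduced here.
import numpy as np
from skimage.feature import hog
from tensorflow.keras import layers, models

def hog_as_image(segmented_hand: np.ndarray) -> np.ndarray:
    """Render HOG features of a grayscale segmented hand as a 2D image."""
    _, hog_image = hog(
        segmented_hand,
        orientations=9,            # assumed value; the chapter's may differ
        pixels_per_cell=(8, 8),
        cells_per_block=(2, 2),
        visualize=True,            # also return the HOG visualization image
    )
    return hog_image

def build_cnn(input_shape=(64, 64, 1), num_classes=36):
    """Small illustrative CNN; not the chapter's tuned model."""
    model = models.Sequential([
        layers.Input(shape=input_shape),
        layers.Conv2D(32, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Conv2D(64, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Flatten(),
        layers.Dropout(0.5),       # regularization, as the abstract mentions
        layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

# Hypothetical stand-ins for U-Net-segmented hand images and their labels.
hands = np.random.rand(8, 64, 64)
labels = np.random.randint(0, 36, size=8)
X = np.stack([hog_as_image(h) for h in hands])[..., np.newaxis]
model = build_cnn(input_shape=X.shape[1:], num_classes=36)
model.fit(X, labels, epochs=1, verbose=0)
```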