Resolution limits on visual speech recognition

Book chapter

Bear, Y., Harvey, Richard, Theobald, Barry-John and Lan, Yuxuan 2014. Resolution limits on visual speech recognition. in: IEEE International Conference on Image Processing (ICIP) IEEE.

Publication dates
Authors	Bear, Y., Harvey, Richard, Theobald, Barry-John and Lan, Yuxuan
Abstract	Visual-only speech recognition is dependent upon a number of factors that can be difficult to control, such as lighting, identity, motion, emotion and expression. But some factors, such as video resolution, are controllable, so it is surprising that there is not yet a systematic study of the effect of resolution on lip-reading. Here we use a new data set, the Rosetta Raven data, to train and test recognizers so we can measure the effect of video resolution on recognition accuracy. We conclude that, contrary to common practice, resolution need not be that great for automatic lip-reading. However, it is highly unlikely that automatic lip-reading can work reliably when the distance between the bottom of the lower lip and the top of the upper lip is less than four pixels at rest.
Keywords	Shape; Accuracy; Hidden Markov models; Visualization; Lips; Active appearance model; Face
Book title	IEEE International Conference on Image Processing (ICIP)
Year	2014
Publisher	IEEE
Print	Oct 2014
Publication process dates
Deposited	10 Mar 2017
Event	IEEE International Conference on Image Processing (ICIP) 2014
ISBN	978-1-4799-5751-4
Web address (URL)	http://ieeexplore.ieee.org/document/7025274/
Additional information	© 2014 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works for resale or redistribution to servers or lists, or reuse of any copyrighted components of this work in other works.
Accepted author manuscript	Resolution_Limits.pdf License CC BY-NC-ND

Permalink	https://repository.uel.ac.uk/item/858v3

Related outputs

Some observations on computer lip-reading: moving from the dream to the reality

Bear, Y., Owen, Gari, Harvey, Richard and Theobald, Barry-John 2014. Some observations on computer lip-reading: moving from the dream to the reality. Proceedings of SPIE. 9253. https://doi.org/10.1117/12.2067464

Which phoneme-to-viseme maps best improve visual-only computer lip-reading?

Bear, Y., Harvey, Richard W., Theobald, Barry-John and Lan, Yuxuan 2014. Which phoneme-to-viseme maps best improve visual-only computer lip-reading? in: Bebis, George, Boyle, Richard, Parvin, Bahram, Koracin, Darko, McMahan, Ryan, Jerald, Jason, Zhang, Hui, Drucker, Steven M., Kambhamettu, Chandra, Choubassi, Maha El, Deng, Zhigang and Carlson, Mark (ed.) Advances in Visual Computing: 10th International Symposium, ISVC 2014, Las Vegas, NV, USA, December 8-10, 2014, Proceedings, Part II Springer International Publishing.

Decoding visemes: Improving machine lip-reading

Bear, Y. and Harvey, Richard 2016. Decoding visemes: Improving machine lip-reading. in: 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) IEEE.

Finding phonemes: improving machine lip-reading

Bear, Y., Harvey, Richard W. and Lan, Yuxuan 2015. Finding phonemes: improving machine lip-reading. FAAVSP - The 1st Joint Conference on Facial Analysis, Animation and Auditory-Visual Speech Processing. Education Centre of the Jesuits, Vienna, Austria 11 - 13 Sep 2015 International Speech Communication Association. pp. 115-120

Speaker-independent machine lip-reading with speaker-dependent viseme classifiers

Bear, Y., Cox, Stephen J. and Harvey, Richard W. 2015. Speaker-independent machine lip-reading with speaker-dependent viseme classifiers. FAAVSP - The 1st Joint Conference on Facial Analysis, Animation, and Auditory-Visual Speech Processing. Education Centre of the Jesuits, Vienna, Austria 11 - 13 Sep 2015 International Speech Communication Association. pp. 190-195

Phoneme-to-viseme mappings: the good, the bad, and the ugly

Comparing phonemes and visemes with DNN-based lipreading

Visual speech recognition: aligning terminologies for better understanding

Visual gesture variability between talkers in continuous speech