Visual gesture variability between talkers in continuous speech

Conference paper

Bear, Y. 2017. Visual gesture variability between talkers in continuous speech. 28th British Machine Vision Conference. London, UK 04 - 07 Sep 2017 BMVA Press.

Publication dates
Authors	Bear, Y.
Type	Conference paper
Abstract	The recent adoption of deep learning methods in machine lipreading research gives us two options for improving system performance: either we develop end-to-end systems holistically, or we experiment to further our understanding of the visual speech signal. The latter option is more difficult, but this knowledge would enable researchers both to improve systems and to apply the new knowledge to other domains such as speech therapy. One challenge in lipreading systems is the correct labeling of the classifiers. These labels map an estimated function between visemes on the lips and the phonemes uttered. Here we ask whether such maps are speaker-dependent. Prior work investigated isolated word recognition from speaker-dependent (SD) visemes; we extend this to continuous speech. Benchmarked against SD results and the isolated-word performance, we test with RMAV dataset speakers and observe that, with continuous speech, the trajectory between visemes has a greater negative effect on the speaker differentiation.
Year	2017
Conference	28th British Machine Vision Conference
Publisher	BMVA Press
Publisher's version	Visual gesture variability between talkers in continuous speech.pdf License CC BY-ND
Print	Sep 2017
Publication process dates
Deposited	24 Aug 2017
Accepted	Jul 2017
Book title	Proceedings of British Machine Vision Conference
Web address (URL)	http://www.bmva.org/bmvc/2017/toc.html
Additional information	© 2017 The author

Permalink: https://repository.uel.ac.uk/item/84qv0

Related outputs

Bear, Y., Harvey, Richard, Theobald, Barry-John and Lan, Yuxuan 2014. Resolution limits on visual speech recognition. in: IEEE International Conference on Image Processing (ICIP) IEEE.

Bear, Y., Owen, Gari, Harvey, Richard and Theobald, Barry-John 2014. Some observations on computer lip-reading: moving from the dream to the reality. Proceedings of SPIE. 9253. https://doi.org/10.1117/12.2067464

Bear, Y., Harvey, Richard W., Theobald, Barry-John and Lan, Yuxuan 2014. Which phoneme-to-viseme maps best improve visual-only computer lip-reading? in: Bebis, George, Boyle, Richard, Parvin, Bahram, Koracin, Darko, McMahan, Ryan, Jerald, Jason, Zhang, Hui, Drucker, Steven M., Kambhamettu, Chandra, Choubassi, Maha El, Deng, Zhigang and Carlson, Mark (ed.) Advances in Visual Computing: 10th International Symposium, ISVC 2014, Las Vegas, NV, USA, December 8-10, 2014, Proceedings, Part II Springer International Publishing.

Bear, Y. and Harvey, Richard 2016. Decoding visemes: Improving machine lip-reading. in: 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) IEEE.

Bear, Y., Harvey, Richard W. and Lan, Yuxuan 2015. Finding phonemes: improving machine lip-reading. FAAVSP - The 1st Joint Conference on Facial Analysis, Animation, and Auditory-Visual Speech Processing. Education Centre of the Jesuits, Vienna, Austria 11 - 13 Sep 2015 International Speech Communication Association. pp. 115-120

Bear, Y., Cox, Stephen J. and Harvey, Richard W. 2015. Speaker-independent machine lip-reading with speaker-dependent viseme classifiers. FAAVSP - The 1st Joint Conference on Facial Analysis, Animation, and Auditory-Visual Speech Processing. Education Centre of the Jesuits, Vienna, Austria 11 - 13 Sep 2015 International Speech Communication Association. pp. 190-195