Which phoneme-to-viseme maps best improve visual-only computer lip-reading?

Book chapter

Bear, Y., Harvey, Richard W., Theobald, Barry-John and Lan, Yuxuan 2014. Which phoneme-to-viseme maps best improve visual-only computer lip-reading? in: Bebis, George, Boyle, Richard, Parvin, Bahram, Koracin, Darko, McMahan, Ryan, Jerald, Jason, Zhang, Hui, Drucker, Steven M., Kambhamettu, Chandra, Choubassi, Maha El, Deng, Zhigang and Carlson, Mark (eds.) Advances in Visual Computing: 10th International Symposium, ISVC 2014, Las Vegas, NV, USA, December 8-10, 2014, Proceedings, Part II Springer International Publishing.

Publication dates
Authors	Bear, Y., Harvey, Richard W., Theobald, Barry-John and Lan, Yuxuan
Editors	Bebis, George, Boyle, Richard, Parvin, Bahram, Koracin, Darko, McMahan, Ryan, Jerald, Jason, Zhang, Hui, Drucker, Steven M., Kambhamettu, Chandra, Choubassi, Maha El, Deng, Zhigang and Carlson, Mark
Abstract	A critical assumption of all current visual speech recognition systems is that there are visual speech units called visemes which can be mapped to units of acoustic speech, the phonemes. Despite there being a number of published maps it is infrequent to see the effectiveness of these tested, particularly on visual-only lip-reading (many works use audio-visual speech). Here we examine 120 mappings and consider if any are stable across talkers. We show a method for devising maps based on phoneme confusions from an automated lip-reading system, and we present new mappings that show improvements for individual talkers.
Book title	Advances in Visual Computing: 10th International Symposium, ISVC 2014, Las Vegas, NV, USA, December 8-10, 2014, Proceedings, Part II
Year	2014
Publisher	Springer International Publishing
Print	Dec 2014
Publication process dates
Deposited	28 Feb 2017
Event	10th International Symposium, ISVC 2014
ISBN	978-3-319-14363-7
	978-3-319-14364-4
ISSN	0302-9743
Digital Object Identifier (DOI)	https://doi.org/10.1007/978-3-319-14364-4_22
Web address (URL)	http://doi.org/10.1007/978-3-319-14364-4_22
Additional information	Bear, H.L., Harvey, R.W., Theobald, B.J. and Lan, Y., 2014, December. Which phoneme-to-viseme maps best improve visual-only computer lip-reading?. In International Symposium on Visual Computing (pp. 230-239). Springer International Publishing.
Journal	Lecture Notes in Computer Science
Journal citation	8888, pp. 230-239
Accepted author manuscript	Which phoneme-to-viseme.pdf License CC BY-NC-ND

Permalink -

https://repository.uel.ac.uk/item/85837

Related outputs

Resolution limits on visual speech recognition

Bear, Y., Harvey, Richard, Theobald, Barry-John and Lan, Yuxuan 2014. Resolution limits on visual speech recognition. in: IEEE International Conference on Image Processing (ICIP) IEEE.

Some observations on computer lip-reading: moving from the dream to the reality

Bear, Y., Owen, Gari, Harvey, Richard and Theobald, Barry-John 2014. Some observations on computer lip-reading: moving from the dream to the reality. Proceedings of SPIE. 9253. https://doi.org/10.1117/12.2067464

Decoding visemes: Improving machine lip-reading

Bear, Y. and Harvey, Richard 2016. Decoding visemes: Improving machine lip-reading. in: 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) IEEE.

Finding phonemes: improving machine lip-reading

Bear, Y., Harvey, Richard W. and Lan, Yuxuan 2015. Finding phonemes: improving machine lip-reading. FAAVSP - The 1st Joint Conference on Facial Analysis, Animation and Auditory-Visual Speech Processing. Education Centre of the Jesuits, Vienna, Austria 11 - 13 Sep 2015 International Speech Communication Association. pp. 115-120

Speaker-independent machine lip-reading with speaker-dependent viseme classifiers

Bear, Y., Cox, Stephen J. and Harvey, Richard W. 2015. Speaker-independent machine lip-reading with speaker-dependent viseme classifiers. FAAVSP - The 1st Joint Conference on Facial Analysis, Animation, and Auditory-Visual Speech Processing. Education Centre of the Jesuits, Vienna, Austria 11 - 13 Sep 2015 International Speech Communication Association. pp. 190-195

Phoneme-to-viseme mappings: the good, the bad, and the ugly

Comparing phonemes and visemes with DNN-based lipreading

Visual speech recognition: aligning terminologies for better understanding

Visual gesture variability between talkers in continuous speech