Which phoneme-to-viseme maps best improve visual-only computer lip-reading?
Book chapter
Bear, Y., Harvey, Richard W., Theobald, Barry-John and Lan, Yuxuan 2014. Which phoneme-to-viseme maps best improve visual-only computer lip-reading? In: Bebis, George, Boyle, Richard, Parvin, Bahram, Koracin, Darko, McMahan, Ryan, Jerald, Jason, Zhang, Hui, Drucker, Steven M., Kambhamettu, Chandra, Choubassi, Maha El, Deng, Zhigang and Carlson, Mark (eds.) Advances in Visual Computing: 10th International Symposium, ISVC 2014, Las Vegas, NV, USA, December 8-10, 2014, Proceedings, Part II. Springer International Publishing.
Authors | Bear, Y., Harvey, Richard W., Theobald, Barry-John and Lan, Yuxuan |
---|---|
Editors | Bebis, George, Boyle, Richard, Parvin, Bahram, Koracin, Darko, McMahan, Ryan, Jerald, Jason, Zhang, Hui, Drucker, Steven M., Kambhamettu, Chandra, Choubassi, Maha El, Deng, Zhigang and Carlson, Mark |
Abstract | A critical assumption of all current visual speech recognition systems is that there are visual speech units called visemes which can be mapped to units of acoustic speech, the phonemes. Despite there being a number of published maps it is infrequent to see the effectiveness of these tested, particularly on visual-only lip-reading (many works use audio-visual speech). Here we examine 120 mappings and consider if any are stable across talkers. We show a method for devising maps based on phoneme confusions from an automated lip-reading system, and we present new mappings that show improvements for individual talkers. |
Book title | Advances in Visual Computing: 10th International Symposium, ISVC 2014, Las Vegas, NV, USA, December 8-10, 2014, Proceedings, Part II |
Year | 2014 |
Publisher | Springer International Publishing |
Publication dates | Dec 2014 |
Publication process dates | Deposited 28 Feb 2017 |
Event | 10th International Symposium, ISVC 2014 |
ISBN | 978-3-319-14363-7, 978-3-319-14364-4 |
ISSN | 0302-9743 |
Digital Object Identifier (DOI) | https://doi.org/10.1007/978-3-319-14364-4_22 |
Web address (URL) | http://doi.org/10.1007/978-3-319-14364-4_22 |
Additional information | Bear, H.L., Harvey, R.W., Theobald, B.J. and Lan, Y., 2014, December. Which phoneme-to-viseme maps best improve visual-only computer lip-reading?. In International Symposium on Visual Computing (pp. 230-239). Springer International Publishing. |
Journal | Lecture Notes in Computer Science |
Journal citation | 8888, pp. 230-239 |
Accepted author manuscript | License CC BY-NC-ND |
https://repository.uel.ac.uk/item/85837