TalkingHeads TalkingHeads: Audiovisual Speech Recognition in-the-wild

Summary

Audio-visual speech recognition refers to the problem of recognizing speech using both audio and video information. Speech is not a purely auditory process but the way that the listener perceives it is also through the recognition of the visual patterns associated with the mouth movement. This correlation of the audio-visual information has been occasionally explored in literature in order to develop more robust automatic speech recognition systems for cases in which the auditory environment is noisy (e.g. background noise, multiple speakers). However, the problem of audio-visual speech recognition has been mainly studied in controlled, laboratory conditions. TalkingHeads proposes, for the first time, the problem of audio-visual speech recognition in unconstrained (in-the-wild) videos collected from real-world multimedia databases and a set of methodologies that will work well under the assumed in-the-wild setting.

TalkingHeads brings together a talented but experienced researcher (ER) with expertise in speech analysis (diarization and recognition) and the Supervisor with large research experience in Computer Vision for face analysis in-the-wild (recognition, detection, alignment and tracking, and facial expression analysis). TalkingHeads will establish the ER as an independent and internationally recognized researcher in the area of audio-visual fusion and speech recognition. Through TalkingHeads’ achievable work plan, the ER will attain a high level of research maturity by (a) complementing his expertise on speech analysis through extensive training in Computer Vision, (b) conducting research on a challenging research problem (audio-visual speech recognition in-the-wild) with significant career opportunities in both the academia and the industry, (c) publishing at high impact factor conferences and journals, (d) establishing a network of research collaborators, and (e) enhancing personal skills (e.g. supervisory experience, leadership and management skills).

Resources

Show all and search (8)

Unfold all

Fold all

More information & hyperlinks

Web resources:	https://cordis.europa.eu/project/id/706668
Start date:	01-06-2016
End date:	31-05-2018
Total budget - Public funding:	183 454,80 Euro - 183 454,00 Euro