Summary
Audio-visual (AV) automatic speech recognition (ASR) in unconstrained (in-the-wild) videos collected from real-world multimedia databases (outdoor conversations and interviews, TV shows with multiple speakers), using novel deep learning methodologies and architectures.
IMPORTANCE FOR...
More information & hyperlinks
Web resources: http://www.talking-heads.eu