Summary
Massive Open Online Courses (MOOCs) have been growing rapidly in size and impact. Yet the language barrier remains a major impediment to reaching all peoples and educating all citizens. TraMOOC aims to tackle this impediment by developing high-quality translation of all text genres found in MOOCs (e.g. assignments, tests, presentations, lecture subtitles, blog text) from English into eleven European and BRIC languages (DE, IT, PT, EL, DU, CS, BG, CR, PL, RU, ZH) that constitute strong use cases, are hard to translate into and have weak MT support, thus complying with the call objectives. Phrase-based and syntax-based statistical machine translation models will be developed to address language diversity and to support the language-independent nature of the methodology. To achieve high-quality automatic translation and add value to existing infrastructure, extensive advanced bootstrapping of new resources will be performed. An innovative multi-modal automatic and human evaluation schema will further ensure translation quality. For human evaluation, an innovative, time- and cost-efficient crowdsourcing setup with strict access control will be used. Translation experts, domain experts and end users will also be involved. Separate task mining applications will be employed for implicit translation evaluation: (i) topic detection will be applied to source and translated texts and the resulting entity lists will be compared, yielding further qualitative and quantitative translation evaluation results; (ii) sentiment analysis of MOOC users’ blog posts will reveal end-user opinion of translation quality. The results will be combined into a feedback vector and used to refine parallel data and retrain translation models towards a more accurate second-phase translation output. The project results will be showcased and tested on the Iversity MOOC platform and on the VideoLectures.NET digital video lecture library.
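The entity-list comparison mentioned under implicit evaluation (i) can be illustrated with a minimal sketch. It is not the project's actual method: it assumes, hypothetically, that entities detected in the source and in the translation have already been mapped to shared, language-independent identifiers, and it uses Jaccard overlap as one plausible comparison measure. The function name `entity_overlap` is invented for this illustration.

```python
def entity_overlap(source_entities, translated_entities):
    """Jaccard overlap between two entity collections.

    Assumption (not from the project description): both inputs hold
    language-independent entity identifiers, so they are directly comparable.
    """
    src = {e.casefold() for e in source_entities}
    tgt = {e.casefold() for e in translated_entities}
    if not src and not tgt:
        # Two empty lists are treated as identical.
        return 1.0
    return len(src & tgt) / len(src | tgt)


# Hypothetical identifiers for entities found in a source text and its translation.
score = entity_overlap(["e1", "e2", "e3"], ["e2", "e3", "e4"])
print(score)  # 0.5
```

A low score would suggest that the translation lost or altered topical entities. How such a score would enter the feedback vector is not specified in the description.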
More information & hyperlinks
Web resources: | https://cordis.europa.eu/project/id/644333 |
Start date: | 01-02-2015 |
End date: | 31-01-2018 |
Total budget - Public funding: | 3 223 835,00 Euro - 3 081 147,00 Euro |
Cordis data
Status: | CLOSED |
Call topic: | ICT-17-2014 |
Update Date: | 27-10-2022 |
Structured mapping
H2020-EU.2.1.1. INDUSTRIAL LEADERSHIP - Leadership in enabling and industrial technologies - Information and Communication Technologies (ICT)