Summary
The aim of UTTER is to leverage large language models to build the next generation of multimodal eXtended reality (XR) technologies for transcription, translation, summarisation, and minuting. We will make these technologies scalable, adaptable, contextualised, robust, explainable, and emotion-aware. We will increase the context-sensitivity of the technologies, so that they can take into account the full history of the conversation as well as its wider context. We will introduce confidence-aware models, which can take into account their own limitations. We will develop explainable models, so that human users can understand why a model made the decisions it did. We will improve adaptation, so that domain-specific and language-specific models can be quickly rolled out. For these advances we will make use of pre-trained XR models, which optimally combine text and speech signals, and are trained efficiently with adapters and prompting. We will also develop efficient methods to deploy such large and complex models, so that they can be put into production in an energy-efficient manner. Our use-case prototypes will cover (i) a personal assistant for meetings that can improve communication in the online world, and (ii) an advanced customer service assistant to support global markets. These prototypes will be developed and tested throughout the project, with annual releases and evaluations. Through our cascaded grant programme and our release of tools to facilitate the use of pre-trained XR models, we will enable the take-up and development of these technologies throughout Europe.
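The summary states that the pre-trained models are "trained efficiently with adapters and prompting". As a minimal illustrative sketch only (this is not UTTER's code; the class name, dimensions, and variable names below are assumptions), a bottleneck adapter inserted into a frozen pre-trained model could look roughly like this in PyTorch:

import torch
import torch.nn as nn

class BottleneckAdapter(nn.Module):
    # A small trainable bottleneck (Houlsby-style adapter) added to the
    # layers of a frozen pre-trained model; only these parameters train.
    def __init__(self, hidden_dim: int = 768, bottleneck_dim: int = 64):
        super().__init__()
        self.down = nn.Linear(hidden_dim, bottleneck_dim)  # project down
        self.up = nn.Linear(bottleneck_dim, hidden_dim)    # project back up
        self.act = nn.GELU()

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        # Residual connection: at initialisation the adapter is only a
        # small perturbation, so the frozen model's behaviour is preserved.
        return h + self.up(self.act(self.down(h)))

# Usage sketch: freeze the backbone and train only the adapters.
# for p in pretrained_model.parameters():
#     p.requires_grad = False

Because only the small adapter modules are updated, the number of trainable parameters stays tiny relative to the backbone, which is what makes the quick roll-out of domain-specific and language-specific variants described above feasible.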
More information & hyperlinks
Web resources: https://cordis.europa.eu/project/id/101070631
Start date: 01-10-2022
End date: 30-09-2025
Total budget: 4 074 791,00 Euro
Public funding: 4 070 321,00 Euro
Cordis data
Status: SIGNED
Call topic: HORIZON-CL4-2021-HUMAN-01-13
Update date: 09-02-2023