LEXICAL | Lexical Acquisition Across Languages

Summary
Due to the growing volume of textual information available in multiple languages, there is great demand for Natural Language Processing (NLP) techniques that can automatically process and manage multi-lingual texts, supporting information access and communication in core areas of society (e.g. healthcare, business, science). Many NLP tasks and applications rely on task-specific lexicons (e.g. dictionaries, word classifications) for optimal performance. Recently, automatic acquisition of lexicons from relevant texts has proved a promising, cost-effective alternative to manual lexicography. It has the potential to considerably enhance the viability and portability of NLP technology both within and across languages. However, this approach has been explored only for a very small number of resource-rich languages, leaving the vast majority of the world's languages without useful technology. The ambitious goal of this project is to take research in lexical acquisition to the level where it can support multi-lingual NLP, including for languages for which no parallel language resources (e.g. corpora, knowledge resources) are available. Building on an emerging line of research that uses mainly naturally occurring supervision (connections between languages) to guide cross-lingual NLP, we will develop a radically novel approach to lexical acquisition. This approach will transfer lexical knowledge from one language to another and will also learn it simultaneously for a diverse set of languages, using new methodology based on guiding joint learning and inference with rich knowledge about cross-lingual connections. We aim not only to create next-generation lexical acquisition technology but also to take cross-lingual NLP a big step toward the point where it no longer depends on parallel resources. We will use our approach to support fundamental tasks and applications aimed at broadening the global reach of NLP to areas where it is now critically needed.
More information & hyperlinks
Web resources: https://cordis.europa.eu/project/id/648909
Start date: 01-09-2015
End date: 31-08-2021
Total budget: 1 989 203,00 Euro
Public funding: 1 989 203,00 Euro
Cordis data

Status

CLOSED

Call topic

ERC-CoG-2014

Update Date

27-04-2024
Structured mapping
Horizon 2020
H2020-EU.1. EXCELLENT SCIENCE
H2020-EU.1.1. EXCELLENT SCIENCE - European Research Council (ERC)
ERC-2014
ERC-2014-CoG
ERC-CoG-2014 ERC Consolidator Grant