Summary
The exponential growth of the Web is resulting in vast amounts of online content. However, the
information expressed therein is not at easy reach: what we typically browse is only an infinitesimal part
of the Web. And even if we had time to read all the Web we could not understand it, as most of it is
written in languages we do not speak. Computers, instead, have the power to process the entire Web.
But, in order to “read” it, that is, to perform machine reading, they still have to face the hard problem of
Natural Language Understanding, i.e., automatically making sense of human language. To tackle this
long-lasting challenge in Natural Language Processing (NLP), the task of semantic parsing has recently
gained popularity. This task aims to create structured representations of meaning for an input text. However,
current semantic parsers require supervision, binding them to the language of interest and hindering their
extension to multiple languages.
Here we propose a research program to investigate radically new directions for enabling multilingual
semantic parsing, without the heavy requirement of annotating training data for each new language.
The key intuitions of our proposal are to treat multilinguality as a resource rather than an obstacle and
to embrace the knowledge-based paradigm, which allows supervision in the machine learning sense to be
replaced with effective use of lexical knowledge resources. In stage 1 of the project we will acquire
a huge network of language-independent, structured semantic representations of sentences. In stage 2,
we will leverage this resource to develop innovative algorithms that perform semantic parsing in any
language. These two stages are mutually beneficial, progressively enriching less-resourced languages and
contributing towards leveling the playing field for all languages. Enabling Natural Language Understanding
across languages should have an impact on NLP and other areas of AI, as well as a societal impact on language
learners.
More information & hyperlinks
Web resources: https://cordis.europa.eu/project/id/726487
Start date: 01-06-2017
End date: 31-05-2023
Total budget - Public funding: 1 497 250,00 Euro - 1 497 250,00 Euro
Cordis data
Status: CLOSED
Call topic: ERC-2016-COG
Update Date: 27-04-2024
Geographical location(s)