SEBAMAT | Semantics-Based Machine Translation

Summary
Most current machine translation systems are either rule-based or corpus-based. They typically take the semantics of a text into account only insofar as it is implicit in the underlying text corpora or dictionaries. This is also true of recent neural machine translation systems, which, compared with standard phrase-based systems, tend to focus even more on fluency than on adequacy. However, it has been pointed out that machine translation quality is unlikely to reach the next level as long as systems do not make better use of semantic knowledge. For example, according to Kevin Knight, future machine translation systems should use information of the type "who is doing what to whom and when", i.e. they should identify the semantic roles of the items occurring in a sentence. To move forward in this direction, we propose to implement and evaluate three different approaches. The first approach is based on state-of-the-art machine translation but considers word senses rather than words. That is, a word sense disambiguation system is used to identify the word senses in large parallel text corpora. Then, by analogy with standard word alignment, the word senses are aligned across languages, and the resulting multilingual sense dictionaries are used in conjunction with the word sense disambiguation systems to translate new texts. Our second approach uses semantic role labeling to identify the semantic roles of the words in a sentence. The roles are aligned across languages, and this information is then used to improve the translation process. The third approach is based on an algorithm that computes the semantic similarity between phrases. It treats the translation task as one of finding semantically similar phrases across languages.
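The sense-alignment step of the first approach can be illustrated with a minimal sketch. It is not the project's method: it assumes the parallel corpus has already been sense-tagged, uses invented sense labels, and swaps in a simple Dice co-occurrence score for whatever alignment model the project actually uses. The function name `align_senses` and the toy corpus are hypothetical.

```python
from collections import Counter
from itertools import product


def align_senses(parallel_senses):
    """Map each source sense to its best-scoring target sense.

    parallel_senses: list of (source_senses, target_senses) pairs, one per
    aligned sentence pair; each element is a list of sense labels.
    Scoring uses the Dice coefficient over sentence-level co-occurrence,
    an assumption made for this sketch only.
    """
    src_count, tgt_count, pair_count = Counter(), Counter(), Counter()
    for src, tgt in parallel_senses:
        src_set, tgt_set = set(src), set(tgt)
        src_count.update(src_set)
        tgt_count.update(tgt_set)
        # Count every source/target sense pair seen in the same sentence pair.
        pair_count.update(product(src_set, tgt_set))

    best = {}
    for (s, t), c in pair_count.items():
        dice = 2 * c / (src_count[s] + tgt_count[t])
        if s not in best or dice > best[s][1]:
            best[s] = (t, dice)
    return best


# Hypothetical sense-tagged English-German sentence pairs.
corpus = [
    (["bank%finance", "money%1"], ["Bank%Geldinstitut", "Geld%1"]),
    (["bank%river", "water%1"], ["Ufer%1", "Wasser%1"]),
    (["bank%finance", "water%1"], ["Bank%Geldinstitut", "Wasser%1"]),
]
alignment = align_senses(corpus)
for sense, (target, score) in sorted(alignment.items()):
    print(f"{sense} -> {target} (Dice {score:.2f})")
```

In this toy corpus the two senses of "bank" map to different German senses, which is the kind of distinction a word-level alignment would blur. A realistic system would need far more data and a proper alignment model.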
More information & hyperlinks
Web resources: https://cordis.europa.eu/project/id/844951
Start date: 01-04-2020
End date: 31-03-2022
Total budget - Public funding: 165 085,44 Euro - 165 085,00 Euro
Cordis data

Status

CLOSED

Call topic

MSCA-IF-2018

Update Date

28-04-2024
Structured mapping
Horizon 2020
H2020-EU.1. EXCELLENT SCIENCE
H2020-EU.1.3. EXCELLENT SCIENCE - Marie Skłodowska-Curie Actions (MSCA)
H2020-EU.1.3.2. Nurturing excellence by means of cross-border and cross-sector mobility
H2020-MSCA-IF-2018
MSCA-IF-2018