Summary
Machine learning has rapidly evolved in the last decade, significantly improving accuracy on tasks such as image classification. Much of this success can be attributed to the re-emergence of neural nets. However, learning algorithms are still far from achieving the capabilities of human cognition. In particular, humans can rapidly organize an input stream (e.g., textual or visual) into a set of entities, and understand the complex relations between those. In this project I aim to create a general methodology for semantic interpretation of input streams. Such problems fall under the structured-prediction framework, to which I have made numerous contributions. The proposal identifies and addresses three key components required for a comprehensive and empirically effective approach to the problem.
First, we consider the holistic nature of semantic interpretations, where a top-down process chooses a coherent interpretation among the vast number of options. We argue that deep-learning architectures are ideally suited for modeling such coherence scores, and propose to develop the corresponding theory and algorithms. Second, we address the complexity of the semantic representation, where a stream is mapped into a variable number of entities, each having multiple attributes and relations to other entities. We characterize the properties a model should satisfy in order to produce such interpretations, and propose novel models that achieve this. Third, we develop a theory for understanding when such models can be learned efficiently, and how well they can generalize. To achieve this, we address key questions of non-convex optimization, inductive bias and generalization. We expect these contributions to have a dramatic impact on AI systems, from machine reading of text to image analysis. More broadly, they will help bridge the gap between machine learning as an engineering field, and the study of human cognition.
First, we consider the holistic nature of semantic interpretations, where a top-down process chooses a coherent interpretation among the vast number of options. We argue that deep-learning architectures are ideally suited for modeling such coherence scores, and propose to develop the corresponding theory and algorithms. Second, we address the complexity of the semantic representation, where a stream is mapped into a variable number of entities, each having multiple attributes and relations to other entities. We characterize the properties a model should satisfy in order to produce such interpretations, and propose novel models that achieve this. Third, we develop a theory for understanding when such models can be learned efficiently, and how well they can generalize. To achieve this, we address key questions of non-convex optimization, inductive bias and generalization. We expect these contributions to have a dramatic impact on AI systems, from machine reading of text to image analysis. More broadly, they will help bridge the gap between machine learning as an engineering field, and the study of human cognition.
Unfold all
/
Fold all
More information & hyperlinks
Web resources: | https://cordis.europa.eu/project/id/819080 |
Start date: | 01-02-2019 |
End date: | 31-01-2025 |
Total budget - Public funding: | 1 932 500,00 Euro - 1 932 500,00 Euro |
Cordis data
Original description
Machine learning has rapidly evolved in the last decade, significantly improving accuracy on tasks such as image classification. Much of this success can be attributed to the re-emergence of neural nets. However, learning algorithms are still far from achieving the capabilities of human cognition. In particular, humans can rapidly organize an input stream (e.g., textual or visual) into a set of entities, and understand the complex relations between those. In this project I aim to create a general methodology for semantic interpretation of input streams. Such problems fall under the structured-prediction framework, to which I have made numerous contributions. The proposal identifies and addresses three key components required for a comprehensive and empirically effective approach to the problem.First, we consider the holistic nature of semantic interpretations, where a top-down process chooses a coherent interpretation among the vast number of options. We argue that deep-learning architectures are ideally suited for modeling such coherence scores, and propose to develop the corresponding theory and algorithms. Second, we address the complexity of the semantic representation, where a stream is mapped into a variable number of entities, each having multiple attributes and relations to other entities. We characterize the properties a model should satisfy in order to produce such interpretations, and propose novel models that achieve this. Third, we develop a theory for understanding when such models can be learned efficiently, and how well they can generalize. To achieve this, we address key questions of non-convex optimization, inductive bias and generalization. We expect these contributions to have a dramatic impact on AI systems, from machine reading of text to image analysis. More broadly, they will help bridge the gap between machine learning as an engineering field, and the study of human cognition.
Status
SIGNEDCall topic
ERC-2018-COGUpdate Date
27-04-2024
Images
No images available.
Geographical location(s)