FACTORY | New paradigms for latent factor estimation

Summary
Data is often available in matrix form, with each column holding one sample, and processing such data often entails finding an approximate factorisation of the matrix into two factors. The first factor captures recurring patterns characteristic of the data. The second factor describes the proportions in which each sample is made up of these patterns. Latent factor estimation (LFE) is the problem of finding such a factorisation, usually under given constraints. LFE appears under other domain-specific names such as dictionary learning, low-rank approximation, factor analysis or latent semantic analysis. It is used for tasks such as dimensionality reduction, unmixing, soft clustering, coding or matrix completion in very diverse fields.
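
To make the two-factor picture concrete, here is a minimal sketch of one common form of LFE, nonnegative matrix factorisation with the classical multiplicative updates for squared error. It is an illustration only, not the project's method: the function name factorise, the rank, the iteration count and the synthetic data are all assumptions chosen for the example.

```python
import numpy as np

def factorise(V, rank, n_iter=200, seed=0, eps=1e-12):
    """Approximate a nonnegative matrix V (features x samples) as W @ H.

    Hypothetical helper for illustration. W holds the recurring patterns
    (one per column); H holds the proportions in which each sample
    (column of V) is made up of those patterns.
    """
    rng = np.random.default_rng(seed)
    n_features, n_samples = V.shape
    W = rng.random((n_features, rank)) + eps
    H = rng.random((rank, n_samples)) + eps
    for _ in range(n_iter):
        # Multiplicative updates for the squared Euclidean error;
        # they keep W and H nonnegative by construction.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Synthetic data (an assumption): 50 samples built from 3 hidden patterns.
rng = np.random.default_rng(1)
V = rng.random((20, 3)) @ rng.random((3, 50))

W, H = factorise(V, rank=3)
rel_error = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
print(f"relative reconstruction error: {rel_error:.4f}")
```

Here the nonnegativity of W and H plays the role of the "given constraints"; other LFE variants swap in sparsity, orthogonality or probabilistic priors instead.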

In this project, I propose to explore three new paradigms that push the frontiers of traditional LFE. First, I want to move beyond the ubiquitous Gaussian assumption, a practical choice that too rarely complies with the nature and geometry of the data. Estimation in non-Gaussian models is more difficult, but recent work in audio and text processing has shown that it pays off in practice. Second, in traditional settings the data matrix is often a collection of features computed from raw data. These features are computed with generic off-the-shelf transforms that only loosely preprocess the data, which limits performance. I propose a new paradigm in which an optimal low-rank-inducing transform is learnt together with the factors in a single step. Third, I show that the dominant deterministic approach to LFE should be reconsidered, and I propose a novel statistical estimation paradigm, based on the marginal likelihood, with enhanced capabilities. The new methodology is applied to real-world problems with societal impact in audio signal processing (speech enhancement, music remastering), remote sensing (Earth observation, cosmic object discovery) and data mining (multimodal information retrieval, user recommendation).
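
As an illustration of what leaving the Gaussian assumption can look like, the sketch below fits count data under a Poisson model, whose maximum-likelihood fit amounts to minimising the generalised Kullback-Leibler divergence instead of the squared error. This is a well-known textbook example and is not claimed to be the model developed in the project; the names kl_divergence and kl_nmf, the rank and the synthetic counts are assumptions made for the example.

```python
import numpy as np

def kl_divergence(V, V_hat, eps=1e-12):
    """Generalised Kullback-Leibler divergence D(V || V_hat), summed over entries."""
    return float(np.sum(V * np.log((V + eps) / (V_hat + eps)) - V + V_hat))

def kl_nmf(V, rank, n_iter=200, seed=0, eps=1e-12):
    """Hypothetical helper: nonnegative factorisation V ~ W @ H under KL divergence."""
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], rank)) + 0.1
    H = rng.random((rank, V.shape[1])) + 0.1
    for _ in range(n_iter):
        # Multiplicative updates for the KL divergence (Poisson likelihood).
        H *= (W.T @ (V / (W @ H + eps))) / (W.sum(axis=0)[:, None] + eps)
        W *= ((V / (W @ H + eps)) @ H.T) / (H.sum(axis=1)[None, :] + eps)
    return W, H

# Synthetic count data (an assumption): Poisson draws around a rank-4 rate matrix.
rng = np.random.default_rng(1)
rates = (5.0 * rng.random((30, 4))) @ rng.random((4, 60))
V = rng.poisson(rates)

W0, H0 = kl_nmf(V, rank=4, n_iter=0)   # random initial factors
W, H = kl_nmf(V, rank=4)               # same initialisation, 200 updates
d_init = kl_divergence(V, W0 @ H0)
d_final = kl_divergence(V, W @ H)
print(f"KL divergence: {d_init:.1f} -> {d_final:.1f}")
```

The only change from a squared-error fit is the cost function and its update rules, which is why the choice of noise model can be treated as a design decision matched to the data (counts, spectrogram magnitudes, and so on).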
More information & hyperlinks
Web resources: https://cordis.europa.eu/project/id/681839
Start date: 01-09-2016
End date: 31-08-2022
Total budget: 1 931 776,25 Euro; public funding: 1 931 776,00 Euro
Cordis data

Status

CLOSED

Call topic

ERC-CoG-2015

Update Date

27-04-2024
Structured mapping
Horizon 2020
H2020-EU.1. EXCELLENT SCIENCE
H2020-EU.1.1. EXCELLENT SCIENCE - European Research Council (ERC)
ERC-2015
ERC-2015-CoG
ERC-CoG-2015 ERC Consolidator Grant