SPARSE-ML | Cascade Processes for Sparse Machine Learning

Summary
Deep learning continues to achieve impressive breakthroughs across disciplines and is a major driving force behind a multitude of industry innovations. Most of its successes are achieved by increasingly large neural networks that are trained on massive data sets. Their development incurs costs that only a few labs can afford, which prevents global participation in the creation of related technologies. The huge model sizes also pose computational challenges for algorithms that address properties critical in real-world applications, such as fairness, adversarial robustness, and interpretability. The high demand of neural networks for vast amounts of data further limits their utility for solving highly relevant tasks in biomedicine, economics, or the natural sciences.
To democratize deep learning and to broaden its applicability, we have to find ways to learn small-scale models. To this end, we will promote sparsity at multiple stages of the machine learning pipeline and identify models that are scalable, resource- and data-efficient, and robust to noise, and that provide insights into problems. To achieve this, we need to overcome two challenges: the identification of trainable sparse network structures and the de novo optimization of small-scale models.
The solutions that we propose combine ideas from statistical physics, complex network science, and machine learning. Our fundamental innovations rely on the insight that neural networks belong to a class of cascade models that we made analytically tractable on random graphs. Advancing our derivations will enable us to develop novel parameter initialization, regularization, and reparameterization methods that compensate for the missing implicit benefits of overparameterization for learning. The significant reduction in model size achieved by our methods will help unlock the full potential of deep learning to serve society as a whole.
More information & hyperlinks
Web resources: https://cordis.europa.eu/project/id/101116395
Start date: 01-12-2023
End date: 30-11-2028
Total budget: 1 499 285,00 Euro
Public funding: 1 499 285,00 Euro
Cordis data


Status

SIGNED

Call topic

ERC-2023-STG

Update Date

12-03-2024