ERMADA | Illuminating Earth’s microbial diversity and origins from metagenomes with deep learning

Summary
Microbes on our planet are estimated to outnumber the stars of the Milky Way galaxy, and their biomass exceeds that of all plants and animals. Of an estimated 10^12 microbial species, only around 10^4 have been cultured, fewer than 10^5 are represented by classified sequences, and an estimated 99% of these microorganisms remain taxonomically unknown. Metagenomic shotgun sequencing has emerged as the most prevalent way of studying and classifying microorganisms from various habitats, while genome analysis can be used to uncover the functions of genes, enzymes and metabolic pathways in a microbial community. This painstaking effort is crucial to understanding Earth’s biodiversity, as microbes play important roles in regulating the planet’s biogeochemical cycles through processes that govern nutrient circulation in both terrestrial and marine environments. In this proposal, we will employ cutting-edge bioinformatics and machine learning algorithms to analyze and elucidate Earth’s microbial diversity. We will use deep neural networks trained on large volumes of metagenomic sequences, together with big data methods, to process hundreds of terabytes of data and taxonomically classify all uncharacterized metagenomic samples by identifying their origins and habitats. Going beyond the capacities of conventional sequence similarity and comparison analyses, neural network models can capture higher-level, abstract defining features and patterns in metagenomic sequences. The aim of this study is twofold: i) to gain a deeper understanding of the composition and structure of the microbiome at different rank levels and lineages, and ii) to provide a complete record of the planet’s present microbial diversity footprint. The latter can serve as a reference dataset for future studies of microbiome evolution driven by climate change or other long-term environmental factors.
More information & hyperlinks
Web resources: https://cordis.europa.eu/project/id/838018
Start date: 01-08-2019
End date: 03-01-2025
Total budget: 247 628,16 Euro - Public funding: 247 628,00 Euro
Cordis data

Status

SIGNED

Call topic

MSCA-IF-2018

Update Date

28-04-2024
Structured mapping
Horizon 2020
H2020-EU.1. EXCELLENT SCIENCE
H2020-EU.1.3. EXCELLENT SCIENCE - Marie Skłodowska-Curie Actions (MSCA)
H2020-EU.1.3.2. Nurturing excellence by means of cross-border and cross-sector mobility
H2020-MSCA-IF-2018
MSCA-IF-2018