NPSL | Building a Protein Sequence Library using Nanopores

Summary
Nanopore sequencing is a breakthrough genomic tool that reads DNA with long read length, high accuracy, and high throughput. However, it is the proteome that ultimately determines the cell’s phenotype, and tremendous efforts have been made in the past decade to develop techniques for sequencing proteins. The first breakthroughs are now appearing for sequencing short peptides at the single-molecule level. Current strategies are, however, still far from sequencing full-length native proteins because of low synthesis efficiencies in protein handling and a limited scanning length (only short peptides of ~25 amino acids can be read). Here, I will first develop a novel method to synthesize a large variety of proteins, each connected to DNA carrying its own codon sequence, using just a few reactions: a puromycin linker is attached to an mRNA, followed by in vitro translation, reverse transcription, and RNase cleavage. Furthermore, I propose a single-molecule protein nanopore engineering strategy that significantly extends the lumen length of the MspA nanopore to push the limit of protein reading length. With the proven abilities of a Hel308 DNA helicase, the engineered nanopore can first read the entire DNA codon sequence and then read >100 amino acids of the protein. After recording large data sets, I will build a protein signal library with the related codon information to train a machine learning model that serves as a tool for de novo protein sequencing. This work has great potential to push the limits of sequencing technology toward reading the whole proteome.
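To illustrate how the codon information could label a protein signal in such a library, here is a minimal Python sketch. It assumes the DNA part of a read has already been base-called and that the standard genetic code applies. The names `translate` and `make_entry`, the dictionary layout, and the example signal values are hypothetical and are not taken from the project.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered over T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate a coding DNA sequence into one-letter amino acids.

    Translation stops at the first stop codon; trailing bases that do not
    fill a complete codon are ignored.
    """
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":  # stop codon
            break
        protein.append(aa)
    return "".join(protein)


def make_entry(dna_read: str, protein_signal: list[float]) -> dict:
    """Pair a raw protein signal with the label derived from its own DNA tag.

    Hypothetical record layout for one library entry (an assumption).
    """
    return {
        "codons": dna_read.upper(),
        "protein_label": translate(dna_read),
        "signal": list(protein_signal),
    }


# Example with made-up signal values (not real measurements).
entry = make_entry("ATGGCTAAATAA", [0.41, 0.37, 0.52])
print(entry["protein_label"])
```

In this sketch the DNA read supplies the training label, so a model would learn to map the signal to the amino-acid sequence without separate annotation.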
More information & hyperlinks
Web resources: https://cordis.europa.eu/project/id/101151821
Start date: 01-12-2024
End date: 30-11-2026
Total budget - Public funding: 203 464,00 Euro
Cordis data

Status

SIGNED

Call topic

HORIZON-MSCA-2023-PF-01-01

Update Date

21-11-2024
Structured mapping
Horizon Europe
HORIZON.1 Excellent Science
HORIZON.1.2 Marie Skłodowska-Curie Actions (MSCA)
HORIZON.1.2.0 Cross-cutting call topics
HORIZON-MSCA-2023-PF-01
HORIZON-MSCA-2023-PF-01-01 MSCA Postdoctoral Fellowships 2023