BiTFormer | Biologically Plausible Transformers - Integrating Top-Down and Bottom-Up Signals in the Primary Vision System for Computationally Efficient Deep Learning

Summary
Deep learning (DL) has recently achieved remarkable success, driven by continuous growth in model size. However, this growth has increased energy consumption. Hardware implementations of digital DL can help reduce energy usage, but the von Neumann architecture of current DL has hindered their practical realization. In contrast, the brain exhibits energy-efficient multiscale spatiotemporal processing. Biologically plausible (BiP) frameworks have emerged as alternatives to mainstream DL. These methods use bottom-up and top-down signals, incorporating feedforward and feedback mechanisms, and local objectives instead of a global error. Recently, I demonstrated that BiP opto-analog hardware can achieve performance competitive with digital DL for feedforward networks. However, transformers, the backbone of current DL, are challenging to implement because of the input-dependent quadratic complexity of the transformer's attention. This project leverages the multiscale dynamics of the primary vision system to explore BiP architectures for transformers.
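To make the complexity argument concrete, the following is a minimal sketch, not the project's method. It contrasts plain softmax attention, which scores every pair of tokens, with a simple leaky recurrence that updates a fixed-size state once per token. The function names `softmax_attention` and `linear_recurrence` and the `decay` parameter are hypothetical and chosen only for illustration.

```python
import math

def softmax_attention(q, k, v):
    """Plain softmax attention. Scores every (i, j) token pair: quadratic in n."""
    n, d = len(q), len(q[0])
    out, pair_count = [], 0
    for i in range(n):
        scores = []
        for j in range(n):
            scores.append(sum(q[i][c] * k[j][c] for c in range(d)) / math.sqrt(d))
            pair_count += 1
        m = max(scores)  # subtract the max for numerical stability
        w = [math.exp(s - m) for s in scores]
        z = sum(w)
        out.append([sum(w[j] * v[j][c] for j in range(n)) / z
                    for c in range(len(v[0]))])
    return out, pair_count

def linear_recurrence(x, decay=0.9):
    """Leaky state h_t = decay * h_{t-1} + x_t. One update per token: linear in n."""
    h = [0.0] * len(x[0])
    out, steps = [], 0
    for x_t in x:
        h = [decay * h_c + x_c for h_c, x_c in zip(h, x_t)]
        out.append(list(h))
        steps += 1
    return out, steps

tokens = [[float(i), 1.0] for i in range(8)]
_, pairs = softmax_attention(tokens, tokens, tokens)
_, steps = linear_recurrence(tokens)
print(pairs, steps)  # 64 8
```

For 8 tokens the attention sketch evaluates 64 pairwise scores, while the recurrence performs 8 state updates. Doubling the sequence length quadruples the former and only doubles the latter.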

The project is hosted at the University of Tübingen under Matthias Bethge and Thomas Euler, who have a long-standing research effort in system identification of the mouse retina via DL. The project has three objectives. First, I will extract top-down information from neural recordings of ganglion cells in the mouse retina, focusing on unique spatiotemporal features that maximally activate specific cell types. Next, I will combine top-down signals with bottom-up models of the retina using recurrent architectures with linear complexity, and compare their performance in classification tasks to that of a vision transformer for the retina. Lastly, I propose a BiP transformer with local weight updates. I will examine the robustness of the models under data distribution shifts and noise injection. A positive outcome of the project will address the energy and cost issues of AI and help me advance my academic career in this interdisciplinary field.
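The summary does not specify which local learning rule the BiP transformer would use. As an illustration of what "local weight updates" can mean, the sketch below uses Oja's rule, a classic Hebbian-style rule swapped in here as an assumption: each weight changes using only the unit's own input and output, with no globally backpropagated error. The names `oja_update` and `train` and the toy data are hypothetical.

```python
import math

def oja_update(w, x, lr=0.05):
    """One local update: depends only on this unit's input x and its output y."""
    y = sum(wi * xi for wi, xi in zip(w, x))
    return [wi + lr * y * (xi - y * wi) for wi, xi in zip(w, x)]

def train(w, samples, epochs=500, lr=0.05):
    """Repeatedly apply the local rule over a small dataset."""
    for _ in range(epochs):
        for x in samples:
            w = oja_update(w, x, lr)
    return w

# Toy inputs that all lie along the direction (1, 1).
samples = [[1.0, 1.0], [-1.0, -1.0], [0.5, 0.5], [-0.5, -0.5]]
w = train([0.3, 0.1], samples)
norm = math.sqrt(sum(wi * wi for wi in w))
print(round(norm, 3), [round(wi, 3) for wi in w])
```

On this toy data the weight vector settles near unit length and aligns with the dominant input direction, which is the known behavior of Oja's rule. Whether a rule of this kind scales to attention layers is the open question the project sets out to explore.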
More information & hyperlinks
Web resources: https://cordis.europa.eu/project/id/101151549
Start date: 01-10-2024
End date: 30-09-2026
Total budget - Public funding: - 173 847,00 Euro
Cordis data

Status

SIGNED

Call topic

HORIZON-MSCA-2023-PF-01-01

Update Date

22-11-2024
Structured mapping
Horizon Europe
HORIZON.1 Excellent Science
HORIZON.1.2 Marie Skłodowska-Curie Actions (MSCA)
HORIZON.1.2.0 Cross-cutting call topics
HORIZON-MSCA-2023-PF-01
HORIZON-MSCA-2023-PF-01-01 MSCA Postdoctoral Fellowships 2023