Summary
The next wave of smart applications in our society will need embedded devices (robots, wearables, etc.) with increased intelligence at much reduced energy and latency cost. Compared to current embedded platforms, up to 1000x efficiency gains could be achieved through tight processor-algorithm co-optimization. However, because processor chips have a slow development cycle (many months to years) compared to algorithms (hours to weeks), this co-optimization today amounts to little more than selecting algorithms that run well on mature, available hardware. Because these processors and their tooling have been optimized for mature algorithms, the algorithm that “wins” is not the inherently best one, but the one that happens to best fit the available “old-school” hardware platforms. This “hardware lottery” holds back innovation, severely impacts embedded AI execution efficiency, and narrows the market to a few large companies.
The BINGO vision to break this innovation deadlock is to enable heterogeneous compute platform customization for a given AI workload in a matter of days (100x faster), through rapid selection and assembly of prefabricated co-processor chiplets. This requires breakthroughs in:
a.) A library of embedded-AI-optimized co-processor chiplets, surpassing the state of the art (SotA) in dataflow heterogeneity for improved efficiency (100x over CPU) and in interoperability within heterogeneous chiplet meshes on a reusable “breadboard” interposer.
b.) Rapid cost models and workload schedulers for beyond-SotA heterogeneous platform customization: automatically deriving the optimal chiplet combination for an application, assembling it, and deploying it, all within a few days.
Optimizing across the disciplines of chip design, computer architecture, scheduling, and AI matches the expertise I gained at KU Leuven, imec and Intel. The project will stimulate a surge of embedded AI innovations, enable efficient execution of new algorithms, and bring the EU back to the forefront of chip design and embedded AI research.
More information & hyperlinks
Web resources: | https://cordis.europa.eu/project/id/101088865 |
Start date: | 01-06-2023 |
End date: | 31-05-2028 |
Total budget - Public funding: | 1 995 750,00 Euro - 1 995 750,00 Euro |
Cordis data
Status: | SIGNED |
Call topic: | ERC-2022-COG |
Update date: | 31-07-2023 |
Geographical location(s)