DAPP | Data-centric Parallel Programming

Summary
We address a fundamental and increasingly important challenge in computer science: how to program large-scale heterogeneous parallel computers. Society relies on these computers to satisfy the growing demands of important applications such as drug design, weather prediction, and big data analytics. Architectural trends make heterogeneous parallel processors the fundamental building blocks of computing platforms ranging from quad-core laptops to million-core supercomputers; failing to exploit these architectures efficiently will severely limit the technological advance of our society. Computationally demanding problems are often inherently parallel and can readily be compiled for various target architectures. Yet, efficiently mapping data to the target memory system is notoriously hard, and the cost of fetching two operands from remote memory is already orders of magnitude more expensive than any arithmetic operation. Data access cost is growing with the amount of parallelism which makes data layout optimizations crucial. Prevalent parallel programming abstractions largely ignore data access and guide programmers to design threads of execution that are scheduled to the machine. We depart from this control-centric model to a data-centric program formulation where we express programs as collections of values, called memlets, that are mapped as first-class objects by the compiler and runtime system. Our holistic compiler and runtime system aims to substantially advance the state of the art in parallel computing by combining static and dynamic scheduling of memlets to complex heterogeneous target architectures. We will demonstrate our methods on three challenging real-world applications in scientific computing, data analytics, and graph processing. We strongly believe that, without holistic data-centric programming, the growing complexity and inefficiency of parallel programming will create a scaling wall that will limit our future computational capabilities.
Unfold all
/
Fold all
More information & hyperlinks
Web resources: https://cordis.europa.eu/project/id/678880
Start date: 01-06-2016
End date: 30-09-2021
Total budget - Public funding: 1 499 672,00 Euro - 1 499 672,00 Euro
Cordis data

Original description

We address a fundamental and increasingly important challenge in computer science: how to program large-scale heterogeneous parallel computers. Society relies on these computers to satisfy the growing demands of important applications such as drug design, weather prediction, and big data analytics. Architectural trends make heterogeneous parallel processors the fundamental building blocks of computing platforms ranging from quad-core laptops to million-core supercomputers; failing to exploit these architectures efficiently will severely limit the technological advance of our society. Computationally demanding problems are often inherently parallel and can readily be compiled for various target architectures. Yet, efficiently mapping data to the target memory system is notoriously hard, and the cost of fetching two operands from remote memory is already orders of magnitude more expensive than any arithmetic operation. Data access cost is growing with the amount of parallelism which makes data layout optimizations crucial. Prevalent parallel programming abstractions largely ignore data access and guide programmers to design threads of execution that are scheduled to the machine. We depart from this control-centric model to a data-centric program formulation where we express programs as collections of values, called memlets, that are mapped as first-class objects by the compiler and runtime system. Our holistic compiler and runtime system aims to substantially advance the state of the art in parallel computing by combining static and dynamic scheduling of memlets to complex heterogeneous target architectures. We will demonstrate our methods on three challenging real-world applications in scientific computing, data analytics, and graph processing. We strongly believe that, without holistic data-centric programming, the growing complexity and inefficiency of parallel programming will create a scaling wall that will limit our future computational capabilities.

Status

CLOSED

Call topic

ERC-StG-2015

Update Date

27-04-2024
Images
No images available.
Geographical location(s)
Structured mapping
Unfold all
/
Fold all
Horizon 2020
H2020-EU.1. EXCELLENT SCIENCE
H2020-EU.1.1. EXCELLENT SCIENCE - European Research Council (ERC)
ERC-2015
ERC-2015-STG
ERC-StG-2015 ERC Starting Grant