SSBD | Small Summaries for Big Data

Summary
A fundamental challenge in processing the massive quantities of information generated by modern applications is extracting suitable representations of the data that can be stored, manipulated and interrogated on a single machine. A promising approach is the design and analysis of compact summaries: data structures that capture key features of the data and that can be created effectively over distributed data sets. Popular summary structures include the Bloom filter, which compactly represents a set of items, and sketches, which allow vector norms and products to be estimated. These are very attractive, since they can be computed in parallel and combined to yield a single, compact summary of the data. Yet the potential of summaries is far from fully realized.
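
As a rough illustration of the "computed in parallel and combined" property mentioned above, the following is a minimal sketch of a Bloom filter in Python. It is not taken from the project: the class name `BloomFilter`, the parameters `num_bits` and `num_hashes`, and the salted SHA-256 hashing scheme are all assumptions chosen for brevity.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter sketch: a bit array plus k salted hashes (illustrative only)."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as the bit array

    def _positions(self, item):
        # Derive k bit positions by salting SHA-256 with the hash index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def __contains__(self, item):
        # May report a false positive, but never a false negative.
        return all((self.bits >> pos) & 1 for pos in self._positions(item))

    def merge(self, other):
        # Filters built with identical parameters combine by bitwise OR.
        if (self.num_bits, self.num_hashes) != (other.num_bits, other.num_hashes):
            raise ValueError("filters must share parameters to be merged")
        merged = BloomFilter(self.num_bits, self.num_hashes)
        merged.bits = self.bits | other.bits
        return merged


# Two partitions of a data set are summarized independently, then combined.
left, right = BloomFilter(), BloomFilter()
for word in ["alpha", "beta"]:
    left.add(word)
for word in ["gamma", "delta"]:
    right.add(word)
combined = left.merge(right)
```

Under these assumptions, the merged filter has exactly the same bits as a filter built over the whole data set in one pass, which is what makes such a summary suitable for distributed data.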

The Principal Investigator will lead a team working on important problems around creating Small Summaries for Big Data. The goal is to substantially advance the state of the art in data summarization, to the point where accurate and effective summaries are available for a wide array of problems and can be used seamlessly in applications that process big data. Several directions will be pursued, including: designing and evaluating new summaries for fundamental computations such as tracking the data distribution; summary techniques for complex structures, such as massive matrices, massive graphs, and beyond; and summaries that allow the verification of outsourced computation over big data. Success in any one of these areas could lead to substantial impact on practice, as evidenced by the influence of existing summary techniques.

Support in the form of a five-year research grant will allow the PI to consolidate his research in this area, and build an expert team to focus on these challenging algorithmic questions.
More information & hyperlinks
Web resources: https://cordis.europa.eu/project/id/647557
Start date: 01-05-2015
End date: 30-04-2021
Total budget: 1 565 502,00 Euro
Public funding: 1 565 502,00 Euro
Cordis data

Status: CLOSED
Call topic: ERC-CoG-2014
Update date: 27-04-2024
Structured mapping
Horizon 2020
H2020-EU.1. EXCELLENT SCIENCE
H2020-EU.1.1. EXCELLENT SCIENCE - European Research Council (ERC)
ERC-2014
ERC-2014-CoG
ERC-CoG-2014 ERC Consolidator Grant