NESTOR | Next gEneration Sequence sTORage

Summary
Sequential data are everywhere, from DNA sequences to astronomical light curves, and from aircraft engine monitoring data to the prices of stock options. Recent advances in data storage, networking, and sensing technologies have allowed organizations to gather overwhelming amounts of sequential data at unprecedented speeds.
This wealth of information enables analysts to identify patterns, find abnormalities, and extract knowledge. Common practice in many domains is to use custom data analysis solutions, usually built in higher-level programming languages such as R or Python. Such techniques, while acceptable for small-scale data processing, are unfit for larger-scale data management and exploration. They run counter to established database research: they take no advantage of indexes, physical data independence, query optimization, or data processing methods designed for scalability. In these domains, database systems are used merely for storing and retrieving data, not as the sophisticated query processing systems they are.
Current relational storage layers cannot serve the access patterns that analysts of sequential data are interested in without scanning large amounts of unnecessary data or incurring a large processing overhead, which makes complex analytics inefficient.
To exploit this opportunity, we plan to develop specialized data series storage and retrieval systems that will allow analysts across different fields to efficiently manipulate the sequences of interest.
The proposed research project, named NESTOR (Next gEneration Sequence sTORage), has the potential for great economic and social impact in Europe, as multiple scientific and industrial fields currently need the right tools to handle their massive collections of data series.
More information & hyperlinks
Web resources: https://cordis.europa.eu/project/id/748945
Start date: 01-09-2017
End date: 31-08-2020
Total budget: 246 668,40 Euro
Public funding: 246 668,00 Euro
Cordis data

Status

CLOSED

Call topic

MSCA-IF-2016

Update Date

28-04-2024
Structured mapping
Horizon 2020
H2020-EU.1. EXCELLENT SCIENCE
H2020-EU.1.3. EXCELLENT SCIENCE - Marie Skłodowska-Curie Actions (MSCA)
H2020-EU.1.3.2. Nurturing excellence by means of cross-border and cross-sector mobility
H2020-MSCA-IF-2016
MSCA-IF-2016