Summary
At first blush, entities and concepts such as “Dutch East India Company” or “coffee” may seem straightforward, but in fact they are complex and multifaceted. The wealth of digital sources offers massive potential to study these notions at an unprecedented scale. However, current technologies for distant reading cannot deal with this complexity.
TRIFECTA aims to create a database that describes complex entities and concepts and their contexts by combining language and semantic web technology to extract and relate information from different texts over time. In addition, a key aim of TRIFECTA is to advance the state of the art in these technologies to deal with change over time and connections to many different narratives. Sophisticated knowledge representation methods from the semantic web can mitigate a common failing of language technology methods: many do not incorporate enough background knowledge to recognise and interpret complex entities and concepts in their historical contexts. By treating these entities and concepts as rich networks (or graphs) of knowledge that can express change and relationships to different concepts in space and time, semantic databases can handle the complexity needed to make the outputs of language technology tools suited to humanities research.
Via two use cases, I identify a set of core contentious entities and concepts in maritime and food history. Next, through a data-driven, iterative approach, I advance beyond the state of the art in natural language technology for the humanities by targeting three key aspects of the recognition and modelling of complex concepts: identity, change, and the long tail. I propose a novel peer-evaluation approach in which a team of humanities scholars, computational linguists, and semantic web researchers collaborate closely to create truly hybrid artificial intelligence systems that will enable humanities research to scale to big data without losing sight of contextual complexity.
More information & hyperlinks
Web resources: | https://cordis.europa.eu/project/id/101088548 |
Start date: | 01-11-2023 |
End date: | 31-10-2028 |
Total budget - Public funding: | 1 998 351,00 Euro - 1 998 351,00 Euro |
Cordis data
Status: | SIGNED |
Call topic: | ERC-2022-COG |
Update date: | 31-07-2023 |