SEGMENT | 3D scene understanding in two glances

Summary
The human mind understands visual scenes. We can usually tell what objects are present in a scene, we can imagine what the hidden parts of objects look like, and we can imagine what it would look like if we or an object moved. The first step of visual scene understanding is segmentation, in which our brain tries to infer which parts of the scene belong to which objects. Adults can do this in photographs – but photographs are not how we learned to see as infants. We learned to see by moving around in a 3D world. The way that scenes project into our eyes, how light is affected by the optics of our eyes, how our photoreceptors sample the light, and how we move our eyes all provide rich information about our environment. However, we do not know how adults combine all this information to perceive segmented scenes, and we do not know how infants learn this combination. Two reasons for this are that standard visual display devices cannot precisely mimic these factors, and that it is unethical to manipulate these factors in human infants. The goals of this project are to understand how adults use the rich information present in active 3D vision to perform segmentation, and to understand how this is learned. We will develop a new display device and experimental methods to study how adults segment scenes when realistic visual information is available, and develop ground-breaking new technologies using advanced computer graphics and machine learning to simulate the inputs to the visual system from early development to adulthood. We will then conduct in silico experiments in artificial neural networks to understand segmentation learning, by systematically restricting or manipulating different factors. We will compare the learned behaviours of different artificial networks to adults performing segmentation during active exploration of 3D scenes, and use similarities and differences to better understand a fundamental puzzle of perception: how the mind makes sense of scenes.
More information & hyperlinks
Web resources: https://cordis.europa.eu/project/id/101086774
Start date: 01-11-2023
End date: 31-10-2028
Total budget: EUR 2 126 444.00 / Public funding: EUR 2 126 444.00
Cordis data

Status

SIGNED

Call topic

ERC-2022-COG

Update Date

31-07-2023
Structured mapping
Horizon Europe
HORIZON.1 Excellent Science
HORIZON.1.1 European Research Council (ERC)
HORIZON.1.1.0 Cross-cutting call topics
ERC-2022-COG ERC CONSOLIDATOR GRANTS
HORIZON.1.1.1 Frontier science
ERC-2022-COG ERC CONSOLIDATOR GRANTS