AI4Media-1218698 ViLMA: A Zero-Shot Benchmark for Linguistic and Temporal Grounding in Video-Language Models

Summary

This is a publication. If this page has no link to the publication, you can try the preformatted search via the search engines listed on this page.

Authors: Ilker Kesen, Andrea Pedrotti, Mustafa Dogan, Michele Cafagna, Emre Can Acikgoz, Letitia Parcalabescu, Iacer Calixto, Anette Frank, Albert Gatt, Aykut Erdem, Erkut Erdem

Journal title: Proceedings of the 12th International Conference on Learning Representations (ICLR 2024)

Journal publisher: OpenReview.net

Published year: 2024

Associated projects

AI4Media - A European Excellence Centre for Media, Society and Democracy

Organisations

Not specified