It's Just Another Day: Unique Video Captioning by Discriminative Prompting

Bibliographic Details
Title: It's Just Another Day: Unique Video Captioning by Discriminative Prompting
Authors: Perrett, Toby; Han, Tengda; Damen, Dima; Zisserman, Andrew
Publication Year: 2024
Collection: Computer Science
Subject Terms: Computer Science - Computer Vision and Pattern Recognition
Description: Long videos contain many repeating actions, events and shots. These repetitions are frequently given identical captions, which makes it difficult to retrieve the exact desired clip using a text search. In this paper, we formulate the problem of unique captioning: given multiple clips with the same caption, we generate a new caption for each clip that uniquely identifies it. We propose Captioning by Discriminative Prompting (CDP), which predicts a property that can separate identically captioned clips and uses it to generate unique captions. We introduce two benchmarks for unique captioning, based on egocentric footage and timeloop movies, where repeating actions are common. We demonstrate that captions generated by CDP improve text-to-video R@1 by 15% for egocentric videos and by 10% for timeloop movies.
Comment: ACCV 2024 Oral. Project page: https://tobyperrett.github.io/its-just-another-day/
Document Type: Working Paper
Access URL: http://arxiv.org/abs/2410.11702
Accession Number: edsarx.2410.11702
Database: arXiv
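
Note on the metric: the description reports gains in text-to-video R@1 (recall at rank 1), the fraction of captions whose top-ranked clip is the correct one. The sketch below is only an illustration of that metric, not the authors' evaluation code. The function name `recall_at_1`, the convention that caption i belongs to clip i, and the similarity scores are all assumptions made up for this example.

```python
def recall_at_1(similarity):
    """Text-to-video R@1 for a square score matrix.

    similarity[i][j] is a (hypothetical) score between caption i and clip j.
    Assumption: the correct clip for caption i is clip i.
    Ties are broken in favour of the lowest clip index.
    """
    if not similarity:
        raise ValueError("similarity matrix must not be empty")
    hits = 0
    for i, row in enumerate(similarity):
        # Index of the highest-scoring clip for this caption.
        best = max(range(len(row)), key=lambda j: row[j])
        if best == i:
            hits += 1
    return hits / len(similarity)


# Illustrative scores only. Three clips sharing one caption score identically
# against every clip, so at most one caption can retrieve its own clip.
identical_captions = [[0.9, 0.9, 0.9] for _ in range(3)]

# Unique captions (made-up scores) score highest on their own clip.
unique_captions = [
    [0.9, 0.2, 0.1],
    [0.3, 0.8, 0.2],
    [0.1, 0.4, 0.7],
]

r1_identical = recall_at_1(identical_captions)
r1_unique = recall_at_1(unique_captions)
```

In this toy setting the identical captions retrieve only one of the three clips correctly, while the unique captions retrieve all three, which is the retrieval problem the description says unique captioning is meant to address.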