Report
MI-VisionShot: Few-shot adaptation of vision-language models for slide-level classification of histopathological images
Title: | MI-VisionShot: Few-shot adaptation of vision-language models for slide-level classification of histopathological images |
---|---|
Authors: | Meseguer, Pablo, del Amor, Rocío, Naranjo, Valery |
Publication Year: | 2024 |
Collection: | Computer Science |
Subject Terms: | Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence |
Description: | Vision-language supervision has made remarkable strides in learning visual representations from textual guidance. In digital pathology, vision-language models (VLMs), pre-trained on curated datasets of histological image-captions, have been adapted to downstream tasks, such as region of interest classification. Zero-shot transfer for slide-level prediction has been formulated by MI-Zero, but it exhibits high variability depending on the textual prompts. Inspired by prototypical learning, we propose MI-VisionShot, a training-free adaptation method on top of VLMs to predict slide-level labels in few-shot learning scenarios. Our framework takes advantage of the excellent representation learning of VLMs to create prototype-based classifiers under a multiple-instance setting by retrieving the most discriminative patches within each slide. Experimentation through different settings shows the ability of MI-VisionShot to surpass zero-shot transfer with lower variability, even in low-shot scenarios. Code coming soon at https://github.com/cvblab/MIVisionShot. Comment: Manuscript accepted for oral presentation at KES-InnovationInMedicine 2024, held in Madeira, Portugal |
Document Type: | Working Paper |
Access URL: | http://arxiv.org/abs/2410.15881 |
Accession Number: | edsarx.2410.15881 |
Database: | arXiv |
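The description above mentions prototype-based classifiers built in a multiple-instance setting from the most discriminative patches of each slide. The following is a minimal sketch of that general idea, not the authors' implementation. It assumes that patch embeddings from a VLM image encoder are already available as NumPy arrays, that "most discriminative" can be approximated by the top-K patches ranked by cosine similarity to a per-class query vector (for example a text embedding), and that selected patches are pooled by averaging. All function names, the parameter `k`, and the synthetic data helper are hypothetical.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length along `axis`."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)


def top_k_mean(patches, query, k):
    """Pool a slide: average the k patches most cosine-similar to `query`.

    Assumption: top-K similarity stands in for "most discriminative patches".
    """
    patches = l2_normalize(patches)
    sims = patches @ l2_normalize(query)
    k = min(k, len(patches))
    idx = np.argsort(sims)[-k:]  # indices of the k highest similarities
    return l2_normalize(patches[idx].mean(axis=0))


def build_prototypes(support_slides, support_labels, class_queries, k=5):
    """Build one prototype per class from a few labelled slides (no training)."""
    prototypes = []
    for c in range(class_queries.shape[0]):
        pooled = [
            top_k_mean(slide, class_queries[c], k)
            for slide, label in zip(support_slides, support_labels)
            if label == c
        ]
        if not pooled:
            raise ValueError(f"no support slide for class {c}")
        prototypes.append(l2_normalize(np.mean(pooled, axis=0)))
    return np.stack(prototypes)


def predict_slide(patches, prototypes, k=5):
    """Score a slide against each prototype and return (label, scores)."""
    scores = np.array([top_k_mean(patches, p, k) @ p for p in prototypes])
    return int(np.argmax(scores)), scores


def make_toy_slide(direction, rng, n_background=50, n_signal=10):
    """Synthetic stand-in for a slide: random patches plus a few class-like ones."""
    dim = direction.shape[0]
    background = rng.normal(size=(n_background, dim))
    signal = direction + 0.05 * rng.normal(size=(n_signal, dim))
    return np.vstack([background, signal])
```

With real data, `class_queries` would hold one embedding per class and each slide would be an array of shape `(num_patches, embedding_dim)`. Because the prototypes are computed directly from the support slides, adding or removing shots only requires calling `build_prototypes` again.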