-
1. Dissertation/Thesis
Authors: Perez Tito, Ruben
Thesis Advisors: Valveny Llobet, Ernest
Source: TDX (Tesis Doctorals en Xarxa)
Subject Terms: Visió i Llenguatge, Vision and Language, Visión y Lenguaje, Visió per Computador, Computer Vision, Visión por Computador, Resposta de preguntes imatges, Visual question answering, Respuesta de preguntas imagene, Tecnologies
File Description: application/pdf
Access URL: http://hdl.handle.net/10803/691193
-
2. Dissertation/Thesis
Authors: Biten, Ali Furkan
Thesis Advisors: Gomez Bigorda, Luis; Karatzas, Dimosthenis
Source: TDX (Tesis Doctorals en Xarxa)
Subject Terms: Visió i llenguatge, Visión y lenguaje, Vision and language, Subtítols d’imatges, Subtítulos de imagen, Image captioning, Text de l’escena pregunta visual resposta, Escena texto visual pregunta respuesta, Scene text visual question answering, Ciències Experimentals
File Description: application/pdf
Access URL: http://hdl.handle.net/10803/688319
-
3. Academic Journal
Authors: Yanjun Sun, Yue Qiu, Yoshimitsu Aoki
Source: Sensors, Vol 25, Iss 2, p 364 (2025)
Subject Terms: vision-and-language navigation, dynamic change, decision-making, Chemical technology, TP1-1185
File Description: electronic resource
-
4. Conference
Authors: Liu, Zhi-Song; Wang, Li-Wen; Xiao, Jun; Kalogeiton, Vicky
Contributors: Lappeenranta–Lahti University of Technology Finlande (LUT), The Hong Kong Polytechnic University Hong Kong (POLYU), Laboratoire d'informatique de l'École polytechnique Palaiseau (LIX), École polytechnique (X), Institut Polytechnique de Paris (IP Paris)-Institut Polytechnique de Paris (IP Paris)-Centre National de la Recherche Scientifique (CNRS), European Computer Vision Association, ANR-22-CE23-0007,WhyBehindScenes,Le Pourquoi dans les films(2022)
Source: European Conference on Computer Vision Workshop (ECCV-W) 2024 ; https://hal.science/hal-04822965 ; European Conference on Computer Vision Workshop (ECCV-W) 2024, European Computer Vision Association, Sep 2024, Milan (Italie), Italy
Subject Terms: vision and language, text guidance, domain transfer, contrastive learning, Style transfer, multimodal learning, [INFO.INFO-CV]Computer Science [cs]/Computer Vision and Pattern Recognition [cs.CV]
Subject Geographic: Milan (Italie), Italy
Relation: info:eu-repo/semantics/altIdentifier/arxiv/2410.09566; ARXIV: 2410.09566
-
5. Academic Journal
Authors: Chihaya Matsuhira, Marc A. Kastner, Takahiro Komamizu, Takatsugu Hirayama, Keisuke Doman, Yasutomo Kawanishi, Ichiro Ide
Source: IEEE Access, Vol 12, Pp 41299-41316 (2024)
Subject Terms: Nonwords, phonetics, pronunciation, psycholinguistics, text-to-image generation, vision and language, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
File Description: electronic resource
-
6. Academic Journal
Authors: Jiahao Tang, Jianguo Hu, Wenjun Huang, Shengzhi Shen, Jiakai Pan, Deming Wang, Yanyu Ding
Source: IEEE Access, Vol 12, Pp 131664-131680 (2024)
Subject Terms: Video question answering (VideoQA), video reasoning and description, spatial-temporal graph, dynamic graph Transformer, graph attention, computer vision natural language processing, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
File Description: electronic resource
-
7. Academic Journal
Authors: Soohyeong Kim, Sangjun Lee, Yong Suk Choi
Source: IEEE Access, Vol 12, Pp 165822-165830 (2024)
Subject Terms: Compositional zero-shot learning, open-world recognition, representation learning, vision and language, zero-shot learning, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
File Description: electronic resource
-
8. Academic Journal
Authors: Tianwei Chen, Noa Garcia, Mayu Otani, Chenhui Chu, Yuta Nakashima, Hajime Nagahara
Source: Journal of Imaging, Vol 10, Iss 12, p 300 (2024)
Subject Terms: vision and language, knowledge transferability analysis, multi-modal learning, Photography, TR1-1050, Computer applications to medicine. Medical informatics, R858-859.7, Electronic computers. Computer science, QA75.5-76.95
File Description: electronic resource
-
9. Conference
Authors: Dayou Chen, Long Chen, Yiheng Zeng, Craig Hancock, Russell Lock, Simon Solvsten
Subject Terms: Fire Safety Compliance, Automated Compliance Checking (ACC), Vision Large Language Models (vLLM), Visual Question Answering (VQA), Computer Vision, Operational Phase Monitoring
Relation: 2134/28023164.v1
-
10. Conference
Authors: Poppi, Samuele; Poppi, Tobia; Cocchi, Federico; Cornia, Marcella; Baraldi, Lorenzo; Cucchiara, Rita
Source: ECCV, European Conference on Computer Vision, Milan, 29 September - 4 October
Subject Terms: Trustworthy AI, Vision-and-Language, NSFW Concepts
Relation: https://zenodo.org/communities/elias_project; https://zenodo.org/communities/eu; https://doi.org/10.5281/zenodo.14015235; https://doi.org/10.5281/zenodo.14015236; oai:zenodo.org:14015236
-
11.
Authors: Hagström, Lovisa, 1995; Johansson, Richard, 1975
Source: 29th International Conference on Computational Linguistics, COLING 2022, Gyeongju, South Korea; International Conference on Computational Linguistics. Proceedings. 29(1):5582-5596
Subject Terms: multimodal models, NLP, vision-and-language-models, language understanding, text-only tasks
File Description: electronic
-
12. Academic Journal
Authors: Yunzhe Xiao, Yong Dou, Shaowu Yang
Source: Remote Sensing, Vol 16, Iss 13, p 2453 (2024)
Subject Terms: point cloud classification, zero-training, large-scale vision-and-language model, zero-shot classification, few-shot classification, Science
File Description: electronic resource
-
13. Academic Journal
Authors: Yusuke Hirota, Noa Garcia, Mayu Otani, Chenhui Chu, Yuta Nakashima
Source: Electronics ; Volume 13 ; Issue 21 ; Pages: 4290
Subject Terms: visual question answering, textual representations, data augmentation, interpretability, vision-and-language
File Description: application/pdf
Relation: Computer Science & Engineering; https://dx.doi.org/10.3390/electronics13214290
-
14. Academic Journal
Source: Electronics ; Volume 13 ; Issue 22 ; Pages: 4564
Subject Terms: child-friendly, urban analysis, urban vitality, multi-source data, vision large language models
File Description: application/pdf
Relation: Artificial Intelligence; https://dx.doi.org/10.3390/electronics13224564
-
15. Academic Journal
Contributors: Sarto, Sara; Cornia, Marcella; Baraldi, Lorenzo; Nicolosi, Alessandro; Cucchiara, Rita
Subject Terms: Image captioning, image retrieval, vision-and-language
Relation: volume:20; issue:8; firstpage:1; lastpage:22; journal:ACM TRANSACTIONS ON MULTIMEDIA COMPUTING, COMMUNICATIONS AND APPLICATIONS; https://hdl.handle.net/11380/1337206
-
16.
Authors: Zhu, Wanrong
Subject Terms: Computer science, Artificial intelligence, Computer Vision, Large Multimodal Model, Machine Learning, Multimodal Study, Natural Language Processing, Vision and Language
File Description: application/pdf
-
17.
Authors: Hagström, Lovisa, 1995
Subject Terms: NLP, Knowledge representation, Neural network, Vision-and-language models, Grounding, BERT
File Description: electronic
-
18. Academic Journal
Authors: Akiyoshi Tomihari, Hitomi Yanaka
Source: IEEE Access, Vol 11, Pp 45645-45656 (2023)
Subject Terms: Natural language processing, recognizing textual entailment, vision and language, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
File Description: electronic resource
-
19. Academic Journal
Authors: Lanxiao Wang, Wenzhe Hu, Heqian Qiu, Chao Shang, Taijin Zhao, Benliu Qiu, King Ngi Ngan, Hongliang Li
Source: CAAI Artificial Intelligence Research, Vol 1, Iss 2, Pp 111-136 (2022)
Subject Terms: deep learning, vision and language, multi-modal generation, multi-modal analysis, multi-modal reasoning, pre-training, Electronic computers. Computer science, QA75.5-76.95
File Description: electronic resource
-
20. Academic Journal
Authors: Michele Cafagna, Lina M. Rojas-Barahona, Kees van Deemter, Albert Gatt
Source: Frontiers in Artificial Intelligence, Vol 6 (2023)
Subject Terms: vision and language, multimodality, explainability, image captioning, visual question answering, natural language generation, Electronic computers. Computer science, QA75.5-76.95
File Description: electronic resource