Academic Journal

Impact of Video Compression and Multimodal Embedding on Scene Description

Bibliographic Details
Title: Impact of Video Compression and Multimodal Embedding on Scene Description
Authors: Jin Young Lee
Source: Electronics; Volume 8; Issue 9; Pages: 963
Publisher Information: Multidisciplinary Digital Publishing Institute
Publication Year: 2019
Collection: MDPI Open Access Publishing
Subject Terms: deep learning, video compression, multimodal embedding, scene description
Description: Scene description refers to the automatic generation of natural language descriptions from videos. In general, deep learning-based scene description networks utilize multimodalities, such as image, motion, audio, and label information, to improve the description quality. In particular, image information plays an important role in scene description. However, scene description has a potential issue, because it may handle images with severe compression artifacts. Hence, this paper analyzes the impact of video compression on scene description, and then proposes a simple network that is robust to compression artifacts. In addition, a network cascading more encoding layers for efficient multimodal embedding is also proposed. Experimental results show that the proposed network is more efficient than conventional networks.
Document Type: text
File Description: application/pdf
Language: English
Relation: Computer Science & Engineering; https://dx.doi.org/10.3390/electronics8090963
DOI: 10.3390/electronics8090963
Availability: https://doi.org/10.3390/electronics8090963
Rights: https://creativecommons.org/licenses/by/4.0/
Accession Number: edsbas.B00CFD4
Database: BASE