Gaussian Embedding of Linked Documents from a Pretrained Semantic Space

Bibliographic details
Title: Gaussian Embedding of Linked Documents from a Pretrained Semantic Space
Authors: Gourru, Antoine, Velcin, Julien, Jacques, Julien
Contributors: Entrepôts, Représentation et Ingénierie des Connaissances (ERIC), Université Lumière - Lyon 2 (UL2)-Université Claude Bernard Lyon 1 (UCBL), Université de Lyon-Université de Lyon
Source: Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence (IJCAI-20) ; Twenty-Ninth International Joint Conference on Artificial Intelligence (IJCAI-20), Jan 2021, Yokohama (virtual), Japan ; https://hal.science/hal-03343904
Publication info: HAL CCSD
Publication year: 2021
Collection: HAL Lyon 1 (University Claude Bernard Lyon 1)
Subject terms: [INFO.INFO-TT]Computer Science [cs]/Document and Text Processing
Subject geography: Yokohama (virtual), Japan
Description: International audience ; Gaussian Embedding of Linked Documents (GELD) is a new method that embeds linked documents (e.g., citation networks) onto a pretrained semantic space (e.g., a set of word embeddings). We formulate the problem in such a way that we model each document as a Gaussian distribution in the word vector space. We design a generative model that combines both words and links in a consistent way. Leveraging the variance of a document allows us to model the uncertainty related to word and link generation. In most cases, our method outperforms state-of-the-art methods when using our document vectors as features for usual downstream tasks. In particular, GELD achieves better accuracy in classification and link prediction on Cora and Dblp. In addition, we demonstrate qualitatively the convenience of several properties of our method. We provide the implementation of GELD and the evaluation datasets to the community (https://github.com/AntoineGourru/DNEmbedding).
Document type: conference object
Language: English
Relation: hal-03343904; https://hal.science/hal-03343904; https://hal.science/hal-03343904/document; https://hal.science/hal-03343904/file/IJCAI2020_GELD__Copy_.pdf
Availability: https://hal.science/hal-03343904
https://hal.science/hal-03343904/document
https://hal.science/hal-03343904/file/IJCAI2020_GELD__Copy_.pdf
Rights: info:eu-repo/semantics/OpenAccess
Accession number: edsbas.BC0A3FA9
Database: BASE