Audience-Centric Natural Language Generation via Style Infusion

Bibliographic Details
Title: Audience-Centric Natural Language Generation via Style Infusion
Authors: Moorjani, Samraj; Krishnan, Adit; Sundaram, Hari; Maslowska, Ewa; Sankar, Aravind
Publication Year: 2023
Collection: Computer Science
Subject Terms: Computer Science - Computation and Language, Computer Science - Machine Learning
Description: Adopting contextually appropriate, audience-tailored linguistic styles is critical to the success of user-centric language generation systems (e.g., chatbots, computer-aided writing, dialog systems). While existing approaches demonstrate textual style transfer with large volumes of parallel or non-parallel data, we argue that grounding style on audience-independent external factors is innately limiting for two reasons. First, it is difficult to collect large volumes of audience-specific stylistic data. Second, some stylistic objectives (e.g., persuasiveness, memorability, empathy) are hard to define without audience feedback. In this paper, we propose the novel task of style infusion: infusing the stylistic preferences of audiences in pretrained language generation models. Since humans are better at pairwise comparisons than direct scoring (i.e., is Sample-A more persuasive/polite/empathic than Sample-B?), we leverage limited pairwise human judgments to bootstrap a style analysis model and augment our seed set of judgments. We then infuse the learned textual style in a GPT-2-based text generator while balancing fluency and style adoption. With quantitative and qualitative assessments, we show that our infusion approach can generate compelling stylized examples with generic text prompts. The code and data are accessible at https://github.com/CrowdDynamicsLab/StyleInfusion.
Comment: 14 pages, 3 figures, Accepted in Findings of EMNLP 2022
Document Type: Working Paper
Access URL: http://arxiv.org/abs/2301.10283
Accession Number: edsarx.2301.10283
Database: arXiv
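
Illustrative note: The description mentions learning a style model from pairwise human judgments (is Sample-A more persuasive than Sample-B?). The sketch below shows one common way such judgments can be turned into scores, using a Bradley-Terry style logistic preference model. It is an assumption chosen for illustration, not the authors' method; their actual code is at the repository linked in the description. The names `pairwise_loss` and `fit_scores` and the toy judgments are hypothetical.

```python
import math


def pairwise_loss(score_a, score_b):
    """Negative log-likelihood that A is preferred to B.

    Assumes P(A preferred) = sigmoid(score_a - score_b), i.e. a
    Bradley-Terry style model (an illustrative assumption).
    """
    diff = score_a - score_b
    # Numerically stable form of log(1 + exp(-diff)).
    return max(-diff, 0.0) + math.log1p(math.exp(-abs(diff)))


def fit_scores(n_items, judgments, lr=0.1, epochs=200):
    """Fit one scalar style score per item by stochastic gradient ascent.

    judgments: list of (winner_index, loser_index) pairs.
    """
    scores = [0.0] * n_items
    for _ in range(epochs):
        for winner, loser in judgments:
            # Model's current probability that the winner is preferred.
            p = 1.0 / (1.0 + math.exp(-(scores[winner] - scores[loser])))
            # Gradient of the log-likelihood with respect to the winner's score.
            grad = 1.0 - p
            scores[winner] += lr * grad
            scores[loser] -= lr * grad
    return scores


# Toy judgments (hypothetical): item 0 beats 1, item 1 beats 2, item 0 beats 2.
toy_judgments = [(0, 1), (1, 2), (0, 2)]
toy_scores = fit_scores(3, toy_judgments)
```

In a setup like the one described, the per-item scalar would presumably be replaced by the output of a text encoder, so that the learned scorer can rank unseen samples and help augment the seed set of judgments.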