Text Style Transfer Back-Translation

Bibliographic Details
Title: Text Style Transfer Back-Translation
Authors: Wei, Daimeng; Wu, Zhanglin; Shang, Hengchao; Li, Zongyao; Wang, Minghan; Guo, Jiaxin; Chen, Xiaoyu; Yu, Zhengzhe; Yang, Hao
Publication Year: 2023
Collection: Computer Science
Subject Terms: Computer Science - Computation and Language, Computer Science - Machine Learning
Description: Back Translation (BT) is widely used in the field of machine translation, as it has been proved effective for enhancing translation quality. However, BT mainly improves the translation of inputs that share a similar style (to be more specific, translation-like inputs), since the source side of BT data is machine-translated. For natural inputs, BT brings only slight improvements and sometimes even adverse effects. To address this issue, we propose Text Style Transfer Back Translation (TST BT), which uses a style transfer model to modify the source side of BT data. By making the style of source-side text more natural, we aim to improve the translation of natural inputs. Our experiments on various language pairs, including both high-resource and low-resource ones, demonstrate that TST BT significantly improves translation performance against popular BT benchmarks. In addition, TST BT is proved to be effective in domain adaptation, so this strategy can be regarded as a general data augmentation method. Our training code and text style transfer model are open-sourced.
Comment: ACL 2023, 14 pages, 4 figures, 19 tables
Document Type: Working Paper
Access URL: http://arxiv.org/abs/2306.01318
Accession Number: edsarx.2306.01318
Database: arXiv
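
The data flow summarized in the description (back-translate target-side monolingual text, then pass the machine-translated source side through a style transfer model) can be sketched roughly as follows. This is only an illustrative outline, not the authors' released code: the function names `build_tst_bt_data` and `augment_training_data`, and the toy stand-ins `toy_back_translate` and `toy_style_transfer`, are hypothetical. In practice the two callables would wrap a trained target-to-source translation model and a trained style transfer model.

```python
from typing import Callable, Iterable, List, Tuple

# A text-to-text function, e.g. a wrapped translation or style transfer model.
TextModel = Callable[[str], str]
# One training example: (source sentence, target sentence).
SentencePair = Tuple[str, str]


def build_tst_bt_data(
    target_monolingual: Iterable[str],
    back_translate: TextModel,
    style_transfer: TextModel,
) -> List[SentencePair]:
    """Create synthetic pairs whose source side has been style-transferred.

    Hypothetical helper: for each target-language sentence, produce a
    machine-translated source sentence (plain BT), then rewrite that source
    sentence with the style transfer model so it reads more naturally.
    """
    pairs: List[SentencePair] = []
    for target in target_monolingual:
        synthetic_source = back_translate(target)          # translation-like style
        natural_source = style_transfer(synthetic_source)  # more natural style
        pairs.append((natural_source, target))
    return pairs


def augment_training_data(
    authentic_bitext: Iterable[SentencePair],
    synthetic_pairs: Iterable[SentencePair],
) -> List[SentencePair]:
    """Concatenate authentic bitext with the synthetic pairs (assumed mixing scheme)."""
    return list(authentic_bitext) + list(synthetic_pairs)


# Toy stand-ins so the sketch runs without any trained models.
def toy_back_translate(target: str) -> str:
    return "mt<" + target + ">"


def toy_style_transfer(source: str) -> str:
    return source.replace("mt<", "natural<")
```

With the toy stand-ins, `build_tst_bt_data(["Guten Tag"], toy_back_translate, toy_style_transfer)` yields a single pair whose target side is the untouched monolingual sentence and whose source side has passed through both steps. Simple concatenation with the authentic bitext is an assumption made here for brevity; the record does not state how the synthetic and authentic data are mixed.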