RAID: A Shared Benchmark for Robust Evaluation of Machine-Generated Text Detectors

Bibliographic Details
Title:	RAID: A Shared Benchmark for Robust Evaluation of Machine-Generated Text Detectors
Authors:	Dugan, Liam; Hwang, Alyssa; Trhlik, Filip; Ludan, Josh Magnus; Zhu, Andrew; Xu, Hainiu; Ippolito, Daphne; Callison-Burch, Chris
Publication Year:	2024
Collection:	Computer Science
Subject Terms:	Computer Science - Computation and Language, I.2.7
Description:	Many commercial and open-source models claim to detect machine-generated text with extremely high accuracy (99% or more). However, very few of these detectors are evaluated on shared benchmark datasets, and even when they are, the datasets used for evaluation are insufficiently challenging, lacking variations in sampling strategy, adversarial attacks, and open-source generative models. In this work we present RAID: the largest and most challenging benchmark dataset for machine-generated text detection. RAID includes over 6 million generations spanning 11 models, 8 domains, 11 adversarial attacks, and 4 decoding strategies. Using RAID, we evaluate the out-of-domain and adversarial robustness of 8 open- and 4 closed-source detectors and find that current detectors are easily fooled by adversarial attacks, variations in sampling strategies, repetition penalties, and unseen generative models. We release our data along with a leaderboard to encourage future research. Comment: ACL 2024
Document Type:	Working Paper
Access URL:	http://arxiv.org/abs/2405.07940
Accession Number:	edsarx.2405.07940
Database:	arXiv

