Impact of LLM-based Review Comment Generation in Practice: A Mixed Open-/Closed-source User Study

Bibliographic Details
Title:	Impact of LLM-based Review Comment Generation in Practice: A Mixed Open-/Closed-source User Study
Authors:	Olewicki, Doriane; Da Silva, Leuson; Mujahid, Suhaib; Amini, Arezou; Mah, Benjamin; Castelluccio, Marco; Habchi, Sarra; Khomh, Foutse; Adams, Bram
Publication Year:	2024
Collection:	Computer Science
Subject Terms:	Computer Science - Software Engineering
Description:	We conduct a large-scale empirical user study in a live setup to evaluate the acceptance of LLM-generated comments and their impact on the review process. The study was performed in two organizations: Mozilla, whose codebase is available as open source, and Ubisoft, which is fully closed-source. Inside their usual review environment, participants were given access to RevMate, an LLM-based assistive tool that suggests generated review comments. RevMate uses an off-the-shelf LLM with Retrieval-Augmented Generation to provide extra code and review context, combined with LLM-as-a-Judge to auto-evaluate the generated comments and discard irrelevant cases. Based on more than 587 patch reviews provided by RevMate, we observed that 8.1% and 7.2%, respectively, of LLM-generated comments were accepted by reviewers in each organization, while a further 14.6% and 20.5% of comments were still marked as valuable as review or development tips. Refactoring-related comments are more likely to be accepted than Functional comments (18.2% and 18.6% compared to 4.8% and 5.2%). The extra time reviewers spent inspecting generated comments or editing accepted ones (36/119) is reasonable, with an overall median of 43 s per patch. The accepted generated comments are as likely to yield future revisions of the revised patch as human-written comments (74% vs 73% at chunk level). Comment: 12 pages
Document Type:	Working Paper
Access URL:	http://arxiv.org/abs/2411.07091
Accession Number:	edsarx.2411.07091
Database:	arXiv
