Report
Impact of LLM-based Review Comment Generation in Practice: A Mixed Open-/Closed-source User Study
Title: | Impact of LLM-based Review Comment Generation in Practice: A Mixed Open-/Closed-source User Study |
---|---|
Authors: | Olewicki, Doriane; Da Silva, Leuson; Mujahid, Suhaib; Amini, Arezou; Mah, Benjamin; Castelluccio, Marco; Habchi, Sarra; Khomh, Foutse; Adams, Bram |
Publication Year: | 2024 |
Collection: | Computer Science |
Subject Terms: | Computer Science - Software Engineering |
Description: | We conduct a large-scale empirical user study in a live setup to evaluate the acceptance of LLM-generated comments and their impact on the review process. This user study was performed in two organizations, Mozilla (which has its codebase available as open source) and Ubisoft (fully closed-source). Inside their usual review environment, participants were given access to RevMate, an LLM-based assistive tool that suggests generated review comments using an off-the-shelf LLM with Retrieval-Augmented Generation to provide extra code and review context, combined with LLM-as-a-Judge to auto-evaluate the generated comments and discard irrelevant cases. Based on more than 587 patch reviews provided by RevMate, we observed that 8.1% and 7.2%, respectively, of LLM-generated comments were accepted by reviewers in each organization, while a further 14.6% and 20.5% of comments were still marked as valuable as review or development tips. Refactoring-related comments are more likely to be accepted than Functional comments (18.2% and 18.6% compared to 4.8% and 5.2%). The extra time reviewers spent inspecting generated comments or editing accepted ones (36/119), an overall median of 43s per patch, is reasonable. The accepted generated comments are as likely to yield future revisions of the revised patch as human-written comments (74% vs 73% at chunk-level). Comment: 12 pages |
Document Type: | Working Paper |
Access URL: | http://arxiv.org/abs/2411.07091 |
Accession Number: | edsarx.2411.07091 |
Database: | arXiv |