Academic Journal

Multitask defect prediction

Bibliographic Details
Title: Multitask defect prediction
Authors: Ni, Chao; Chen, Xiang; Xia, Xin; Gu, Qing; Zhao, Yingquan
Contributors: China Scholarship Council; National Natural Science Foundation of China
Source: Journal of Software: Evolution and Process; volume 31, issue 12; ISSN 2047-7473, 2047-7481
Publisher Information: Wiley
Publication Year: 2019
Collection: Wiley Online Library (Open Access Articles via Crossref)
Description: Within-project defect prediction assumes that sufficient labeled data are available from the same project, while cross-project defect prediction assumes that plenty of labeled data are available from source projects. In practice, however, some scenarios offer only limited labeled data from both the source and target projects. In this paper, we apply multitask learning to investigate this new scenario. To the best of our knowledge, this problem (i.e., both the source project and the target project have limited labeled data) has not been thoroughly investigated, and we are the first to propose a novel multitask defect prediction approach, MASK. MASK consists of a differential evolution optimization phase and a multitask learning phase. The former phase aims to find optimal weights for shared and nonshared information in related projects (i.e., the target project and its related source projects), while the latter phase builds prediction models for each project simultaneously. To verify the effectiveness of MASK, we perform experimental studies on 18 real-world software projects and compare our approach with four state-of-the-art baseline approaches: single-task learning (STL), simple combined learning (SCL), Peters filter, and Burak filter. Experimental results show that MASK achieves an F1 of 0.397 and an AUC of 0.608 on average with only a small amount of labeled data (i.e., 10% of the data). Across the 18 projects, MASK significantly outperforms the baseline methods in terms of F1 and AUC. Therefore, by utilizing the relatedness among multiple projects, MASK performs significantly better than the state-of-the-art methods. The results confirm that MASK is promising for software defect prediction when the source and target projects both have limited training data.
Document Type: article in journal/newspaper
Language: English
DOI: 10.1002/smr.2203
Availability: http://dx.doi.org/10.1002/smr.2203
https://onlinelibrary.wiley.com/doi/pdf/10.1002/smr.2203
https://onlinelibrary.wiley.com/doi/full-xml/10.1002/smr.2203
Rights: http://onlinelibrary.wiley.com/termsAndConditions#vor
Accession Number: edsbas.A355FCF3
Database: BASE
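
Illustrative note: the abstract above mentions a differential evolution phase that searches for weights on shared and nonshared information. The following is a minimal, generic sketch of classic differential evolution (DE/rand/1/bin) in plain Python, included only to show the kind of search involved. It is not the authors' implementation: the function names, the two-weight setup, the parameter defaults, and the quadratic `toy_validation_loss` are all assumptions made for illustration.

```python
import random


def differential_evolution(objective, bounds, pop_size=20, f=0.8, cr=0.9,
                           generations=100, seed=0):
    """Minimize `objective` over box `bounds` with classic DE/rand/1/bin."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Initialize the population uniformly inside the bounds.
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    scores = [objective(ind) for ind in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # Pick three distinct individuals, none of them equal to i.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            forced = rng.randrange(dim)  # at least one gene comes from the mutant
            trial = []
            for d in range(dim):
                if d == forced or rng.random() < cr:
                    value = pop[a][d] + f * (pop[b][d] - pop[c][d])
                    lo, hi = bounds[d]
                    value = min(max(value, lo), hi)  # clip to the bounds
                else:
                    value = pop[i][d]
                trial.append(value)
            trial_score = objective(trial)
            # Greedy selection: keep the trial only if it is no worse.
            if trial_score <= scores[i]:
                pop[i], scores[i] = trial, trial_score
    best = min(range(pop_size), key=scores.__getitem__)
    return pop[best], scores[best]


def toy_validation_loss(weights):
    """Hypothetical stand-in for a validation loss; NOT the paper's objective."""
    shared, nonshared = weights
    return (shared - 0.7) ** 2 + (nonshared - 0.2) ** 2


# Search for two weights, each constrained to [0, 1].
best_weights, best_loss = differential_evolution(
    toy_validation_loss, bounds=[(0.0, 1.0), (0.0, 1.0)])
```

In an actual study the objective would presumably be a validation score of the multitask models (for example, F1 on held-out labeled data) rather than this toy quadratic; the record does not specify that detail.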