Dissertation/Thesis

Optimizacija procesa odabira značajki u strojnom učenju ; Optimization of Feature Selection Process in Machine Learning

Bibliographic Details
Title: Optimizacija procesa odabira značajki u strojnom učenju ; Optimization of Feature Selection Process in Machine Learning
Authors: Kumir, Marko
Contributors: Zaharija, Goran
Publication Information: Sveučilište u Splitu. Prirodoslovno-matematički fakultet.
University of Split. Faculty of Science.
Publication Year: 2024
Collection: Croatian Digital Theses Repository (National and University Library in Zagreb)
Subject Terms: odabir značajki, strojno učenje, kreditni rizik, filter metode, ugrađene metode, SMOTE, target encoding, redukcija dimenzionalnosti, feature selection, machine learning, credit risk, filter methods, embedded methods, dimensionality reduction, TEHNIČKE ZNANOSTI. Računarstvo, TECHNICAL SCIENCES. Computing
Description: Properly selected features play a crucial role in improving the performance of machine learning models, reducing data complexity, and speeding up training. After a detailed literature review conducted following the PRISMA guidelines, the focus was placed on the financial domain, specifically on credit risk assessment in banking. In the experimental part of the thesis, various filter methods and embedded methods were tested on three credit datasets. To optimize the process, testing was carried out on the 50% most relevant features. The results showed that the RF (random forest) and XGB (XGBoost) models trained on the reduced datasets achieved high performance compared with models trained on the initial datasets.
Document Type: master thesis
File Description: application/pdf
Language: Croatian
Relation: https://zir.nsk.hr/islandora/object/pmfst:1984; https://urn.nsk.hr/urn:nbn:hr:166:569220; https://repozitorij.svkst.unist.hr/islandora/object/pmfst:1984; https://repozitorij.svkst.unist.hr/islandora/object/pmfst:1984/datastream/PDF
Availability: https://zir.nsk.hr/islandora/object/pmfst:1984
https://urn.nsk.hr/urn:nbn:hr:166:569220
https://repozitorij.svkst.unist.hr/islandora/object/pmfst:1984
https://repozitorij.svkst.unist.hr/islandora/object/pmfst:1984/datastream/PDF
Rights: http://rightsstatements.org/vocab/InC/1.0/ ; info:eu-repo/semantics/openAccess
Accession Number: edsbas.883D1772
Database: BASE
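
The reduction step summarized in the description (keeping the 50% most relevant features before training) can be illustrated with a minimal sketch. This record does not say which filter methods the thesis used, so the example assumes one common filter, the absolute Pearson correlation between each feature and the label. The data is synthetic, and the feature names (`income`, `debt_ratio`, `noise_a`, `noise_b`) and the helper `select_top_half` are hypothetical, not taken from the thesis.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences; 0.0 if either is constant."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def select_top_half(rows, labels, feature_names):
    """Rank features by |correlation with the label| and keep the top 50%."""
    scores = {}
    for j, name in enumerate(feature_names):
        column = [row[j] for row in rows]
        scores[name] = abs(pearson(column, labels))
    ranked = sorted(feature_names, key=lambda name: scores[name], reverse=True)
    keep = max(1, len(feature_names) // 2)  # assumption: round down, keep at least one
    return ranked[:keep], scores


# Synthetic toy data (not from the thesis): two informative and two noise features.
random.seed(0)
feature_names = ["income", "debt_ratio", "noise_a", "noise_b"]
rows, labels = [], []
for _ in range(200):
    default = random.random() < 0.3
    income = random.gauss(30 if default else 50, 10)
    debt_ratio = random.gauss(0.6 if default else 0.3, 0.1)
    rows.append([income, debt_ratio, random.gauss(0, 1), random.gauss(0, 1)])
    labels.append(1 if default else 0)

selected, scores = select_top_half(rows, labels, feature_names)
print(selected)
```

A classifier would then be trained once on all columns and once on only the `selected` columns, and the two results compared. Embedded methods differ in that the ranking comes from the model itself, for example from tree-based feature importances, rather than from a statistic computed before training.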