Dissertation/Thesis
Optimizacija procesa odabira značajki u strojnom učenju ; Optimization of Feature Selection Process in Machine Learning
Title: | Optimizacija procesa odabira značajki u strojnom učenju ; Optimization of Feature Selection Process in Machine Learning |
---|---|
Authors: | Kumir, Marko |
Contributors: | Zaharija, Goran |
Publisher Information: | Sveučilište u Splitu. Prirodoslovno-matematički fakultet. University of Split. Faculty of Science. |
Publication Year: | 2024 |
Collection: | Croatian Digital Theses Repository (National and University Library in Zagreb) |
Subject Terms: | odabir značajki, strojno učenje, kreditni rizik, filter metode, ugrađene metode, SMOTE, target encoding, redukcija dimenzionalnosti, feature selection, machine learning, credit risk, filter methods, embedded methods, dimensionality reduction, TEHNIČKE ZNANOSTI. Računarstvo, TECHNICAL SCIENCES. Computing |
Description: | Properly selected features play a crucial role in improving the performance of machine learning models, reducing data complexity, and speeding up training. After a detailed literature review conducted following the PRISMA guidelines, the focus was placed on the financial domain, specifically on credit risk assessment in banking. In the experimental part of the thesis, various filter methods and embedded methods were tested on three credit datasets. To optimize the process, testing was carried out on the 50% most influential features. The results showed that the random forest (RF) and XGBoost (XGB) models trained on the reduced datasets achieved high performance compared to models trained on the initial datasets. |
Document Type: | master thesis |
File Description: | application/pdf |
Language: | Croatian |
Relation: | https://zir.nsk.hr/islandora/object/pmfst:1984; https://urn.nsk.hr/urn:nbn:hr:166:569220; https://repozitorij.svkst.unist.hr/islandora/object/pmfst:1984; https://repozitorij.svkst.unist.hr/islandora/object/pmfst:1984/datastream/PDF |
Availability: | https://zir.nsk.hr/islandora/object/pmfst:1984 https://urn.nsk.hr/urn:nbn:hr:166:569220 https://repozitorij.svkst.unist.hr/islandora/object/pmfst:1984 https://repozitorij.svkst.unist.hr/islandora/object/pmfst:1984/datastream/PDF |
Rights: | http://rightsstatements.org/vocab/InC/1.0/ ; info:eu-repo/semantics/openAccess |
Accession Number: | edsbas.883D1772 |
Database: | BASE |