Evaluation of supervised machine-learning methods for predicting appearance traits from DNA
Title: | Evaluation of supervised machine-learning methods for predicting appearance traits from DNA |
---|---|
Authors: | Katsara, Maria-Alexandra, Branicki, Wojciech, Walsh, Susan, Kayser, Manfred, Nothnagel, Michael, Ames, Carole E., Bastisch, Ingo, Bouakaze, Caroline, Carracedo, Angel, Chantrel, Yann, De la Puente, María, Delest, Anna, Gross, Theresa E., Hedman, Johannes, Heidegger, Antonia, Hollard, Clemence, Junker, Klara, Kalamara, Vivian, Kartasińska, Ewa, Khan, Shazia, Khellaf, Tarek, Lareu, Maria Victoria, Laurent, François-Xavier, Mosquera-Miguel, Ana, Niederstätter, Harald, Parson, Walther, Phillips, Christopher, Pisarek, Aleksandra, Pośpiech, Ewelina, Prainsack, Barbara, Ralf, Arwin, Revoir, Andrew, Samuel, Gabrielle, Schneider, Peter M., Schury, Nathalie, Sidstedt, Maja, Sijen, Titia, Spólnicka, Magdalena, Teodoridis, Jens, Ulus, Ayhan, Unterlander, Martina, Van der Gaag, Kris, Vannier, Julien, Ventayol-Garcia, Marina, Vidaki, Athina, Woźniak, Anna, Xavier, Catarina |
Contributors: | Genetic Identification |
Source: | Forensic Science International: Genetics, 53:102507. Elsevier Ireland Ltd |
Publication Year: | 2020 |
Subject Terms: | 0301 basic medicine, Forensic Genetics, Genetic Markers, Computer science, Datasets as Topic, Skin Pigmentation, Machine learning, computer.software_genre, Polymorphism, Single Nucleotide, Pathology and Forensic Medicine, Machine Learning, 03 medical and health sciences, 0302 clinical medicine, Classifier (linguistics), Genetics, Humans, 030216 legal & forensic medicine, Hair Color, Categorical variable, Hyperparameter, Artificial neural network, Eye Color, business.industry, DNA, Random forest, Support vector machine, 030104 developmental biology, Logistic Models, Phenotype, Trait, Artificial intelligence, business, computer, DNA phenotyping, Algorithms |
Description: | The prediction of human externally visible characteristics (EVCs) based solely on DNA information has become an established approach in forensic and anthropological genetics in recent years. While for a large set of EVCs, predictive models have already been established using multinomial logistic regression (MLR), the prediction performances of other possible classification methods have not been thoroughly investigated thus far. Motivated by the question to identify a potential classifier that outperforms these specific trait models, we conducted a systematic comparison between the widely used MLR and three popular machine learning (ML) classifiers, namely support vector machines (SVM), random forest (RF) and artificial neural networks (ANN), that have shown good performance outside EVC prediction. As examples, we used eye, hair and skin color categories as phenotypes and genotypes based on the previously established IrisPlex, HIrisPlex, and HIrisPlex-S DNA markers. We compared and assessed the performances of each of the four methods, complemented by detailed hyperparameter tuning that was applied to some of the methods in order to maximize their performance. Overall, we observed that all four classification methods showed rather similar performance, with no method being substantially superior to the others for any of the traits, although performances varied slightly across the different traits and more so across the trait categories. Hence, based on our findings, none of the ML methods applied here provide any advantage on appearance prediction, at least when it comes to the categorical pigmentation traits and the selected DNA markers used here. |
ISSN: | 1878-0326 1872-4973 |
Access URL: | https://explore.openaire.eu/search/publication?articleId=doi_dedup___::5faa155f7e90b26aad49488cce5012ac https://pubmed.ncbi.nlm.nih.gov/33831816 |
Rights: | OPEN |
Accession Number: | edsair.doi.dedup.....5faa155f7e90b26aad49488cce5012ac |
Database: | OpenAIRE |
---|