EFFICIENT FLOWERING CLASSIFICATION BASED ON DEEP LEARNING AND MARKER DATA IN MAIZE INBRED LINES

Bibliographic details
Title: EFFICIENT FLOWERING CLASSIFICATION BASED ON DEEP LEARNING AND MARKER DATA IN MAIZE INBRED LINES
Authors: Galić, Vlatko, Jambrović, Antun, Brkić, Andrija, Zdunić, Zvonimir, Šimić, Domagoj
Contributors: Anđelković, Violeta, Srdić, Jelena, Nikolić, Milica
Publication year: 2022
Subject terms: machine learning, neural network, maize, SNP 50k array, flowering time
Description: The advent of deep learning methods such as convolutional neural networks (CNN) represents a new avenue in the analysis of biological data in the “Breeding 4.0” era. The power of this approach lies in its capacity for feature extraction, combined with an architecture of layers of interconnected neurons that share fragments of information. Such information flow, coupled with powerful means of dimension reduction based on spatial coherence (linkage disequilibrium) and pooling, might provide a good method for the analysis of dense genotypic data. A total of 1066 maize inbred lines developed at the Agricultural Institute Osijek were screened for distinctness, uniformity and stability (DUS), and their flowering window was classified, relative to checks, into groups 1 to 7, with the earliest inbreds such as CM7 belonging to group 1, most European flints to group 2, PHJ40 to group 3, PHP02 to group 4, Oh43 to group 5, B73 and Mo17 to group 6, and the latest-flowering inbreds F118 and HBA1 to group 7. All inbreds were genotyped with the Illumina MaizeSNP50 array. Missing and heterozygous positions were filtered (5% and 2.5%, respectively), leaving 48734 markers that were imputed with LinkImpute and one-hot recoded. A convolutional neural network was set up with TensorFlow 2 in Python. Model validation with an external dataset showed >93% classification accuracy, while all of the ~7% misclassifications fell into neighboring groups (±1). A priori classification of germplasm can facilitate improvement of germplasm-environment compatibility. Novel machine learning algorithms show promise in the analysis of complex nonlinear data, and their deployment in breeding programs needs to be studied further.
Language: English
Access URL: https://explore.openaire.eu/search/publication?articleId=57a035e5b1ae::c2eece6dc34e5ff787188c07bc885acb
https://www.bib.irb.hr/1246366
Rights: CLOSED
Accession number: edsair.57a035e5b1ae..c2eece6dc34e5ff787188c07bc885acb
Database: OpenAIRE
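
Illustrative note: the description mentions one-hot recoding of markers and an evaluation in which every misclassification fell within ±1 flowering group. The record does not give the authors' code, so the following is only a minimal sketch under stated assumptions: the genotype coding (0, 1, 2) and the helper names `one_hot_encode` and `accuracy_within` are hypothetical and are not taken from the publication.

```python
# Assumed genotype coding (hypothetical, not from the publication):
# 0 = homozygous reference, 1 = heterozygous, 2 = homozygous alternate.
GENOTYPE_STATES = (0, 1, 2)


def one_hot_encode(genotypes):
    """Turn a sequence of genotype codes into one-hot vectors, one per marker."""
    encoded = []
    for g in genotypes:
        if g not in GENOTYPE_STATES:
            raise ValueError(f"unexpected genotype code: {g!r}")
        encoded.append([1 if g == state else 0 for state in GENOTYPE_STATES])
    return encoded


def accuracy_within(true_groups, predicted_groups, tolerance=0):
    """Share of predictions whose group lies within `tolerance` of the true group.

    tolerance=0 gives plain classification accuracy; tolerance=1 also counts
    predictions that land in a neighboring flowering group.
    """
    if len(true_groups) != len(predicted_groups):
        raise ValueError("inputs must have the same length")
    if not true_groups:
        raise ValueError("inputs must not be empty")
    hits = sum(
        abs(t - p) <= tolerance for t, p in zip(true_groups, predicted_groups)
    )
    return hits / len(true_groups)


if __name__ == "__main__":
    # Toy data only; these values are made up for illustration.
    print(one_hot_encode([0, 2, 1]))
    true_groups = [1, 2, 3, 7]
    predicted = [1, 3, 3, 5]
    print(accuracy_within(true_groups, predicted))
    print(accuracy_within(true_groups, predicted, tolerance=1))
```

In this reading, the result reported in the description would correspond to an exact accuracy above 0.93 and a tolerance-1 accuracy of 1.0 on the external validation set.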