Publication Details

ARTICLE

Unsupervised ensemble learning for genome sequencing

Authors: Pages-Zamora, A. (Corresponding author); Ochoa Álvarez, Idoia; Ruiz-Cavero, G.; Villalvilla-Ornat, P.
Journal title: PATTERN RECOGNITION
ISSN: 0031-3203
Volume: 129
Pages: 108721
Publication date: 2022
Abstract:
Unsupervised ensemble learning refers to methods devised for a particular task that combine data provided by decision learners taking into account their reliability, which is usually inferred from the data. Here, the variant calling step of the next generation sequencing technologies is formulated as an unsupervised ensemble classification problem. A variant calling algorithm based on the expectation-maximization algorithm is further proposed that estimates the maximum-a-posteriori decision among a number of classes larger than the number of different labels provided by the learners. Experimental results with real human DNA sequencing data show that the proposed algorithm is competitive compared to state-of-the-art variant callers such as GATK, HTSLIB, and Platypus. © 2022 The Author(s). Published by Elsevier Ltd. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)
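
To illustrate the general idea of unsupervised ensemble classification with expectation-maximization, the following is a minimal sketch of a classic Dawid-Skene-style EM, not the algorithm proposed in the article. It assumes the number of classes equals the number of labels the learners can emit, whereas the article handles more classes than labels. The function name `em_ensemble`, its parameters, and the toy data are all hypothetical.

```python
import math


def em_ensemble(votes, n_classes, n_iter=50, smoothing=0.01):
    """Dawid-Skene-style EM (illustrative sketch, not the paper's method).

    votes[i][j] is the label (0..n_classes-1) given by learner j to item i.
    Returns (MAP decisions, per-item posteriors).
    """
    n_items = len(votes)
    n_learners = len(votes[0])

    # Initialise posteriors with a smoothed majority vote.
    post = []
    for row in votes:
        counts = [smoothing] * n_classes
        for label in row:
            counts[label] += 1.0
        total = sum(counts)
        post.append([c / total for c in counts])

    for _ in range(n_iter):
        # M-step: class priors and one confusion matrix per learner,
        # which plays the role of the learner's inferred reliability.
        prior = [sum(p[k] for p in post) / n_items for k in range(n_classes)]
        conf = [[[smoothing] * n_classes for _ in range(n_classes)]
                for _ in range(n_learners)]
        for i, row in enumerate(votes):
            for j, label in enumerate(row):
                for k in range(n_classes):
                    conf[j][k][label] += post[i][k]
        for j in range(n_learners):
            for k in range(n_classes):
                s = sum(conf[j][k])
                conf[j][k] = [v / s for v in conf[j][k]]

        # E-step: posterior over the true class of each item (log domain).
        new_post = []
        for row in votes:
            logp = []
            for k in range(n_classes):
                lp = math.log(max(prior[k], 1e-12))
                for j, label in enumerate(row):
                    lp += math.log(conf[j][k][label])
                logp.append(lp)
            m = max(logp)
            w = [math.exp(v - m) for v in logp]
            t = sum(w)
            new_post.append([v / t for v in w])
        post = new_post

    # Maximum-a-posteriori decision per item.
    decisions = [max(range(n_classes), key=lambda k: p[k]) for p in post]
    return decisions, post


# Toy data: two learners that agree and a third that always disagrees.
toy_votes = [[0, 0, 1]] * 4 + [[1, 1, 0]] * 4
toy_decisions, toy_post = em_ensemble(toy_votes, n_classes=2)
```

In this toy setting the per-learner confusion matrices let EM treat the third learner as systematically inverted rather than merely outvoted, which is the sense in which reliability is inferred from the data alone.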
Impact: