ARTICLE

Genomic data compression

Authors: Hernáez Arrazola, Mikel (Corresponding author); Pavlichin, D.; Weissman, T.; Ochoa Álvarez, Idoia
Journal title: ANNUAL REVIEW OF BIOMEDICAL DATA SCIENCE
ISSN: 2574-3414
Volume: 2
Pages: 19-37
Publication date: 2019
Abstract:
Recently, there has been growing interest in genome sequencing, driven by advances in sequencing technology, in terms of both efficiency and affordability. These developments have allowed many to envision whole-genome sequencing as an invaluable tool for both personalized medical care and public health. As a result, increasingly large and ubiquitous genomic data sets are being generated. This poses a significant challenge for the storage and transmission of these data. Already, it is more expensive to store genomic data for a decade than it is to obtain the data in the first place. This situation calls for efficient representations of genomic information. In this review, we emphasize the need for designing specialized compressors tailored to genomic data and describe the main solutions already proposed. We also give general guidelines for storing these data and conclude with our thoughts on the future of genomic formats and compressors.