A comprehensive imputation-based evaluation of tag SNP selection strategies

Nguyen Dat Thanh, Dinh Hieu Quang, Vu Giang Minh, Nguyen Duong Thuy, Vo Nam Sy

Abstract

Although sequencing technology has developed rapidly, single nucleotide polymorphism (SNP) arrays remain widely used in large-scale genomic studies because of their cost-effectiveness. Recently, in parallel with advances in imputation strategies, several genotyping platforms have been developed for various species. Despite the importance of imputation accuracy in SNP array design, to the best of our knowledge, no systematic study has evaluated tag SNP selection methods on this metric. In this paper, using a leave-one-out cross-validation approach on the 1000 Genomes high-coverage dataset, we comprehensively evaluated four well-known tag SNP selection algorithms based on imputation accuracy. Our results showed that although all of the widely used methods for SNP array design provide reasonable imputation accuracy, the pairwise linkage disequilibrium-based tag SNP selection algorithm achieves the best performance. Our pipelines for running the evaluated algorithms and the leave-one-out cross-validation are publicly available at https://github.com/datngu/TagSNP_evaluation.
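
As an illustration of the general idea behind pairwise linkage disequilibrium-based tag SNP selection, the following is a minimal sketch of a greedy tagging procedure. It is not the authors' pipeline (which is in the repository linked above). It assumes genotypes are coded as dosages (0/1/2), approximates LD by the squared Pearson correlation of those dosages, and uses an r² threshold of 0.8 as an illustrative default. The function names and the SNP identifiers are hypothetical.

```python
from statistics import mean


def r_squared(x, y):
    """Squared Pearson correlation between two genotype dosage vectors (0/1/2).

    Assumption: dosage correlation is used as a stand-in for haplotype-based r^2.
    """
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # monomorphic SNP: correlation undefined, treat as not in LD
    return cov * cov / (vx * vy)


def greedy_pairwise_tags(genotypes, threshold=0.8):
    """Greedily pick tag SNPs so that every SNP has r^2 >= threshold with a tag.

    genotypes: dict mapping SNP id -> list of dosages, one per sample.
    threshold: illustrative default, not taken from the paper.
    """
    snps = list(genotypes)
    # covers[s] = SNPs that s can tag (every SNP tags itself)
    covers = {
        s: {t for t in snps
            if s == t or r_squared(genotypes[s], genotypes[t]) >= threshold}
        for s in snps
    }
    untagged = set(snps)
    tags = []
    while untagged:
        # pick the SNP that tags the most still-untagged SNPs (ties: first listed)
        best = max(snps, key=lambda s: len(covers[s] & untagged))
        tags.append(best)
        untagged -= covers[best]
    return tags


# Toy data with hypothetical SNP ids: rs1, rs2 and rs3 are in perfect LD,
# while rs4 is uncorrelated with them.
toy = {
    "rs1": [0, 1, 2, 0, 1, 2],
    "rs2": [0, 1, 2, 0, 1, 2],
    "rs3": [2, 1, 0, 2, 1, 0],
    "rs4": [0, 0, 1, 2, 2, 1],
}
print(greedy_pairwise_tags(toy))  # ['rs1', 'rs4']
```

In this toy case two tags are enough: rs1 stands in for rs2 and rs3, and rs4 has to be genotyped directly because no other SNP predicts it.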

Publisher: Proceedings International Conference on Knowledge and Systems Engineering, KSE

ISSN (Electronic): 2694-4804

Keywords

  • genotyping imputation
  • linkage disequilibrium
  • SNP array design
  • Tag SNP selection

ASJC Scopus subject areas

  • Artificial Intelligence
  • Computer Networks and Communications
  • Computer Science Applications
  • Information Systems
  • Software
  • Control and Systems Engineering

Publication year

2021
