Background In recent years, high-throughput genotyping techniques have been widely used to characterize new genetic variants associated with rheumatic diseases. Among these, Illumina BeadChip microarrays are a common choice because they can scan from 100,000 to more than 2,000,000 genome markers in a single assay. Although this technology was designed principally for Single Nucleotide Polymorphism (SNP) genotyping, several methods have been developed for the parallel estimation of Copy Number Variants (CNVs), albeit with limited accuracy.
Objectives To improve the genotyping of SNPs and CNVs in genome-wide association studies (GWAS) that use the Illumina platform, we have developed the CNstream2 software.
Methods Several improvements have been made to the original version of the CNstream software [1]. For CNV analysis, CNstream2 first combines the information from multiple samples to assign, for each probe, a probability to each possible number of copies. The accuracy of the call is then increased by combining the information from nearby probes. In this new version, several key aspects of the analysis algorithms have been optimized to increase the sensitivity, accuracy and computational speed of the method. We used publicly available HapMap data generated on Illumina microarrays to benchmark our method against other established methods.
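The two-step idea described in the Methods (a per-probe probability derived from multiple samples, then refinement with nearby probes) can be illustrated with a minimal sketch. This is not the CNstream2 algorithm, whose details are not given here. It assumes, purely for illustration, that the cohort median intensity of a probe represents two copies, that intensity scales linearly with copy number, that noise is Gaussian with a fixed relative standard deviation, and that neighboring probes are independent. The names `call_copy_number`, `log_likelihoods`, `COPY_STATES`, `rel_sd` and `EXAMPLE_INTENSITIES` are hypothetical.

```python
import math
import statistics

COPY_STATES = (0, 1, 2, 3, 4)  # candidate copy numbers (assumed range)


def log_likelihoods(intensity, reference, rel_sd):
    """Gaussian log-likelihood (up to a constant) of one intensity per copy state."""
    sd = rel_sd * reference  # assumes a positive reference intensity
    scores = []
    for copies in COPY_STATES:
        mean = reference * copies / 2.0  # assumption: intensity proportional to copies
        scores.append(-0.5 * ((intensity - mean) / sd) ** 2)
    return scores


def call_copy_number(intensities, window=1, rel_sd=0.15):
    """intensities[p][s]: normalized intensity of probe p (genomic order) in sample s.

    Returns calls[p][s] = (most probable copy number, its posterior probability).
    """
    n_probes = len(intensities)
    n_samples = len(intensities[0])

    # Step 1: use all samples of a probe to define its 2-copy reference level.
    per_probe = []
    for row in intensities:
        reference = statistics.median(row)
        per_probe.append([log_likelihoods(x, reference, rel_sd) for x in row])

    # Step 2: sum log-likelihoods over a window of neighboring probes.
    calls = []
    for p in range(n_probes):
        lo, hi = max(0, p - window), min(n_probes, p + window + 1)
        row_calls = []
        for s in range(n_samples):
            combined = [
                sum(per_probe[q][s][k] for q in range(lo, hi))
                for k in range(len(COPY_STATES))
            ]
            peak = max(combined)  # subtract the peak for numerical stability
            weights = [math.exp(v - peak) for v in combined]
            best = max(range(len(COPY_STATES)), key=lambda k: weights[k])
            row_calls.append((COPY_STATES[best], weights[best] / sum(weights)))
        calls.append(row_calls)
    return calls


# Toy data (invented): sample 0 carries a one-copy deletion over probes 1 to 5,
# with a noisy reading (0.80) at probe 3.
EXAMPLE_INTENSITIES = [
    [1.00, 1.02, 0.98, 1.00, 1.01],
    [0.50, 1.00, 1.03, 0.97, 1.00],
    [0.50, 0.99, 1.00, 1.01, 1.00],
    [0.80, 1.00, 1.00, 1.02, 0.98],
    [0.50, 1.00, 0.99, 1.01, 1.00],
    [0.50, 1.01, 1.00, 0.99, 1.00],
    [1.00, 1.00, 0.99, 1.01, 1.00],
]
```

With `window=0` the noisy probe 3 of sample 0 is called as two copies, whereas `window=1` recovers the one-copy call from its neighbors. A fixed sliding window like this one also blurs CNV boundaries, which is one reason a production method would need a more careful scheme.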
Results The changes introduced in the new software version have substantially increased the accuracy and sensitivity of CNV calls. Importantly, CNstream2 achieves a higher call rate than the most commonly used SNP genotyping programs, such as GeneStudio, thereby increasing the power to detect significant associations. CNstream2 also has improved computational performance and can now compute SNP and CNV genotypes in roughly one tenth of the time (e.g. genotyping 600,000 markers in 5,000 individuals is estimated to take ∼10 hours, compared to >100 hours with the previous version). Our results with HapMap samples confirm the increase in sensitivity and accuracy of SNP and CNV calls, relative to other established methods, for any of the Illumina genotyping microarray versions.
Conclusions CNstream2 is a powerful tool for biomedical researchers who conduct GWAS with the Illumina platform and want to increase the statistical power of their studies.
1. Alonso A, Julia A, Tortosa R, Canaleta C, Canete JD, Ballina J, Balsa A, Tornero J, Marsal S, et al. CNstream: a method for the identification and genotyping of copy number polymorphisms using Illumina microarrays. BMC Bioinformatics 2010;11:264.
Disclosure of Interest None Declared