
Concordance rate between copy number variants detected using either high- or medium-density single nucleotide polymorphism genotype panels and the potential of imputing copy number variants from flanking high density single nucleotide polymorphism haplotypes in cattle


Summary

- The main aim here was to determine if it is possible to impute copy number variants (CNVs) using the flanking single nucleotide polymorphism (SNP) haplotype structure in cattle.
- The concordance between CNVs called from the medium-density and high-density genotypes was calculated separately for each animal.
- A subset of CNVs which were called from the high-density genotypes was selected for imputation.
- A CNV was deemed to be imputed correctly when the called copy number matched the imputed copy number.
- Full list of author information is available at the end of the article.
- Results: For 97.0% of CNVs called from the high-density genotypes, the corresponding genomic position on the medium-density genotypes of the same animal did not contain a called CNV.
- The average accuracy of imputation of the CNV normal state, i.e.
- Conclusion: The vast majority of CNVs called from the high-density genotypes were not detected using the medium-density genotypes.
- A copy number variant (CNV) is a form of genetic variation that arises from a deletion or duplication of a stretch of DNA [1].
- Copy number variants are a common feature of the bovine genome, with the average number of CNVs per individual, identified from high-density genotype data, ranging from 18 to 51 [3–5].
- The objective of the present study was to quantify the accuracy of imputing CNVs detected using CNV calling algorithms from the haplotypes of flanking high density SNPs in cattle.
- Comparison of CNVs called from the high-density and medium-density genotypes.
- The median numbers of CNVs per animal called from the medium-density and high-density genotypes were 2 and 27, respectively.
- For 97.0% of the CNVs called from the high-density genotypes, a CNV was not detected in the same genomic region of the same animal using the medium-density genotype.
- For 87.4% of the CNVs called from the high-density genotypes, the same genomic region on the medium-density genotype had fewer than 3 SNPs.
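
The per-animal comparison described above can be sketched as an interval-overlap check. This is only an illustration: the summary does not state the overlap criterion used, so the sketch assumes that sharing at least one base pair on the same chromosome counts as overlap, and all function names are hypothetical.

```python
from typing import List, Tuple

# (chromosome, start_bp, end_bp) with inclusive coordinates; an assumed layout.
Interval = Tuple[str, int, int]


def overlaps(a: Interval, b: Interval) -> bool:
    """True when both intervals lie on the same chromosome and share a base."""
    return a[0] == b[0] and a[1] <= b[2] and b[1] <= a[2]


def fraction_hd_without_md(hd_cnvs: List[Interval], md_cnvs: List[Interval]) -> float:
    """Fraction of one animal's high-density CNVs with no overlapping medium-density CNV."""
    if not hd_cnvs:
        raise ValueError("no high-density CNVs for this animal")
    missed = sum(
        1 for hd in hd_cnvs if not any(overlaps(hd, md) for md in md_cnvs)
    )
    return missed / len(hd_cnvs)


def snps_in_region(cnv: Interval, snp_positions: List[Tuple[str, int]]) -> int:
    """Number of panel SNPs (chromosome, position) that fall inside the CNV."""
    return sum(
        1 for chrom, pos in snp_positions
        if chrom == cnv[0] and cnv[1] <= pos <= cnv[2]
    )
```

Counting the medium-density SNPs inside each high-density CNV region, as in `snps_in_region`, is one way to check whether a missed CNV sits in a region with fewer than three SNPs on the sparser panel.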
- in all cases, the imputed copy number did not match the called copy number.
- The relationship between the accuracy of imputation and the population frequency of the CNV, and the relationship between the accuracy of imputation and the genomic length of the CNV is in Figs.
- neither of the correlations differed from zero for any of the three breeds.
- In the Charolais, and all three breeds combined, there was no difference in the Bayes factor between CNVs where the called and imputed copy number matched versus CNVs where the imputed and called copy number did not match.
- In addition to the imputation accuracy, the adjusted Rand Index was calculated separately for each breed to quantify the agreement between the called copy number and the imputed copy number of a CNV.
- The adjusted Rand index was 0.524 for the Charolais, 0.361 for the Limousins, and 0.285 for the Holstein-Friesians, meaning there was more similarity between the called copy number and the imputed copy number of the CNVs than was expected by chance, albeit not a very strong agreement, given that the maximum value the adjusted Rand index can take is 1.
- thus CNVs are likely to contribute to some of the underlying genetic variability.
- The CNVs called from the high-density genotypes are grouped separately based on the degree of overlap of the genomic position of the CNVs called from the high and medium density genotypes.
- Table 2 The first quartile, median, and third quartile of the accuracy of imputation of CNVs grouped by called copy number and breed.
- Summary statistics for duplications ( n = 4) were not included because for each duplication the imputed copy number did not match the called copy number.
- The present study is the first such in cattle to directly compare CNVs called from medium-density and high-density genotypes in the same animals.
- for 84.7% of CNVs called from the high-density genotypes, the same genomic region of the CNV on the medium-density genotype panel had fewer than three SNPs.
- 50,000 SNPs) have reported between 1 and 7 CNVs per animal [25, 26], which is consistent with the results of the present study.
- Fig. 1 Scatter plot of the percentage imputation accuracy against the percentage population frequency of each CNV.
- A CNV was deemed to be correctly imputed when the called copy number matched the imputed copy number.
- Furthermore, there was a positive relationship between the length of the run of homozygosity identified from the high-density genotypes and the probability of overlap with a run of homozygosity identified from the medium-density genotype in the same animal [29].
- This pattern of overlap is analogous to the pattern of overlap observed in the present study for CNVs called from the medium-density and high-density genotypes.
- The median number of CNVs called per animal from the medium-density genotype in the present study was 2, but it was 27 for the high-density genotypes.
- given that the false positive rate of CNVs called from PennCNV and QuantiSNP is reported to be it suggests that most of the CNVs called from the high-density genotype panel are in fact true CNVs.
- It may be the case that many of the additional CNVs called from whole genome sequence are true CNVs that cannot or are unlikely to be called from high-density SNP data.
- Table 3 The location and population frequency of CNVs with an accuracy of at least 85% within at least one of the three breeds.
- One of the selection criteria for including SNPs on a genotype panel is high genotyping accuracy [31].
- Therefore, genomic regions that are frequently subject to copy number variation may be poorly represented by SNPs on genotype panels.
- [33] used Beagle V4.0 to impute CNV duplications called from whole genome sequence in 849 people sequenced as part of the 1000 Genomes Project.
- [33] reported that the correlation between the actual copy number and the imputed copy number of a CNV was uniformly distributed between 0 and 100% with an average accuracy of approximately 50%.
- [35] developed the polyHap 2.0 software package to impute the copy number of SNPs from genotype data.
- [35] deemed the copy number of a SNP to be correctly imputed when the called copy number matched the imputed copy number.
- [33], the validation populations contained only the genotype data of the flanking SNPs/nucleotides.
- which the copy number and genotypes of the flanking SNPs were actually known.
- [35] imputed to a validation population in which the copy number state of the flanking SNPs was known, it is expected that imputation would be more accurate than if the copy number of the SNPs in the validation population was not known.
- This is because a CNV is a continuous stretch of DNA that displays a gain or loss in copy number and therefore the copy number of an individual SNP can often be inferred from the copy number of its flanking SNPs.
- The average accuracy of imputation for the deletion CNVs in the present study was 28.6%, meaning that across all animals with a called deletion CNV, the called copy number matched the imputed copy number in only 28.6% of cases.
- For all 4 duplication CNVs examined, the imputed copy number never matched the called copy number.
- Furthermore, across all three breeds, the Bayes factor of CNVs was not different between the CNVs whose called copy number matched the imputed copy number and the CNVs whose called and imputed copy number did not match.
- one of the selection criteria for SNPs to be included on a genotype panel is high minor allele frequency (MAF .
- Successful imputation of CNVs from SNP genotype data may require the use of SNPs which have a MAF similar to the MAF of the CNVs to be imputed.
- Where it is known that a CNV is associated with, or contributes to, a phenotype, that region of the genome should be more densely populated with SNPs on a SNP genotype panel, enabling improved accuracy in the identification of CNVs associated with production in cattle.
- The position of the SNPs in the BovineHD BeadChip genotype panel was based on the UMD 3.1 build of the bovine genome [40].
- The LRR of a SNP is the log of the observed probe hybridization intensity.
- it is a measure of the fluorescence intensity produced by hybridization of a probe to a SNP array.
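
As a rough illustration of the log R ratio (LRR), the sketch below assumes the definition commonly used with SNP-array CNV callers: the base-2 logarithm of the observed probe intensity divided by the intensity expected for a normal two-copy state. The extracted sentence above does not spell out the denominator or the base, so both are assumptions here, and the function name is hypothetical.

```python
from math import log2


def log_r_ratio(observed_intensity: float, expected_intensity: float) -> float:
    """Assumed LRR definition: log2(observed intensity / expected intensity)."""
    if observed_intensity <= 0 or expected_intensity <= 0:
        raise ValueError("intensities must be positive")
    return log2(observed_intensity / expected_intensity)
```

Under this assumed definition, a value near 0 indicates the expected intensity, negative values are suggestive of a deletion, and positive values are suggestive of a duplication.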
- this applied to CNVs called from both the high-density and the medium-density genotypes separately.
- The GC content of the genome was calculated from the UMD_3.1.1 / bosTau8 genome, compiled as of June 2014.
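
A minimal sketch of a GC content calculation for one stretch of sequence follows. The window size and the handling of ambiguous bases are not given in the summary, so the sketch assumes that any character other than A, C, G, or T (for example N) is excluded from the denominator; the function name is hypothetical.

```python
def gc_content(sequence: str) -> float:
    """Fraction of unambiguous bases (A, C, G, T) that are G or C."""
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    unambiguous = gc + seq.count("A") + seq.count("T")
    if unambiguous == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return gc / unambiguous
```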
- Comparison of CNVs from high-density and medium-density SNP genotypes.
- The medium-density genotype panel used in the present study contained 45,677 SNPs.
- Copy number variants were called from the high-density genotypes of each animal in the population using both PennCNV and QuantiSNP.
- Copy number variant imputation.
- Copy number variant imputation from SNP genotype data.
- The reason for selecting CNVs which were present in at least 30 animals within breed was to avoid small sample bias when comparing the imputed copy number of the CNV to the called copy number of the CNV.
- the actual position chosen for the variant was the midpoint of the CNV.
- Imputation was performed separately with and 500 SNPs flanking each side of the midpoint of the CNV for both FImpute and Beagle.
- The SNPs used for imputation flanked the midpoint of the CNV.
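
Choosing the SNPs that flank the midpoint of a CNV can be sketched as below. The sketch assumes SNP positions for one chromosome are already sorted, takes the midpoint as the integer mean of the CNV start and end, and leaves out a SNP that sits exactly on the midpoint; these choices and the function names are illustrative assumptions, not the study's pipeline.

```python
from bisect import bisect_left
from typing import List, Tuple


def cnv_midpoint(start_bp: int, end_bp: int) -> int:
    """Integer midpoint of a CNV, used as the position of the variant."""
    return (start_bp + end_bp) // 2


def flanking_snps(
    sorted_positions: List[int], midpoint: int, n_each_side: int
) -> Tuple[List[int], List[int]]:
    """Return up to n_each_side SNP positions on each side of the midpoint."""
    idx = bisect_left(sorted_positions, midpoint)
    left = sorted_positions[max(0, idx - n_each_side):idx]
    # Skip a SNP located exactly at the midpoint (an assumption of this sketch).
    at_midpoint = idx < len(sorted_positions) and sorted_positions[idx] == midpoint
    right_start = idx + (1 if at_midpoint else 0)
    right = sorted_positions[right_start:right_start + n_each_side]
    return left, right
```

Near a chromosome end, fewer than `n_each_side` SNPs may be available on one side, and the sketch simply returns what exists.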
- A CNV was deemed to be correctly imputed when the copy number of the imputed CNV matched the copy number of the called CNV.
- The imputation accuracy was calculated separately for each copy number as called by PennCNV and QuantiSNP.
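
The accuracy definition above (an imputed CNV is correct when its copy number equals the called copy number) can be sketched as a per-copy-number match rate. The function name and the input layout, two parallel lists with one entry per animal-CNV pair, are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, Sequence


def accuracy_by_called_copy_number(
    called: Sequence[int], imputed: Sequence[int]
) -> Dict[int, float]:
    """Proportion of correctly imputed CNVs within each called copy number."""
    if len(called) != len(imputed):
        raise ValueError("called and imputed must have the same length")
    totals: Dict[int, int] = defaultdict(int)
    matches: Dict[int, int] = defaultdict(int)
    for called_cn, imputed_cn in zip(called, imputed):
        totals[called_cn] += 1
        if called_cn == imputed_cn:
            matches[called_cn] += 1
    return {cn: matches[cn] / totals[cn] for cn in totals}
```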
- The adjusted Rand index [45] was used to assess the agreement between the called copy number of the CNVs and the imputed copy number of the CNVs.
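
For readers unfamiliar with the adjusted Rand index, the sketch below computes it from pair counts in the contingency table of called versus imputed copy numbers, following the standard Hubert and Arabie form. It is a generic illustration with hypothetical names, not the code used in the study.

```python
from collections import Counter
from math import comb
from typing import Hashable, Sequence


def adjusted_rand_index(
    called: Sequence[Hashable], imputed: Sequence[Hashable]
) -> float:
    """Adjusted Rand index between two labelings of the same items."""
    if len(called) != len(imputed):
        raise ValueError("labelings must have the same length")
    n = len(called)
    if n < 2:
        raise ValueError("at least two items are required")
    # Pairs of items that share a label in both, or in each single, labeling.
    sum_both = sum(comb(c, 2) for c in Counter(zip(called, imputed)).values())
    sum_called = sum(comb(c, 2) for c in Counter(called).values())
    sum_imputed = sum(comb(c, 2) for c in Counter(imputed).values())
    expected = sum_called * sum_imputed / comb(n, 2)
    maximum = (sum_called + sum_imputed) / 2
    if maximum == expected:
        # Degenerate case, e.g. every item carries the same label in both labelings.
        return 1.0
    return (sum_both - expected) / (maximum - expected)
```

A value of 1 indicates identical groupings, a value near 0 indicates agreement no better than chance, and negative values indicate less agreement than expected by chance.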
- The Pearson correlation coefficient was used to calculate the correlation between the accuracy of imputation and the population frequency of the CNV, as well as between the accuracy of imputation and the genomic length of the CNV.
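
The Pearson correlation between per-CNV imputation accuracy and a second per-CNV quantity (population frequency or genomic length) can be sketched in plain Python as follows. The function name is hypothetical, and the inputs are assumed to be two equal-length lists with one value per CNV.

```python
from math import sqrt
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two samples of equal length with at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / sqrt(sxx * syy)
```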
- An ANOVA analysis was used to determine if there was a difference in the Bayes factor between CNVs where the called and imputed copy number matched, and CNVs where the called and imputed copy number did not match.
- CNV: copy number variant.
- Fine mapping of copy number variations on two cattle genome assemblies using high density SNP array.
- Genome-wide detection of copy number variations using high-density SNP genotyping platforms in Holsteins.
- Copy number variation in livestock: a mini review.
- PennCNV: an integrated hidden Markov model designed for high resolution copy number variation detection in whole-genome SNP genotyping data.
- QuantiSNP: an objective Bayes hidden-Markov model to detect and accurately map copy number variation using SNP genotyping data.
- Copy number.
- A genome-wide association study of copy number variations with umbilical hernia in swine.
- FCGR3B copy number variation is associated with susceptibility to systemic, but not organ specific, autoimmunity.
- Genome-wide identification of copy number variation using high-density single nucleotide polymorphism array in Japanese black cattle.
- Characterisation of copy number variants in a large multi-breed population of beef and dairy cattle high-density single nucleotide polymorphism genotype data.
- Comparative analyses of seven algorithms for copy number variant identification from single nucleotide polymorphism arrays.
- Comprehensive assessment of array-based platforms and calling algorithms for detection of copy number variants.
- Genomic characteristics of cattle copy number variations.
- Genomic population structure and prevalence of copy number variations in South African Nguni cattle.
- A genome-wide scan for copy number variations using high-density single nucleotide polymorphism array in Simmental cattle.
- A large interactive visual database of copy number variants discovered in taurine cattle.
- Large multiallelic copy number variations in humans.
- A whole-genome assembly of the domestic cow, Bos taurus.
