
MONTAGE: A new tool for high-throughput detection of mosaic copy number variation


Summary

- Mosaicism is a prevalent and impactful class of non-integer state copy number variation (CNV).
- Mosaicism implies that certain cell types or subsets of cells contain a CNV in a segment of the genome while other cells in the same individual do not.
- Results: We developed a tool called Montage to improve the accuracy of detection of mosaic copy number variants in a high-throughput fashion.
- We additionally investigated the allele imbalance observations genome-wide to define non-diploid and non-integer copy number states.
- Conclusions: Our novel algorithm presents an efficient tool with fast computational runtime and high levels of accuracy of mosaic CNV detection.
- A curated mosaic CNV callset of 3716 events in 2269 samples is presented with comparability to previous reports and disease phenotype associations.
- Keywords: Mosaicism, Mosaic, Copy number variation, Genomics.
- Mosaic CNV creation mechanisms include chromosome nondisjunction, anaphase lag, and endoreplication.
- Mosaic CNV detection is important in clinical settings for accurate assessment and estimation of disease recurrence risk [2–5].
- Author affiliation: Department of Pediatrics, Perelman School of Medicine, University of Pennsylvania, 3400 Civic Center Blvd, Philadelphia, PA 19104, USA.
- The goal of the algorithm was to deliver results in a high-throughput way with few spurious calls and to successfully detect mosaic events multiple megabases in size.
- The detected mosaic regions are further classified as copy-loss, copy-gain, or copy-neutral events based on the alteration of the LRR from baseline.
- SNP Name with associated chromosome and base pair position can be specified separately or combined in the input.
- If the input is detected to be sorted already, only the mosaic CNV detection portion of the code runs, taking 10 s per sample (Table 1).
- Table 1 Performance Comparison of Mosaic CNV Detection Tools.
- Values in parentheses are observed / expected mosaic CNV calls.
- A bash awk statement embedded in the Perl code efficiently runs a sliding window of 1 Mb with 1 Mb increments to roughly assess potential regions of mosaicism.
- We record the first and last base pair position of mosaic evidence in these intervals to provide specific breakpoints (Fig.
- Efficient coding with minimal dependencies allows rapid and easy deployment of the software.
- The BAF ranges used are tallied in the following intervals .
- This presents the key command in the script.
- We dynamically assess the column header to determine the presence and order of the SNP Name, Chromosome, Position, B Allele Freq, and Log R Ratio columns in the user-provided input files.
- Mosaic deletions show clearing of the AB band at BAF 0.5 and relatively equal banding of the AAB and ABB range genotypes reflected by BAF.
- Using a highly efficient awk bash command, we run a non-overlapping sliding window algorithm to tally BAF in the two ranges indicative of mosaic deletion, and to compute the standard deviation of these observations as a measure of clarity (lack of noise) in the signal observed in a given sample.
- PennCNV (version 1.0.4) component script clean_cnv was used to combine segments in close proximity into one merged mosaic CNV call.
- We wrote our code in the extensible Perl and Bash programming languages, as opposed to RGADA-MAD, which is written in R.
- We assessed the performance of existing mosaic CNV detection algorithms (Table 1).
- We evaluated mosaic events in 367,785 samples and found 3716 putative mosaic events in 2269 individuals, with two-thirds of the raw mosaic calls being visually validated.
- In total, 187,096 mosaic CNV candidates were suggested by the first pass screening of our algorithm applied to approximately 350,000 SNP-array data sets.
- The first quartile minus 1.5 times the interquartile range of the LRR, paired with strong BAF deviation, defines the calling threshold for mosaic deletion events.
- Filtering of the putative mosaic CNV calls and respective size of the curated callset at each step.
- Further visualization of the BAF/LRR underlying potential mosaic CNV calls was conducted manually by a human expert reviewer (when the sample had only 1 mosaic call) or by the DeepCNV algorithm (when the sample had 2 or more mosaic calls).
- Full deletion of chromosome 1p contrasted by duplication of 1q is shown, along with mosaic deletions in a high proportion of cells in the person's sample on 2q, 8q, and 11p.
- We filtered out mosaic CNV candidates with overlapping homozygous deletion calls as detected by PennCNV, since the random noise in BAF for real homozygous deletions can give a false indication of aberrant BAF banding. This left us with 126,020 mosaic CNV candidates in 43,781 samples.
- There were 51,326 mosaic CNV candidates with at least one mosaic CNV candidate >.
- Finally, 19,090 mosaic CNV candidates had strictly one mosaic CNV candidate > 3 Mb, suggesting high specificity of mosaic CNV detection in these samples (Fig.
- Therefore, we set out to visualize the underlying BAF and LRR profiles corresponding to these mosaic CNV candidates.
- 5) level mosaic CNV events with high sensitivity and specificity.
- We identified 273 putative mosaic CNV deletions in 76 out of 228 samples analyzed.
- Of those, 202 were visually validated as true positives, confirming mosaic CNV deletions in 50 of the 228 samples.
- Approximately half of the mosaic calls had non-zero AB (0.4–0.6 BAF) signal, indicating noise and/or lower levels of mosaicism.
- To validate the rest of the mosaic CNV candidates, we used a machine learning approach we developed called DeepCNV (in review).
- DeepCNV is based on a trained model of positive and negative mosaic CNV examples based on a human expert’s labeling.
- Using this model and LRR/BAF plot images from PennCNV visualize_cnv, which are standard and widely used, DeepCNV outputs the probability that each mosaic CNV candidate is a true positive.
- We compared our observed mosaic CNV counts and frequencies to previous studies of mosaicism and found high concordance in genomic regions and their corresponding frequencies observed in populations (Fig.
- In addition, when examined in the context of the multiple disease phenotypes that these individuals harbored, several disease categories were associated with mosaic CNVs based on results generated using the ParseCNV software (Table 2).
- Mosaic deletions in a low proportion of cells in the person's sample on 3p, 10q, 11q, and 14q.
- Under a majority voting scheme, many more mosaic CNV calls overlap with MONTAGE calls than with MoCha or RGADA-MAD calls.
- To compute specificity using a similar majority-vote approach, we would need to know the size of the background, namely the number of samples that are considered negatives by all three callers, which is not well defined.
- (e.g., we would only expect a higher-than-estimated FDR to weaken GWAS associations and decrease effect sizes.) 63% of the mosaic events identified were found in males, compared with the 50% male percentage in the input dataset (two-tailed Fisher's exact test, p = 1.554e-11; Supplementary Table 2).
- Blue line represents the current work from Montage mosaic CNV callset.
- Black line represents the other previously published mosaic CNV callsets.
- Mosaic CNVs of intermediate states between integer copy number variation are important genetic/genomic events in both clinical and research settings.
- However, detection of these mosaic events has been limited to incidental findings from CNV algorithms designed for discrete integer copy numbers and not the continuous nature of mosaic CNVs.
- MAD was notably used in The Cancer Genome Atlas (TCGA) mosaic CNV analysis [19].
- Broad disease category association study of detected and curated mosaic CNV events to implicate genomic loci for disease phenotypes.
- In light of the shortage of high-performance tools, we designed a new mosaic CNV detection tool aimed at providing highly sensitive and specific mosaic CNV detection with fast runtime.
- Our analysis concurs with this, showing that 63% of the mosaic events identified were found in males.
- This mosaic CNV detection work has implications in cancer, cell free fetal DNA, and aging [20, 21].
- Tumor-normal heterogeneity can appear similar to germline mosaic CNV.
- Therefore, cancer phenotyping records are important in conditioning the assessment of supposed mosaic CNV callsets.
- Cell-free fetal DNA is another application in which the mosaic CNV detection and association presented here could be of utility.
- Prenatal testing could be enhanced by deconvolution of the maternal and child CNV genotype profile.
- To successfully diagnose mosaic CNVs, it is important to develop targeted detection tools and apply them systematically to large cohorts, to truly understand the relevance and frequency of mosaic CNVs in the general population.
- Here we demonstrate the utility of our fast scalable tool, MONTAGE, specifically designed for mosaic CNV detection.
- We envision MONTAGE becoming an integral part of future mosaic CNV detection and analysis.
- Modeling B-allele Frequency Standard Deviation for Mosaic Copy Number States.
- CNV: Copy number variation.
- We thank the study participants who allowed the use of their genotyping, sequencing, and disease phenotype data for this study, and the testers of the code used in this study.
- The funders had no role in this study, including the design of the study; the collection, analysis, and interpretation of data; and the writing of the manuscript.
- The MONTAGE algorithm provides a competitive tool to be used for mosaic CNV detection.
- The Institutional Review Board of The Children's Hospital of Philadelphia approved this study.
- https://doi.org/10.1093/hmg/ddq003 Epub 2010 Jan 6..
- https://doi.org/10.1002/ajmg.a.37261 Epub 2015 Jul 21..
- https://doi.org/10.1002/ajmg.c..
- Sensitive and specific detection of mosaic chromosomal abnormalities using the parent-of-origin- based detection (POD) method.
- https://doi.org .
- https://doi.org/10.1186/.
- https://doi.org/10.1038/s x Epub 2018 Jul 11..
- ParseCNV integrative copy number variation association software with quality tracking.
- doi.org/10.1093/nar/gks1346..
- https://doi.org/10.1093/hmg/ddv033 Epub 2015 Jan 29..
- Mosaic copy number variation in human neurons.
- Copy number variation and mosaicism..
- https://doi.org/10.1159/.
- doi: https://doi.org/https://doi.org/.
- https://doi.org/10.1038/s .
- Detection and quantitation of chromosomal mosaicism in human blastocysts using copy number variation sequencing.
- https://doi.org/10.1002/pd..
- https://doi.org/10.1016/j.celrep .
- https://doi.org/10.1111/cge.12502 Epub 2014 Oct 7.
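
The window screen described in the summary (1 Mb windows in 1 Mb steps, BAF tallies per band, first and last base pair position of the evidence, and the first-quartile-minus-1.5-IQR LRR threshold) can be illustrated with a minimal Python sketch. This is not the MONTAGE implementation, which the summary describes as an awk statement embedded in Perl. The function names, the (position, BAF, LRR) tuple layout, and the "inclusive" quartile method are assumptions made for illustration. The deletion-indicative BAF intervals are not stated in the summary, so the caller must supply them, and the values in the demo are hypothetical placeholders. Only the 0.4–0.6 AB band is taken from the text.

```python
from statistics import pstdev, quantiles

WINDOW_BP = 1_000_000  # 1 Mb window, 1 Mb step (non-overlapping), per the text
AB_BAND = (0.4, 0.6)   # heterozygous AB band mentioned in the text


def in_band(baf, band):
    """Return True when a BAF value lies inside the closed interval `band`."""
    low, high = band
    return low <= baf <= high


def window_summaries(snps, deletion_bands, window_bp=WINDOW_BP):
    """Tally BAF evidence per non-overlapping window on one chromosome.

    snps: iterable of (position, baf, lrr) tuples (hypothetical layout).
    deletion_bands: BAF intervals treated as mosaic-deletion evidence; the
        caller supplies them because the summary does not state them.
    """
    windows = {}
    for pos, baf, lrr in snps:
        windows.setdefault(pos // window_bp, []).append((pos, baf, lrr))

    summaries = {}
    for index in sorted(windows):
        rows = windows[index]
        evidence = [(pos, baf) for pos, baf, _ in rows
                    if any(in_band(baf, band) for band in deletion_bands)]
        evidence_bafs = [baf for _, baf in evidence]
        summaries[index] = {
            "n_snps": len(rows),
            "n_evidence": len(evidence),
            "n_ab": sum(in_band(baf, AB_BAND) for _, baf, _ in rows),
            # First and last evidence positions serve as rough breakpoints.
            "first_pos": min((pos for pos, _ in evidence), default=None),
            "last_pos": max((pos for pos, _ in evidence), default=None),
            # Spread of the evidence BAF values as a noise indicator.
            "baf_sd": pstdev(evidence_bafs) if evidence_bafs else None,
        }
    return summaries


def lrr_deletion_threshold(lrr_values):
    """First quartile minus 1.5 times the interquartile range of the LRR."""
    q1, _, q3 = quantiles(lrr_values, n=4, method="inclusive")
    return q1 - 1.5 * (q3 - q1)


if __name__ == "__main__":
    hypothetical_bands = [(0.20, 0.39), (0.61, 0.80)]  # placeholders only
    demo = [(100, 0.35, -0.3), (500_000, 0.65, -0.3), (900_000, 0.50, 0.0)]
    for index, summary in window_summaries(demo, hypothetical_bands).items():
        print(index, summary)
```

Grouping SNPs by integer division of the position gives the non-overlapping windows directly. A window with many evidence SNPs, few AB SNPs, and a low `baf_sd` would be a candidate for the LRR check against `lrr_deletion_threshold`.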
