
Improving CRISPR guide design with consensus approaches



- Bradford and Perrin, BMC Genomics 2019, 20(Suppl 9):931. https://doi.org/10.1186/s
- 10–12 September 2019.
- Results: We considered nine leading guide design tools and their output when tested on two sets of guides for which experimental validation data are available.
- The best performance (with a precision of up to 0.912) was obtained when combining four of the tools and accepting all guides selected by at least three of them.
- CRISPR-Cas9 provides site-specific cleavage through homology between the guide and the DNA sequence of the invading phage.
- However, guide design is not trivial.
- For this reason, computational techniques have been developed to identify and evaluate candidate CRISPR-Cas9 guides.
- In a benchmark of the leading guide design tools, we previously noted the limited overlap between the guides that each tool selects [6].
- To answer this question, we analysed the output of nine distinct guide design tools on experimental data and investigated whether the consensus between some or all of the tools would lead to a better set of guides.
- When testing on the Wang dataset and seeking a recall of 0.2, CHOPCHOP achieved the highest precision: 0.843.
- When seeking a recall of at least 0.5, sgRNAScorer2 achieved the highest precision on this dataset: 0.833.
- When testing on the Doench dataset, CHOPCHOP again achieved the best precision for a recall of 0.2, at 0.294.
- When seeking a recall of at least 0.5, SSC achieved the highest precision, at 0.277.
- The distributions of guides accepted and rejected by each tool are shown in Figs. 1 and 2.
- The only exception was SSC on the Doench dataset.
- This also improves its performance on the Wang dataset, but SSC uses that dataset for training, so the improvement cannot be assessed independently.
- For a recall above 0.5, the optimal threshold for SSC was 0.2, for a precision of 0.300.
- The most intuitive way to combine results from separate tools was to accept only guides that had been selected by at least n tools.
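This voting scheme can be sketched in a few lines. The input format here, a mapping from each tool's name to the set of guides it accepted, is an assumption for illustration; the paper does not prescribe one.

```python
def consensus_guides(selections, n):
    """Accept a guide if at least n of the tools selected it.

    `selections` maps each tool name to the set of guide sequences
    it accepted (hypothetical input format).
    """
    votes = {}
    for guides in selections.values():
        for guide in guides:
            votes[guide] = votes.get(guide, 0) + 1
    return {guide for guide, count in votes.items() if count >= n}

# Toy example with three hypothetical tools:
tools = {
    "toolA": {"g1", "g2", "g3"},
    "toolB": {"g2", "g3"},
    "toolC": {"g3", "g4"},
}
print(consensus_guides(tools, 2))  # guides selected by at least two tools
```

Setting n to the number of tools gives the strict intersection; n = 1 gives the union.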
- This means that, for example, when testing on the Wang dataset, the set considered for the consensus includes: Cas-Designer, WU-CRISPR, FlashFry, sgRNAScorer2, CHOPCHOP, CHOPCHOP-MM, TUSCAN, PhytoCRISP-Ex and mm10db.
- When testing on the Doench dataset, the set includes: Cas-Designer, sgRNAScorer2, CHOPCHOP, CHOPCHOP-Xu, CHOPCHOP-MM, PhytoCRISP-Ex and mm10db.
- The results are shown in Table 2.
- However, a strict intersection of the results from each tool would not be practical: on both datasets, only a handful of guides are identified by all tools.
- At the other end of the spectrum (i.e. accepting any guide selected by at least one tool), precision suffers.
- If a recall of at least 0.2 is appropriate, the best results on the Wang dataset were obtained for n = 5, with a precision of 0.911.
- Fig. 1 Results for individual tools on the Wang dataset.
- In contexts where a higher recall is needed (0.5), a precision of 0.811 can be achieved with n = 3.
- On the Doench dataset, for a recall of 0.2, a precision of 0.282 was achieved with n = 4.
- For a recall of 0.5, a precision of 0.244 was achieved with n = 3.
- Fig. 2 Results for individual tools on the Doench dataset.
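Choosing the best vote threshold n for a given recall target amounts to a small sweep. A minimal sketch, assuming the selections, the set of experimentally efficient guides, and the recall target are given (the data structures are hypothetical):

```python
from collections import Counter

def consensus(selections, n):
    # Count, for each guide, how many tools selected it, then keep
    # the guides with at least n votes.
    votes = Counter(g for guides in selections.values() for g in guides)
    return {g for g, v in votes.items() if v >= n}

def best_n(selections, efficient, min_recall):
    """Return (n, precision, recall) for the vote threshold with the
    highest precision among those meeting the recall target."""
    best = None
    for n in range(1, len(selections) + 1):
        accepted = consensus(selections, n)
        tp = len(accepted & efficient)
        precision = tp / len(accepted) if accepted else 0.0
        recall = tp / len(efficient)
        if recall >= min_recall and (best is None or precision > best[1]):
            best = (n, precision, recall)
    return best

sel = {"A": {"g1", "g2"}, "B": {"g1", "g3"}, "C": {"g1", "g2", "g4"}}
print(best_n(sel, efficient={"g1", "g2"}, min_recall=0.5))  # (2, 1.0, 1.0)
```

Increasing n trades recall for precision, which is the trade-off the tables in this section quantify.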
- Table 2 Consensus when removing models trained on the associated test dataset.
- Tools considered for Wang: Cas-Designer, WU-CRISPR, FlashFry, sgRNAScorer2, CHOPCHOP, CHOPCHOP-MM, TUSCAN, PhytoCRISP-Ex and mm10db.
- Tools considered for Doench: Cas-Designer, sgRNAScorer2, CHOPCHOP, CHOPCHOP-Xu, CHOPCHOP-MM, PhytoCRISP-Ex and mm10db.
- The tools used for the consensus are then Cas-Designer, sgRNAScorer2, CHOPCHOP, CHOPCHOP-MM, PhytoCRISP-Ex and mm10db.
- The distributions of guides are shown in Figs. 3 and 4.
- The results for the consensus of procedural methods are shown in Table 4 and Figs. 5 and 6.
- For ML methods, we followed the same strategy as above, and removed tools trained on the data used in our tests.
- Given a recall of at least 0.2, the approach had a precision of 0.881 when n = 3.
- For a recall of at least 0.5, the approach had a precision of 0.793 when n = 2.
- With n = 4, it is possible to reach a precision of 0.290, but the recall is only 0.173.
- Based on the earlier results, we tried to identify the best set of tools to use for the consensus, with only the same two constraints as above: the tool should not have been trained on the dataset used for testing, and it should have completed at least two tests in the benchmark.
- Table 3 Consensus: accepting guides selected by at least n tools (excluding models trained on the test data and poorly performing tools). Tools considered here: Cas-Designer, sgRNAScorer2, CHOPCHOP, CHOPCHOP-MM, PhytoCRISP-Ex and mm10db.
- Fig. 3 Consensus, on the Wang dataset, when accepting guides selected by at least n tools (excluding models trained on any of the test data and poorly performing tools): Cas-Designer, sgRNAScorer2, CHOPCHOP, CHOPCHOP-MM, PhytoCRISP-Ex, mm10db.
- Fig. 4 Consensus, on the Doench dataset, when accepting guides selected by at least n tools (excluding models trained on any of the test data and poorly performing tools): Cas-Designer, sgRNAScorer2, CHOPCHOP, CHOPCHOP-MM, PhytoCRISP-Ex, mm10db.
- The procedural methods considered were: Cas-Designer, CHOPCHOP, PhytoCRISP-Ex, mm10db.
- If accepting guides selected by at least three of these four tools (sgRNAScorer2, CHOPCHOP, PhytoCRISP-Ex and mm10db), we obtained a precision of 0.912 (recall 0.185) and 0.356 (recall 0.216) for Wang and Doench, respectively.
- One limitation is that this approach uses two of the slowest tools (sgRNAScorer2 and PhytoCRISP-Ex), as per our earlier benchmark [6].
- It is possible to be computationally more efficient by excluding PhytoCRISP-Ex, at a cost in precision, but the consensus still outperforms individual tools: 0.857 for Wang (recall 0.360) and 0.293 for Doench (recall 0.453), with n = 2.
- Fig. 5 Consensus, on the Wang dataset, between procedural methods: Cas-Designer, CHOPCHOP, PhytoCRISP-Ex, mm10db.
- Fig. 6 Consensus, on the Doench dataset, between procedural methods: Cas-Designer, CHOPCHOP, PhytoCRISP-Ex, mm10db.
- None of the tools had a precision over 0.85 on Wang or over 0.3 on Doench.
- In particular, we previously reported that two of the tools (PhytoCRISP-Ex and sgRNAScorer2) did not scale to exhaustive searches on large genomes [6].
- Specifically, we found that, by considering four tools (sgRNAScorer2, CHOPCHOP, PhytoCRISP-Ex and mm10db) and accepting all guides selected by at least three of them, we obtained a precision of 0.912 on Wang and 0.356 on Doench.
- Table 5 Consensus between machine-learning methods, removing models trained on the associated test dataset.
- Methods: Guide design tools.
- We previously benchmarked the leading open-source tools for guide design for the Streptococcus pyogenes Cas9 system [6].
- Fig. 7 Consensus, on the Wang dataset, when optimising for both datasets (excluding models trained on test data, excluding poorly performing tools, no more than five tools, recall approx. 0.2): sgRNAScorer2, CHOPCHOP, PhytoCRISP-Ex, mm10db.
- Fig. 8 Consensus, on the Doench dataset, when optimising for both datasets (excluding models trained on test data, excluding poorly performing tools, no more than five tools, recall approx. 0.2).
- They can include considerations such as avoiding poly-thymine sequences [17], rejecting guides with inappropriate GC-content [18], or considering the secondary structure of the guide RNA.
- Because of the different approaches taken by the developers, it can be expected that each tool would produce different guides.
- The values we used are: 0.5 for FlashFry, 70 for Cas-Designer, 50 for WU-CRISPR, 0.55 for CHOPCHOP-MM, and 0 for SSC, CHOPCHOP-Xu and sgRNAScorer2.
- We did not try to change these thresholds, or to improve any of the filtering or scoring of any tool.
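Applying these per-tool thresholds amounts to binarising each tool's scores into accepted and rejected guides. A sketch, assuming scores arrive as a guide-to-score mapping and that a guide is accepted when its score meets the threshold (both assumptions; the real tools emit different output formats):

```python
# Thresholds quoted in the text; note the score scales differ per tool.
THRESHOLDS = {
    "FlashFry": 0.5,
    "Cas-Designer": 70,
    "WU-CRISPR": 50,
    "CHOPCHOP-MM": 0.55,
    "SSC": 0,
    "CHOPCHOP-Xu": 0,
    "sgRNAScorer2": 0,
}

def accepted_guides(tool, scores):
    """Keep the guides whose score meets the tool's threshold."""
    threshold = THRESHOLDS[tool]
    return {guide for guide, score in scores.items() if score >= threshold}

print(accepted_guides("FlashFry", {"g1": 0.62, "g2": 0.41}))  # {'g1'}
```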
- Of the guides in the Wang dataset, 731 were deemed to be ‘efficient’ based on analysis of the gene knock-outs.
- The Doench dataset contains 1841 guides from nine mouse and human transcripts, with 372 of the guides deemed to be ‘efficient’.
- CHOPCHOP and FlashFry provide each of the listed efficacy scoring models.
- The italicised text in the ‘Tool’ column is the Git hash identifying which version of the tool was used.
- We also created all the files required by any of the tools: a custom annotation file (derived from the refGene table available via UCSC), a 2bit compression file, Bowtie and Bowtie2 indexes, and a Burrows-Wheeler Aligner index.
- The precision tells us how many of the guides classified as efficient actually were efficient, while the recall tells us how many of the efficient guides were correctly selected.
- In this paper, we set a recall target of 0.2, meaning that approximately 20% of the efficient guides are identified.
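These two quantities follow directly from the sets of selected and experimentally efficient guides. A minimal sketch (the set-based representation is an assumption for illustration):

```python
def precision_recall(selected, efficient):
    """Precision: fraction of selected guides that are truly efficient.
    Recall: fraction of efficient guides that were selected."""
    true_positives = len(selected & efficient)
    precision = true_positives / len(selected) if selected else 0.0
    recall = true_positives / len(efficient) if efficient else 0.0
    return precision, recall

# Three guides selected, two of which are among the four efficient ones:
p, r = precision_recall({"g1", "g2", "g3"}, {"g1", "g2", "g4", "g5"})
print(round(p, 3), round(r, 3))  # 0.667 0.5
```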
- The full contents of the supplement are available at https://bmcgenomics.
- The guide design tools that are used are all available from their respective authors (with access details shown in Table 7).
- https://doi.org/10.
- CRISPR-Cas9 structures and mechanisms. https://doi.org/10.1016/j.cell
- Genome engineering using the CRISPR-Cas9 system. https://doi.org/10.1038/nprot.2013.143
- Genetic screens and functional genomics using CRISPR/Cas9 technology.
- A benchmark of computational CRISPR-Cas9 guide design methods.
- https://doi.org/10.1093/nar/gku410
- https://doi.org/10.1101/
- WU-CRISPR: characteristics of functional guide RNAs for the CRISPR/Cas9 system.
- Cas-Designer: a web-based tool for choice of CRISPR-Cas9 target sites.
- https://doi.
- https://doi.org/10.1016/J.CELREP
- https://doi.org/10.1021/acssynbio.6b00343
- https://doi.org/10.1089/
- https://doi.org/10.1089/crispr.2017.0021
- https://doi.org/10.1038/nmeth.3543
- https://doi.org/10.1126/science.1237934
- Genetic screens in human cells using the CRISPR-Cas9 system. https://doi.org/10.1038/nbt.3026
- https://doi.org/10.1126/science.aau0629 (http://science.sciencemag.org/content/363/6424/eaau0629.full.pdf)
