
Unexpected diversity of CRISPR unveils some evolutionary patterns of repeated sequences in Mycobacterium tuberculosis


Summary

- Background: Diversity of the CRISPR locus of Mycobacterium tuberculosis complex has been studied since 1997 for molecular epidemiology purposes.
- In contrast, we found an unexpected diversity in the form of SNPs in spacers and in Direct Repeats, duplications of various lengths, insertions of the IS6110 insertion sequence at various locations, and blocks of DVR deletions.
- When reconstructing evolutionary steps of the locus, we found no evidence for SNP reversal.
- DVR deletions were linked to recombination between IS6110 insertions or between Direct Repeats.
- In the last 5 years, however, the popularity of most repeated sequences has decreased, first because they are larger than the reads provided by short-read sequencing, and second because of the widespread availability of whole-genome sequences and the use of software analyzing Single Nucleotide Polymorphisms (SNPs) [3–5].
- 1 Institute for Integrative Biology of the Cell (I2BC), CEA, CNRS, Univ.
- They all share the same mechanism of spacer acquisition, inserting part of a foreign sequence designated as a protospacer, with a length similar to that of the repeats, next to the 5′ end of the locus.
- Partial analysis of CRISPR diversity has been used since 1997 to explore the relatedness of clinical isolates through a technique coined "spoligotyping" [29].
- Spoligotyping has led to the set-up of the first worldwide database for this pathogen, which today counts more than 111,000 patterns originating from 169 countries [13, 14].
- External DR0s are bordered by specific sequences: one of 48 bp at the beginning of the locus, after Cas2, and one of 148 bp at the end of the locus, before Rv2813 (Supplementary file 1).
- These sequences are found in all isolates except in the case of large deletions (Supplementary file 2 [IS6110 sheet]).
- Most of the time, the CRISPR-Cas locus includes one IS6110 copy as in the first isolate presented in Fig.
- The spacer sequences as well as those of the DR are always found in the same direction.
- Red empty arrows indicate IS6110 sequences.
- grey: IS6110 genes (transposase and hypothetical protein).
- The color of spacers was attributed randomly to facilitate visual exploration but spacers of the same color have no link except if they carry the same number.
- When a DR0 borders a deletion, we chose to represent it in most of the cases at the beginning of the deletion, although choosing the end of the deletion would have been equally relevant.
- This in turn matches the fact that spacer 15 is followed half of the time by spacer 16 and the other half by spacer 17: in one of the two spacer 14–spacer 21 regions, DVR16 has been deleted (Fig.
- Duplications occur in tandem most of the time.
- A notable exception to the natural order of succession of spacers is spacer 35, which can be found in two places: between 34 and 36 on the one hand, and after 41 on the other hand (Fig.
- Another important and widely representative characteristic of the MTC CRISPR locus is the presence of the IS6110 copy referenced in [29], which shares the same orientation as the CRISPR, i.e. corresponding to an IS6110c (Fig.
- Regarding intra-spacer diversity, we identified 20 spacers that harbored at least two variants; these concerned 48 (24%) of the 198 isolates explored (Supplementary file 2 [spacer sheet]).
- Between two spacers, we have most of the time the DR0 sequence referenced in [34].
- The number of the following spacer is shown in red crosses.
- This is compatible with a duplication of the region spacer 14–spacer 21, with one repetition lacking spacer 16.
- As expected from IS6110 insertion characteristics, the concatenation of these two sequences is 3 bp larger than DR0, since 3 additional cytosines are present at each end of the insertion [36, 37].
- tuberculosis phylogeny and allow us to infer that the mutation in L1.1.1 occurred shortly after separation from the other L1 sublineages.
- Each time, the size of the DR is respected (no indel, only the single nucleotide polymorphism) except for one case where a longer DR was found (data not shown).
- Large scale variations and IS6110 copies.
- A second set of clinical isolates (SRR5073877 and ERR552680) harbours two IS6110 copies, respectively the well-known one in the DR.
- Interestingly, the borders of the IS6110 insertion in ERR234259 corresponded well to the external borders of the two IS present in SRR5073877 and ERR552680.
- The left border consisted of the first 17 bp of DR0 (only 2 bp less than the rDRa1 in the classical position), and the right border was exactly the same last 33 nucleotides of DR0 as those found at the right of the second insertion in SRR5073877 and ERR552680.
- A deletion of DVR 54 to 61 characterized the MRCA of lineages 2, 3, 4 and 7; it is not documented in the classical form of the spoligotype, as these spacers do not belong to its set of 43 spacers.
- Interestingly, this ancestor harbors an IS6110 insertion in one cas gene (namely csm6) but not at the border of the classical Beijing deletion.
- isolates representative of the MTC excluding M.
- The proposed structure was designed by a parsimonious approach based on the CRISPR structure of the 198 clinical isolates fully characterized in Supplementary file 3 (See also notes common with Fig.
- Other IS6110 insertions were found along the whole MTC CRISPR locus, with up to two insertions in the CRISPR locus and three when considering the whole CRISPR-Cas locus.
- Overall, the proportion of genomes containing either several copies of IS6110 or a duplication of one of the forms listed above is substantial, showing that MTC CRISPR is much more variable than what could be derived.
- This is true not only for the in vitro but also for the in silico-based acquisition of the spoligotype, as the BLAST procedure used in the current analytic tools (Spolpred, SpoTyping) only provides information on the presence or absence of a given spacer: there is nothing quantitative or location-related in these approaches [41, 42].
- Hence, on the one hand, the representation of the CRISPR locus through a simple barcode of presence/absence of individual spacers hides this quantitative and positional information, whereas on the other hand, a more extensive description of the CRISPR locus, including duplications, insertions and point mutations, provides useful information to classify and/or cluster clinical isolates.
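The loss of information in a presence/absence barcode can be illustrated with a minimal Python sketch. The function name `spoligotype_barcode`, the representation of a locus as an ordered list of spacer numbers, and the two toy loci are assumptions made for illustration only; they are not taken from the study, nor from SpolPred or SpoTyping.

```python
from typing import Iterable


def spoligotype_barcode(spacers_in_locus: Iterable[int], n_spacers: int = 43) -> str:
    """Return a presence/absence string over spacers 1..n_spacers.

    Position i holds '1' if spacer i occurs at least once in the locus,
    and '0' otherwise. Copy number and position are deliberately ignored,
    which is exactly the limitation discussed in the text.
    """
    present = set(spacers_in_locus)
    return "".join("1" if i in present else "0" for i in range(1, n_spacers + 1))


# Hypothetical toy loci (ordered spacer numbers, arbitrary numbering).
locus_plain = [1, 2, 3, 4, 5]
locus_duplicated = [1, 2, 3, 2, 3, 4, 5]  # tandem duplication of spacers 2-3

barcode_plain = spoligotype_barcode(locus_plain, n_spacers=6)
barcode_duplicated = spoligotype_barcode(locus_duplicated, n_spacers=6)

# Both loci collapse to the same barcode although their structure differs.
print(barcode_plain, barcode_duplicated, barcode_plain == barcode_duplicated)
```

Under these assumptions the duplicated locus is indistinguishable from the plain one, whereas the ordered list itself keeps the duplication visible.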
- Combined mechanism of CRISPR locus reduction: how does IS6110 contribute to the evolution of the CRISPR locus in MTC?
- In addition to the undeniable expansion mechanisms mentioned above, CRISPR reduction mechanisms also exist, which, to some extent, explain some of the spacer block deletions in MTC spoligotypes.
- In place, it harbors a one nucleotide variant of the beginning sequence, a DR0 and spacer 2.
- in csm6 in the ancestor of L2, also seen in SRR1710060, see Supplementary file 2), and (3) recombination between the two IS6110 copies.
- it happens independently of the lineage and is responsible for convergence in IS6110 copy numbers [44].
- The final result is the change from x to x-1 copies of IS6110, with the loss of all spacers between the two copies.
- For instance, in many L4.3 (LAM) clinical isolates where spacers 31 to are missing, the successive sequences of interest are: the beginning of spacer 31 (#21), an IS6110c, DRb1 and spacer 35.
- This suggests that an IS6110 copy was first inserted at the end of spacer 31, and that it later recombined with the one located between spacers 34 and 35.
- The orientation of the two IS6110 copies that recombined cannot always be derived due to the lack of the ancestral versions.
- This is true for the IS6110 insertions having led to the deletion described in Fig.
- In that case, both insertions were in the reverse sense as compared to H37Rv orientation and can be called IS6110c.
- This phenomenon was recently observed in several cases of IS6110-mediated deletions in L2 [45].
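The recombination between two IS6110 copies described above (from x to x-1 copies, with loss of the spacers in between) can be sketched as a simple list operation. This is a toy model under stated assumptions: the locus is an ordered list mixing spacer numbers and the label "IS6110", whole spacers are used (the truncation of spacer 31 is ignored), Direct Repeats and IS orientation are not modelled, and the function name `recombine_is_copies` is hypothetical.

```python
from typing import List, Sequence, Union

Element = Union[int, str]  # spacer number, or the label of an insertion sequence


def recombine_is_copies(
    locus: Sequence[Element], left_idx: int, right_idx: int, is_label: str = "IS6110"
) -> List[Element]:
    """Collapse two IS copies into one, dropping everything between them.

    left_idx and right_idx are the positions of the two IS copies in the
    locus. The result keeps a single IS copy at the left position.
    """
    if not 0 <= left_idx < right_idx < len(locus):
        raise ValueError("indices must satisfy 0 <= left_idx < right_idx < len(locus)")
    if locus[left_idx] != is_label or locus[right_idx] != is_label:
        raise ValueError("both indices must point at an IS copy")
    return list(locus[: left_idx + 1]) + list(locus[right_idx + 1 :])


# Toy locus loosely inspired by the L4.3 example: one IS copy after spacer 31,
# another between spacers 34 and 35 (simplified, hypothetical layout).
before = [30, 31, "IS6110", 32, 33, 34, "IS6110", 35, 36]
after = recombine_is_copies(before, left_idx=2, right_idx=6)
print(after)
```

The number of IS copies drops by one and the intervening spacers disappear, which is the signature the text associates with IS-mediated block deletions.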
- How does the CRISPR sequence diversity impact spoligotyping data? When performed in vitro, spoligotyping consists first in the amplification of the CRISPR locus using primers facing the outside of the DR region, referred to as DRa and DRb, and second in the hybridization to probes attached at a specific position on a membrane or another support.
- CRISPR sequence variants may reduce the efficiency of the process, whether at the amplification or at the hybridization step.
- To explore and understand the consequences of this possibility, it is important to identify the orientation of the CRISPR locus in question.
- According to the classical CRISPR expansion mechanism, the introduction of new spacers occurs at the 5′ end of the locus, so that the most ancient DVR lies at its 3′ end.
- Yet, the presence of mutations in several DRs at the 3′ end of the locus could also play a role in its stability.
- frequent to have a loss of the beginning sequences of CRISPR, on the side of the cas genes (several independent isolates from L2 and from L4) than to have a loss of the ending sequences, i.e.
- All deletions implicating flanking sequences were bordered by an IS6110 sequence.
- Altogether, the asymmetry in deletion suggests either a more crucial role of the end of the CRISPR i.e.
- However, our script was designed to look only for insertions in cas genes that also lead to a deletion in the CRISPR in at least one of the explored samples.
- CRISPR-Cas loci are involved in two mechanisms: 1) adaptation by the integration of new spacers, usually taken from foreign DNA, at the 5′ end of CRISPR with the help of Cas1 and Cas2 proteins, and 2) immunity by the transcription of the CRISPR locus, processing with the help of the Cas6 protein in the case of type III-A CRISPRs, and degradation of DNA and/or RNA carrying protospacers, with the help of the crRNP (CRISPR RiboNucleoProtein complex), a complex involving the crRNA and other Cas proteins.
- In the whole M.
- These two phenomena could also be linked: a loss of functionality of Cas1 and Cas2 in the MRCA of all MTC could have fostered an adaptive change in lifestyle of the bacterium, i.e.
- Such a hypothesis could be supported by the evolution of the CRISPR locus of Vibrio cholerae, with observations that the recent pandemic strains have lost their ancestral CRISPR locus [53] and (FX Weill, personal communication.
- As stated previously lacked at least part of the cas genes.
- Cas10/Csm1 and Csm3 are the enzymes responsible for the catalytic activity of the crRNP [54, 55].
- Hence, regarding immunity, even if the spatial structure of the crRNP may be impaired by the absence of csm4 and/or csm5 in some isolates, it could remain possible that immunity occurs in all MTC isolates through the consecutive actions of Cas6 to process pre-crRNA and of Cas10/Csm1 and Csm3 to degrade DNA and/or RNA.
- The fact that none of the spacers is conserved in all isolates implies that, if immunity occurs, it does not always target the same DNA and/or RNA sequences.
- but IS6110 is the one with the largest number of copies in most isolates and especially in the reference isolate H37Rv [57].
- IS6110-RFLP was the gold standard to define epidemiological clusters at the end of the nineties and remained so for around 20 years, until it was replaced by MIRU-VNTR 1 and more recently by Whole-Genome-Sequencing for a recent review on evolution of TB molecular.
- Previous results on IS6110 insertion sites have shown that independent IS6110 copy acquisition through transposition into hotspots was a common mechanism explaining convergence in IS6110 copy number in some of the MTBC sublineages [44, 62].
- The role of the ipl (Insertion Preference Locus) was also stressed long ago and was shown to have consequences on the CRISPR locus; however, no generalized observations on IS-CRISPR genomic dynamics had been made before this study.
- Our study, by providing an in-depth reconstruction of the CRISPR locus of MTC in combination with IS6110 using short reads on around 200 genomes, improves our knowledge of the structure of the CRISPR locus and sheds new light on the general evolutionary mechanisms acting on MTC genomes through a first, yet quantitatively limited, analysis of combined CRISPR-IS evolutionary dynamics.
- By unveiling an unexpected genetic diversity of the CRISPR locus in MTC, our study opens the way to new in-depth congruence analyses between SNP-based and repetitive-sequence-based MTC phylogenies.
- We then looked for spacer variants by searching for patterns made up of the last 12 nucleotides of DR0 [29], followed by 10 to 70 bp, followed by the first 12 bp of.
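The pattern search described above (a 12-nt repeat end, then 10 to 70 bp, then a 12-nt repeat start) can be sketched with a regular expression. This is a minimal illustration, not the authors' script: the function name `find_spacer_candidates` is hypothetical, and the two flanking sequences used in the example are made-up placeholders rather than the real DR0 sequence, which is not reproduced here.

```python
import re
from typing import List


def find_spacer_candidates(
    read: str, dr_end: str, dr_start: str, min_len: int = 10, max_len: int = 70
) -> List[str]:
    """Return sequences of min_len..max_len nt flanked by dr_end and dr_start.

    dr_end   : last nucleotides of the repeat preceding the spacer
    dr_start : first nucleotides of the repeat following the spacer
    The quantifier is non-greedy so that the shortest flanked stretch is kept.
    """
    pattern = re.compile(
        re.escape(dr_end)
        + "([ACGT]{%d,%d}?)" % (min_len, max_len)
        + re.escape(dr_start)
    )
    return [match.group(1) for match in pattern.finditer(read)]


# Placeholder flanks and spacers (hypothetical, for illustration only).
DR_START_TOY = "GATTACAGATTA"
DR_END_TOY = "CCGGTTAACCGG"
SPACER_A = "ACGTTGCAACGTTGCA"
SPACER_B = "TTGACCTTGACCTTGACC"

toy_read = (
    DR_END_TOY + SPACER_A + DR_START_TOY + "TTTT"
    + DR_END_TOY + SPACER_B + DR_START_TOY
)
candidates = find_spacer_candidates(toy_read, DR_END_TOY, DR_START_TOY)
print(candidates)
```

Comparing each recovered candidate against a catalog of known spacers would then reveal variants; that comparison step is left out of this sketch.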
- 1) the beginning and end sequences of IS6110 and its reverse complement (40 bp each time);
- 5) sequences in the neighbouring genes (Cas or others) when these sequences were found beside an IS6110 sequence during reconstruction (see below) (for more details.
- github.com/cguyeux/CRISPRbuilder-TB) was set up to reconstruct large fragments of the CRISPR.
- A specific search for duplications was included, looking for patterns of the form sp.(l)*DRX*sp.(m), where l ≥ m (for more details see [35]).
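The duplication pattern sp.(l)*DRX*sp.(m) with l ≥ m can be read as "a spacer followed, across a repeat, by a spacer with an equal or lower number". A minimal sketch of that idea follows, assuming the locus has already been reduced to an ordered list of spacer numbers; the function name `find_duplication_junctions` and the toy locus are hypothetical and this is not the CRISPRbuilder-TB implementation.

```python
from typing import List, Sequence, Tuple


def find_duplication_junctions(spacer_order: Sequence[int]) -> List[Tuple[int, int]]:
    """Return adjacent spacer pairs (left, right) where left >= right.

    In a locus that follows the natural ascending order, such a pair marks
    a candidate duplication junction. It is only a candidate: known
    out-of-order spacers would be reported as well.
    """
    return [
        (left, right)
        for left, right in zip(spacer_order, spacer_order[1:])
        if left >= right
    ]


# Toy locus modelled on the case described in the text: the region
# spacer 14 to spacer 21 is duplicated, and the second copy lacks spacer 16.
toy_locus = list(range(13, 22)) + [14, 15, 17, 18, 19, 20, 21, 22]
junctions = find_duplication_junctions(toy_locus)
print(junctions)
```

Note that the second occurrence of spacer 35 after spacer 41, mentioned earlier in the text, would also be flagged by this simple test, so flagged junctions need manual interpretation.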
- Final reconstruction taking into account IS6110 insertions was performed manually.
- In some samples, contig reconstruction was confirmed by retrieving the identity of the spacer downstream of the last spacer of a duplication.
- When one side of the CRISPR could not be automatically recovered, for instance due to an IS6110 insertion with a single end found in the catalog of CRISPR locus sequences, a stepwise manual search for the neighbouring sequences was performed until recovery of the other IS6110 end.
- undergraduate students who contributed to the start of the MTC CRISPR genome projects in the team, are warmly acknowledged.
- Nucleotide sequence of the iap gene, responsible for alkaline phosphatase isozyme conversion in Escherichia coli, and identification of the gene product.
- Macro-geographical specificities of the prevailing tuberculosis epidemic as seen through SITVIT2, an updated version of the Mycobacterium tuberculosis genotyping database.
- Significance of the identification in the horn of Africa of an exceptionally deep branching Mycobacterium tuberculosis clade.
- et al .
- An ancestral lineage of the.
- Genetic variation and evolutionary origin of the direct repeat locus of Mycobacterium tuberculosis complex bacteria.
- Exhaustive reconstruction of the CRISPR locus in M.
- IS6110, an IS-like element of Mycobacterium tuberculosis complex.
- A novel pathogenic taxon of the Mycobacterium tuberculosis complex, Canetti: characterization of an exceptional isolate from Africa.
- The use of microbead-based spoligotyping for Mycobacterium tuberculosis complex to evaluate the quality of the conventional method: providing guidelines for quality assurance when working on membranes.
- Genomic history of the seventh pandemic of cholera in Africa.
- Characterization of IS1547, a new member of the IS900 family in the Mycobacterium tuberculosis complex, and its association with IS6110.
- Evolutionary relationships amongst isolates of Mycobacterium tuberculosis with few copies of IS6110
