
Exploring short k-mer profiles in cells and mobile elements from Archaea highlights the major influence of both the ecological niche and evolutionary history


Summary

- Their utility in the study of the mobilome has recently emerged and they seem a priori adapted to the patchy gene distribution and the lack of universal marker genes of viruses and plasmids.
- Archaea is one of the three domains of life.
- At a finer taxonomic level, the influence of the taxonomy and the environmental constraints on 5-mer profiles was very strong.
- This enabled us to identify one previously known and one new case of recent host transfer based on the atypical composition of the mobile elements involved.
- 4 Institut Pasteur, Unité de Virologie des Archées, Département de Microbiologie, 25 Rue du Docteur Roux, 75015 Paris, France Full list of author information is available at the end of the article.
- This finding raised questions regarding the evolutionary significance of this concept and of the underlying mechanisms [4].
- In the present work, we focused specifically on the cells and mobile elements from Archaea, one of the three domains of life.
- We used multivariate and statistical analyses to explore the dataset structure and identify some key structuring factors, namely, the taxonomic classification, the genomic GC content, the ecological niche and, for mobile elements, the taxonomy of the host.
- We first noticed from the dendrogram obtained by hierarchical clustering that the sequences were distributed into two main clusters according to GC content values, suggesting a major influence of the GC content on the k-mer distribution (Fig.
- In particular, all members of the class Halobacteria were located in a single cluster (Fig.
- Similarly, 33 out of 37 members of the order Methanosarcinales were gathered in a single cluster (Fig.
- Members of the order Sulfolobales were divided into a major cluster (31 genomes out of 39) and a minor cluster (8 genomes out of 39) (Fig.
- The 17 members of the order Methanococcales were divided into two neighboring clusters (Fig.
- To quantify the relative contribution of the taxonomy and of the GC content to the 5-mer composition, we performed a permutational multivariate analysis of variance (PERMANOVA) (Additional file 1).
- it alone explained 75.94% of the cell profile dissimilarity variance (model: D 5_cells ~ Genus), compared to 7.06% for phylum (D 5_cells ~ Phylum) and 17.74% for genus, when the effect of the phylum and order was first removed (D 5_cells ~ Phylum*Order*Genus).
- Notably, the GC content alone contributed almost as much to the variance (69.10%, D 5_cells ~ GC%) as the taxonomic rank of the order (D 5_cells ~ order).
- These last two factors appeared to be highly dependent, explaining 56.71% of the cell dissimilarity variance (D 5_cells ~ order*GC%) in an indistinguishable manner.
- Despite the strong influence of the taxonomy, the global topology of the dendrogram obtained by hierarchical clustering was inconsistent with the phylogeny of archaea.
- We therefore assumed that major properties of the environmental niches could be another important factor underlying the 5-mer composition among archaea.
- Among the 6 main clusters of the dendrogram for cells (Fig.
- Indeed, cytoplasmic pH regulation does not fully compensate for the decrease in intracellular pH in acidic environments: the intracellular pH in acidophiles is higher by approximately 3 to 4 points than that of the surrounding acidic environment, but on the whole, it is still lower than that in neutrophiles [38].
- Based on PERMANOVA, the “Niche” categories explained 64.17% of the dataset variance (D 5_cells ~ Niche).
- In particular, the last two factors explained 60.56% of the cell profile dissimilarity variance in an indistinguishable manner (D 5_cells ~ Order*Niche), consistent with the strong links between the ecological niche and the evolutionary history in Archaea.
- Overall, a limited number of factors are therefore sufficient to explain the differences in 5-mer composition of the archaeal cell genomes included in our study.
- Consistent with this observation, the taxonomy of the host at the order level explained only 57.36% of the extrachromosomal element dissimilarity variance (Additional File 3, D 5_mobile ~ Host order), compared to 75.94% for the cells.
- the direction of these shifts in GC content varied greatly according to the host’s taxonomy (at the order level) and to the type of extrachromosomal element (Additional File 4).
- Since the GC content had a strong global influence on the obtained pattern (45.13% of the variance, Additional File 3, D 5_mobile ~ GC.
- Similar to cells, the host taxonomy (at the order level) and the genomic GC-content were highly interdependent factors for extrachromosomal elements (Additional File of the dissimilarity variance was explained indistinguishably by these two factors (D 5_mobile ~ Host Order*GC% and D 5_mobile ~ GC.
- Interestingly, the taxonomic classification of viruses and plasmids was by far the most influential factor, alone explaining 68.30% of the extrachromosomal element dissimilarity variance (Additional File 3, D 5_mobile ~ Family).
- This could be due partly to the high number of viral and plasmid families in the dataset (60 compared to only 11 different host orders), which must support a better fit of the model.
- The extrachromosomal element family and the taxonomy of their hosts at the order level were strongly dependent, since 51.90% of the extrachromosomal element dissimilarity variance was explained indistinguishably by one of the factors (Additional File 3, D 5_mobile ~ Host Order*Family and D 5_mobile ~ Family*Host Order).
- A significant but weaker influence of the ecological niche on the 5-mer composition of archaeal extrachromosomal elements.
- The consistency of the 5-mer profile distribution with the “Niche” was lower than that for cells: the “Niche” explained 50.12% of the dissimilarity variance from the.
- As we observed for cells, the information about the “Niche” was almost fully included in the host taxonomic classification, since the “Niche” explained only 1.16% of the extrachromosomal element dataset variance when the influence of host taxonomy was first removed (Additional File 3, D 5_mobile ~ Host Order*Niche).
- A statistical model combining the genomic GC content, the ecological niche and the taxonomy of the host explained 70.85% of the profile dissimilarity variance (Additional File 3, D 5_mobile ~ Niche*Host Order*GC.
- Considering the strong association between the ecological niche and the 5-mer profile distribution, we decided to identify some of the most discriminant 5-mers between halophilic and nonhalophilic entities on the one hand, and between hyperthermophilic versus nonhyperthermophilic entities on the other.
- Consistent with this, the average frequency of the ten most discriminant 5-mers was significantly different between halophiles and nonhalophiles (Mann-Whitney-Wilcoxon test, p <.
- Indeed, each of the ten discriminant 5-mers identified for the cells also had significantly different frequencies within extrachromosomal elements (Mann-Whitney-Wilcoxon test, p <.
- However, only 4 out of the 10 most.
- Eight of the 10 most discriminant 5-mers identified by PLS-DA (Additional file 16) had significantly different frequencies between the two groups (Mann-Whitney-Wilcoxon test, p <.
- Seven of the 10 most discriminant 5-mers identified for the cells also had significantly different levels in extrachromosomal elements (Additional file 18), indicating that the signatures of archaeal cells and extrachromosomal elements with respect to hyperthermophily are similar without being strictly identical.
- We first compared, for halophiles and hyperthermophiles, the 10 most discriminant 3-mers of the whole-genome sequences to their 10 most discriminant codons (Table 2).
- In each case, several of the most discriminant codons were also present among the most discriminant 3-mers of the whole genome sequences (Table 2, underlined words), which supported, as expected, the link between codon frequencies and 3-mer composition in archaea and their extrachromosomal elements.
- However, the majority of the most discriminant codons for hyperthermophily that we identified (Table 2) were not detected as differentially abundant in [44].
- In archaea and bacteria, the nature of the discriminant codons is likely influenced by proteomic adaptation to temperature [45].
- For each 5-mer, the profile value consisted of an exceptionality score, reflecting how unexpectedly frequent or rare this 5-mer is, considering the codon composition of the sequence.
- The influence of the niche was much lower on this new type of profile, decreasing from 64.22 to 41.75% for archaeal cells (D 5_cells ~ Niche and D 5_cells_e ~ Niche) and from 51.35 to 17.81% for mobile elements (D 5_mobile ~ Niche and D 5_mobile_e ~ Niche).
- The strong influence of the ecological niche on the 5-mer profiles is thus significantly but not exclusively explained by codon frequencies.
- Joint analysis of plasmid, viral and cellular genomes from Archaea highlights the influence of coevolution and of the extrachromosomal element families on 5-mer profiles. To visualize a dendrogram encompassing both archaeal cells and their extrachromosomal elements, we created a smaller subset by randomly selecting approximately half of the sequences in each category (cell, virus and plasmid) and we jointly analyzed the corresponding 5-mer profiles.
- It was less clear for the orders Methanobacteriales, Thermoproteales and Desulfurococcales, as well as Marine Group II, which were more dispersed at various locations of the dendrogram.
- While this trend of 5-mer profile similarity between extrachromosomal elements and hosts has its exceptions, it still highlights the influence of the coevolution between hosts and their mobile elements on their short k-mer composition.
- Within each of the 4 abovementioned groups for which the association was the strongest (the class Halobacteria and orders Sulfolobales, Thermococcales, and Methanococcales), the cell and extrachromosomal element branches were not fully intertwined.
- This is particularly well illustrated by the case of the Sulfolobales order (Fig.
- Using PERMANOVA, it appeared again that the genomic GC content and the taxonomic family together explained an important proportion of the 5-mer profile dissimilarity variance of extrachromosomal elements, namely, 55.52% (Additional file 20, D 5_mobile_halo.
- By contrast, the taxonomy of the host explained only a very limited proportion of the variance, 5.28%, consistent with the loss of phylogenetic signal from the hosts within the class Halobacteria (Additional file 20, D 5_mobile_halo ~ Host order*Host genus).
- Finally, HHTV-1 (Caudovirales order) was one of the outermost elements in the haloarchaea dendrogram (Fig.
- In the Sulfolobus-Acidianus cluster (Fig.
- This general pattern appeared once more to be partly linked to the GC content of the sequences (Fig.
- 5a, SSV-m1425, SSV-ls215 and SSV-yg5714): their sequences were less GC rich than those of the other.
- Finally, 12 out of the 13 pNOB8-like conjugative plasmids clustered together (Fig.
- Based on PERMANOVA, the viral and plasmid families together with the genomic GC content explained 77.68% of the 5-mer profile dissimilarity variance among Sulfolobales mobile elements (Additional file 23, D 5_mobile_sulfo ~ Family*GC%).
- For each pair of elements, the number of shared genes was divided by the length of the shorter genome of the pair.
- We identified a total of 51 outlier plasmids and viruses (Additional File 2) by combining a systematic approach (see Materials and Methods) and visual examination of the dendrograms.
- These elements had unexpected 5-mer compositions compared to the average in their taxonomic group or the 5-mer composition of their hosts.
- One of the previously described interorder host transfer events was indeed visible by PCA (Fig.
- We then considered more closely the 13 pNOB8-like Sulfolobales conjugative plasmids because in a previous version of the dataset, two pNOB8-like plasmids, namely, pMGB1 and pTC, were located close to Metallosphaera genomes, far from the main pNOB8-like cluster (Additional File 25).
- Our results were mostly consistent with previous studies, but they provide a different view since most of the latter focused on amino acid composition.
- For virions in particular, it would be interesting to determine whether the composition results exclusively from the coevolution with the hosts or whether other selective pressures are exerted, for instance on the packaging structure properties during the extracellular stage, corresponding to a more direct effect of the extracellular environment.
- Halobacteria members and their extrachromosomal elements showed a very strong signature at all studied levels: GC content, 5-mer and 3-mer compositions of the whole genome sequences and codon composition.
- We therefore may have missed some of the compositional changes that start to occur at lower temperatures.
- According to the literature, such differences could be explained by the presence of tRNA genes in the mobile element genome, enabling the uncoupling of codon usage constraints of the hosts from those of the mobile element [14, 48].
- by a large genome size of the mobile element, which is indicative of a more autonomous replication cycle [14].
- or by a recent acquisition by the host, such that the composition of the mobile element has not yet undergone host adaptation [31].
- In the literature, the influence of the host range and mode of transmission have been proposed, such as frequent changes of hosts [31] or.
- Comparison of pMGB1, a Sulfolobus plasmid of the pNOB8-like conjugative family, with a selected region of the Metallosphaera sedula DSM 5348 genome, showing the intergenus transfer.
- For plasmidions [62, 63] and viruses, additional constraints linked to packaging or structure can be imagined, in relation to but not limited to the properties of the extracellular environment.
- [35] mentioned that the difference in codon usage between chromosomes I and II of Haloarcula marismortui must be linked to the more recent acquisition of the second chromosome.
- Even if our study covers a single domain of life, our observations suggest that the size of the mobile elements (plasmid or viruses) might be in fact the most important factor determining its importance in the evolutionary relationships with hosts.
- Our study provides a useful framework for the interpretation of k-mer approaches applied to cell or extrachromosomal elements of the domain Archaea.
- Presentation of the dataset and of the approach.
- Additional file 4 provides a synthetic view of GC% values across the dataset, according to the taxonomic order of the host and to the type of element.
- The dataset covered 3 and 8 orders of the phyla Crenarchaeota and Euryarchaeota, respectively.
- The profiles of the different genomes were then combined across the dataset to obtain two distinct matrices, one for each type of profile.
- The first type of profile was based on the 5-mer frequencies of the whole genome sequences.
- The obtained count data were imported into R [71] (version 3.4.2) and transformed into a frequency matrix to obtain normalized data: for each genome, the sum of the 5-mer frequencies was equal to 1.
- it reflected the exceptionality of the different 5-mers in the coding regions after correcting for differences in codon composition in the studied genome.
- Statistical analyses of the profiles based on 5-mer composition.
- PCAs were performed with the dudi.pca function of the ade4 package [73], on scaled and centered data.
- PERMANOVAs of Euclidean distance matrices were conducted with the adonis function of the vegan.
- PERMANOVA assumes that 5-mer profiles respond linearly to changes in the covariates and that the variance of profiles is comparable across conditions of the data.
- The EGN parameters were set as follows: e-value threshold of 1e-05, hit identity threshold of 30%, hit coverage of the shortest sequence of 60%, hit coverage of both sequences of at least 30%, minimal hit length of 20 amino acids, best reciprocity threshold of 10%.
- These values were subsequently normalized by dividing them by the smallest genome length of the concerned pair.
- For each viral or plasmid family, the distance of each element’s 5-mer profile to the profile barycenter of the considered family was calculated.
- With this approach, implemented by a homemade R script, 18 outliers were identified, of which 3 were removed after visual examination of the 5-mer frequency-based dendrograms.
- 5 Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198 Gif-sur-Yvette, France.
- A glimpse of the genomic diversity of haloarchaeal tailed viruses.
- Bipartite network analysis of the archaeal virosphere: evolutionary connections between viruses and capsidless mobile elements.
- The global distribution and evolutionary history of the pT26-2 archaeal plasmid family.
- The genome of the square archaeon Haloquadratum walsbyi: life at the limits of water activity.
- Comparative analysis of the mosaic genomes of tailed archaeal viruses and proviruses suggests common themes for virion architecture and assembly with tailed viruses of bacteria.
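
The normalization described in the summary (5-mer counts converted into frequencies that sum to 1 for each genome) can be sketched as follows. The original analysis was run in R, so this Python version is only illustrative. The function name `kmer_frequencies`, the single-strand counting and the skipping of windows that contain non-ACGT characters are assumptions, since the summary does not say how these points were handled.

```python
from collections import Counter
from itertools import product

VALID_BASES = set("ACGT")


def kmer_frequencies(sequence, k=5):
    """Return a dict mapping every possible k-mer to its relative frequency."""
    sequence = sequence.upper()
    counts = Counter()
    for i in range(len(sequence) - k + 1):
        word = sequence[i:i + k]
        # Assumption: windows with ambiguous bases (e.g. N) are skipped.
        if set(word) <= VALID_BASES:
            counts[word] += 1
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no valid k-mer")
    # Enumerate all 4**k words so that every genome yields a profile
    # of the same length, including k-mers that were never observed.
    all_words = ("".join(letters) for letters in product("ACGT", repeat=k))
    return {word: counts[word] / total for word in all_words}


profile = kmer_frequencies("ACGTACGTAC")
```

Stacking one such profile per genome, with the 4^5 = 1024 possible 5-mers as columns, would give a frequency matrix of the kind the summary describes.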
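
The percentages quoted throughout the summary (for example, the share of dissimilarity variance explained by the niche) come from PERMANOVA on Euclidean distance matrices, run with `adonis` from the R package vegan. As a rough illustration of what such a percentage means, the sketch below computes the explained share for a single factor as 1 minus the within-group sum of squares over the total sum of squares, with a label-permutation p-value. It is a simplified one-factor version under stated assumptions: the function names are hypothetical, and it does not reproduce the sequential multi-factor models (such as `Phylum*Order*Genus`) used in the paper.

```python
import math
import random


def sum_of_squares(profiles, indices):
    """Sum of squared pairwise Euclidean distances, divided by the group size."""
    total = 0.0
    for pos, i in enumerate(indices):
        for j in indices[pos + 1:]:
            total += math.dist(profiles[i], profiles[j]) ** 2
    return total / len(indices)


def permanova_r2(profiles, labels):
    """Share of the dissimilarity variance explained by one grouping factor."""
    ss_total = sum_of_squares(profiles, list(range(len(profiles))))
    if ss_total == 0:
        raise ValueError("all profiles are identical")
    groups = {}
    for index, label in enumerate(labels):
        groups.setdefault(label, []).append(index)
    ss_within = sum(sum_of_squares(profiles, members) for members in groups.values())
    return 1.0 - ss_within / ss_total


def permanova_pvalue(profiles, labels, n_perm=999, seed=0):
    """Permutation p-value; with fixed group sizes, ranking by R2 equals ranking by pseudo-F."""
    rng = random.Random(seed)
    observed = permanova_r2(profiles, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        # Small tolerance so that float rounding does not hide exact ties.
        if permanova_r2(profiles, shuffled) >= observed - 1e-12:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


toy_profiles = [[0, 0], [0, 1], [10, 10], [10, 11]]
r2, p = permanova_pvalue(toy_profiles, ["a", "a", "b", "b"], n_perm=99)
```

In this toy example the two groups are well separated, so the factor explains almost all of the variance. With only four profiles, however, very few distinct permutations exist and the p-value cannot become small.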
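
The summary reports Mann-Whitney-Wilcoxon tests comparing the frequencies of discriminant 5-mers between groups (halophiles versus nonhalophiles, hyperthermophiles versus nonhyperthermophiles). A minimal sketch of that test is given below, assuming a two-sided alternative, midranks for ties and a normal approximation without continuity correction. These choices, and the function name, are illustrative assumptions rather than the authors' exact settings.

```python
import math


def mann_whitney_u(x, y):
    """Return (U statistic of x, two-sided p-value) using a normal approximation."""
    n1, n2 = len(x), len(y)
    # Pool the values, remembering which group (0 or 1) each one came from.
    pooled = sorted([(value, 0) for value in x] + [(value, 1) for value in y])
    ranks = [0.0] * len(pooled)
    tie_term = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        midrank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[k] = midrank
        t = j - i + 1
        tie_term += t ** 3 - t
        i = j + 1
    rank_sum_x = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    u = rank_sum_x - n1 * (n1 + 1) / 2
    n = n1 + n2
    mean_u = n1 * n2 / 2
    # Tie-corrected standard deviation of U under the null hypothesis.
    sd_u = math.sqrt(n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1))))
    if sd_u == 0:
        return u, 1.0
    z = (u - mean_u) / sd_u
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return u, p_value


# Hypothetical frequencies of one 5-mer in two groups of genomes.
u_stat, p_value = mann_whitney_u([1, 2, 3, 4, 5], [6, 7, 8, 9, 10])
```

For small samples an exact test would be preferable to the normal approximation used here.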
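
The gene-sharing measure mentioned in the summary (number of shared genes divided by the smallest genome length of the pair) amounts to a one-line normalization. A small sketch, in which the function name is hypothetical and the unit of genome length is left open because the summary does not state it:

```python
def normalized_shared_genes(n_shared, length_a, length_b):
    """Shared-gene count divided by the length of the shorter genome of the pair."""
    shortest = min(length_a, length_b)
    if shortest <= 0:
        raise ValueError("genome lengths must be positive")
    return n_shared / shortest


# Hypothetical pair: 12 shared genes, genome lengths 40000 and 30000.
score = normalized_shared_genes(12, 40000, 30000)
```

Dividing by the shorter genome keeps a small element that shares most of its genes with a larger one from being penalized for the size difference.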
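
The systematic outlier search described in the summary computes, for each viral or plasmid family, the distance of every element's 5-mer profile to the family barycenter. The sketch below illustrates that idea with Euclidean distances. The cutoff (mean distance plus `n_sd` standard deviations) is an assumption made for illustration only, because the summary does not state the criterion used in the authors' R script.

```python
import math
import statistics


def barycenter(profiles):
    """Component-wise mean of a list of equally long profiles."""
    n = len(profiles)
    return [sum(column) / n for column in zip(*profiles)]


def flag_outliers(family_profiles, n_sd=2.0):
    """Return the names whose distance to the family barycenter exceeds the cutoff.

    family_profiles maps an element name to its profile (list of frequencies).
    """
    names = list(family_profiles)
    center = barycenter([family_profiles[name] for name in names])
    distances = [math.dist(family_profiles[name], center) for name in names]
    # Hypothetical cutoff: mean distance plus n_sd population standard deviations.
    cutoff = statistics.mean(distances) + n_sd * statistics.pstdev(distances)
    return [name for name, d in zip(names, distances) if d > cutoff]


# Toy family: nine similar elements and one with an atypical composition.
family = {f"element_{i}": [0.5, 0.5] for i in range(9)}
family["atypical"] = [0.9, 0.1]
outliers = flag_outliers(family)
```

As in the paper, candidates flagged this way would still need visual checking against the dendrograms before being accepted as outliers.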
