
Comprehensive analysis of genetic and evolutionary features of the hepatitis E virus


Summary

- Conclusion: In this study, we estimate that the common ancestor of the modern HEV strains emerged ~ 6000 years ago, in the period following the domestication of pigs.
- evolution of the codon usage of HEV ORFs.
- Hepatitis E virus (HEV), a member of the genus Orthohepevirus in the family Hepeviridae, is a non-enveloped positive-sense RNA virus with a full-length genome of 7.2 kb [1].
- 2019 Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0).
- Full list of author information is available at the end of the article Baha et al.
- The study of codon usage patterns can provide useful insights into molecular evolution, extend our understanding of the regulation of viral gene expression, and improve vaccine design, for which the efficient expression of viral proteins may be required to generate efficient immune responses.
- Therefore, given the continuously growing number of reported HEV genome sequences, in this study we performed an up-to-date comprehensive analysis of the composition and codon usage features of HEV full genomes reported between 1982 and 2017, followed by Bayesian phylogenetic analysis to retrace the evolutionary history of HEV.
- RSCU patterns of the HEV coding sequences.
- More codon over-representation was … Table 1: Nucleotide composition of the HEV ORFs.
- The genotype-specific RSCU patterns highlight the independent evolutionary dynamics of the HEV isolates.
- Correspondence analysis of the RSCU variations in the HEV ORFs.
- The results revealed that the first and second principal axes accounted for the majority of the data inertia.
- The HEV genotypes had different codon usage biases.
- For ORF1 and ORF2, HEV strains of genotypes 1, 3 and 4 were grouped into three well-… Table 2: RSCU patterns of the HEV ORFs.
- Furthermore, the clustering of genotype 1, 3 and 4 strains was very consistent with the phylogenetic classification of the HEV complete genome reported by Smith et al.
- On the other hand, the analysis of ORF3s showed that the HEV strains were grouped into only two clusters: a cluster composed of HEV genotype 1 and 2 strains, and a cluster of the remaining strains, indicating that the RSCU values of ORF3s allow the distinction between human HEV genotypes and zoonotic genotypes (H and Z genotypes) (Fig.
- The variation of the effective number of codons among the HEV ORFs.
- To estimate the degree of codon usage bias within the three HEV ORFs, the ENC values were computed.
- Regardless of the genotype, overall mean values of …, …, and … were obtained for ORF1, ORF2, and ORF3, respectively.
- Further, the analysis of the ENC between the different genotypes revealed, as shown in Fig.
- The multi-comparison of the ENC values between the ORFs of genotypes 1, 3 and 4 revealed that all the differences were statistically significant except between the ORF2 of genotype 1 and the ORF2 of genotype 4.
- Codon usage adaptation of the HEV ORFs to different hosts.
- The CAI values range from 0 to 1, being 1 if the frequency of codon usage by the virus equals the frequency of codon usage of the reference set.
- The E-CAI server calculates the expected value of the CAI by generating 500 sequences that have nucleotide content and amino acid composition similar to the sequence of interest (in this case a given HEV ORF sequence); a Kolmogorov–Smirnov test is then applied to confirm that the generated random sequences show a normal distribution.
- The E-CAI values were used to discern whether the differences in CAI are statistically significant and arise from the codon preferences or whether they are just artifacts related to the internal biases in the G + C composition and/or amino acid composition of the query sequences.
- An N-CAI value greater than 1 indicates that the adaptation process in the codon usage is statistically significant and independent of the nucleotide and amino acid composition [14].
- Regardless of the genotype, ORF1 was significantly well adapted to Macaca fascicularis codon usage (N-CAI …), whereas ORF2 was significantly adapted to Homo sapiens (N-CAI …) and Macaca fascicularis (N-CAI …).
- Concerning the genotype-specific pattern of the N-CAI (Fig. 3b, c, d, and Additional file 5: Table S6), the results showed that for ORF2 sequences, no discriminant separation of the HEV strains was observed.
- … Fig. 3b and d is in accordance with the classification of HEV strains into human genotypes and zoonotic genotypes, which suggests that codon adaptation could play a pivotal role in viral host tropism as well as the severity of the infection (the epidemic character of the HEV genotype 1 infections).
- Similarity analysis between the codon usage bias of the HEV ORFs and the HEV hosts.
- To determine the potential influence of the codon usage patterns of the main hosts on the evolution of the codon usage patterns of HEV coding sequences, a similarity analysis was conducted.
- In this method, all 59 synonymous codons are analyzed together to estimate the similarity of the overall codon usage patterns between HEV and its host, rather than comparing codons one by one.
- The results showed that, in comparison to all hosts, ORF3 had the highest degree of similarity, followed by ORF2 and ORF1, with the strongest similarities of the three ORFs registered with Sus scrofa domestica.
- To determine whether the codon usage patterns of the HEV ORFs sequences have been shaped solely by.
- The effective number of codons (ENC) was plotted against the percentage of GC at the third codon position (GC3s) for each of the three HEV ORFs separately (Fig.
- influence of natural selection in the codon usage pattern of HEV ORF1 and ORF2.
- Fig. 3 Discriminant analysis based on the normalized codon adaptation index (N-CAI) of the HEV ORFs in relation to all the hosts.
- All three HEV ORFs were analyzed together regardless of the genotype and the data were colored according to the ORF (a).
- The results show that U and C were used more frequently than G and A in the fourfold degenerate codon families in all HEV ORFs, regardless of the genotype (Additional file 7: Figure S2).
- However, the slopes of the regression lines for ORF1, ORF2, and ORF3 were calculated to be 0.047, 0.093, and 0.082, respectively (Additional file 8: Figure S3), indicating that the influence of direct mutation pressure on the codon usage bias in ORF1, ORF2, and ORF3 was only 4.7, 9.3, and 8.2%, respectively.
- A significant correlation was observed for ORF3 sequences in genotypes 1 and 3, with a slope of the regression line of 0.29 and 0.122, giving a mutation pressure rate of 2.9 and 1.2%, and a natural selection rate of 97.1 and 98.8%, respectively.
- The data from the analysis of the 3′ end of HEV ORF1 (Fig. … Figure S5, and Additional file 11: Figure S6) suggest that the mean time of emergence of the ancestor for the major HEV genotypes infecting humans (genotypes 1, 2, 3 and 4) ranged from 644 to 738 years ago.
- This observation was noted in two of the models, the strict and uncorrelated lognormal clock models, whereas the uncorrelated exponential clock model suggested that these three genotypes shared the same common ancestor but emerged later, in the early 1800s.
- In the present study, we analyzed the codon usage patterns and the evolutionary history of the three HEV coding sequences (ORF1, ORF2, and ORF3) to shed some light on the factors governing their molecular evolution.
- The reason for these two opposite patterns in nucleotide bias is not clear, but as suggested previously, this bias could be the result of an adaptation of a common ancestor of modern HEV strains to the nucleotide composition requirements of the host during the evolutionary process [17].
- Next, we analyzed the ENC of the HEV ORFs to evaluate the extent of the codon usage bias in HEV genes.
- Fig. 5 Bayesian phylogenetic maximum clade credibility (MCC) tree for 183 sequences of HEV ORF1 (852 nt of the 3′ end).
- The numbers on the tree represent the mean age of the most recent common ancestor (MRCA) at each node (PP = posterior probability).
- It should be noted that in this latter study, besides the small number of analyzed sequences, the parity rule 2 and neutrality plot analyses were not performed and the ENC-GC3s plots were constructed for the whole HEV genomes.
- Further, we analyzed the codon adaptation of the HEV ORFs to different hosts independently of nucleotide content and amino acid composition using the normalized codon adaptation index [14].
- Following this reasoning, the adaptive genetic changes observed in the NS1 gene of Zika virus were suggested as an explanation for the emergence of ZIKV in humans and the increase in viral fitness of the Asian lineages [24].
- Moreover, the clustering of the HEV strains in the correspondence analysis and discriminant analyses (based on RSCU and N-CAI, respectively) was consistent with the genotypic classification based on HEV complete genomes proposed by Smith et al.
- Our estimation of the evolutionary rate based on the analysis of the 3′ end of HEV ORF1 (852 nt) ranged between 1… and … subs/site/year (strict and lognormal clock models).
- Herein, we dated the origin of HEV genotypes 1, 2, 3 and 4 to the end of the thirteenth century, a date which falls within the previous estimates [27].
- This raises a question: why did all the reported evolutionary models [27, 29] converge on a recent appearance of human genotypes 1 and 2? If the host of the HEV ancestor was human and the virus spread later on to other species, then intuitively one would expect the human genotypes to appear earlier and the other genotypes to emerge after adaptive changes.
- Similar results were previously reported and discussed with minor divergence on the dates that can be explained by the number of sequences included, the fragment of the genome analyzed and the evolutionary.
- In conclusion, our results suggest that the common ancestor of the modern HEV strains emerged ~ 6800 years ago, in the period following the domestication of pigs and the intensification of agriculture.
- However, further historical or fossil record findings, as well as the isolation of new HEV strains from more hosts, are needed to determine the accurate evolutionary history of HEV.
- The detailed information of the selected HEV complete genomes is listed in Additional file 1: Table S1.
- The overall frequency of occurrence of the nucleotides (A%, C%, T/U%, and G.
- Relative synonymous codon usage (RSCU) and correspondence analysis (COA).
- The RSCU values for all of the coding sequences of HEV genomes were calculated to determine the characteristics of synonymous codon usage without the confounding influence of amino acid composition and coding sequence size of the different gene samples, as proposed by Sharp and Li in 1986 [33].
- The relative synonymous codon usage is the ratio of the observed frequency of a codon to the expected frequency of that codon if all the synonymous codons for a particular amino acid were used equally.
- Next, a Spearman’s rank correlation analysis was used to identify the relationship between nucleotide composition and the first two axes (Axis 1 and Axis 2) of the COA of HEV RSCU values.
- In this study, the ENC analysis was used to quantify the absolute codon usage bias by evaluating the degree of codon usage bias displayed by the HEV coding sequences, regardless of the gene lengths and the number of amino acids.
- To statistically analyze the ENC values of the HEV ORFs among the different genotypes, a one-way ANOVA was used to compare the groups’ means.
- Then, the Games-Howell post hoc test was adopted for multiple comparison of the ENC values between the different genotypes.
- genotypes 6, 7 and 8 were included in testing the inequality of the means but only genotypes 1, 3 and 4 were included in the multiple-comparison test.
- Analysis of the codon adaptation index (CAI).
- Comparative analysis of the codon usage was implemented between HEV genes and viral hosts: humans (Homo sapiens), cynomolgus monkey (Macaca fascicularis), rhesus monkey (Macaca mulatta), wild boar (Sus scrofa), pig (Sus scrofa domestica), rabbit (Oryctolagus cuniculus) and camel (Camelus dromedarius and Camelus.
- In order to calculate the influence of complete codon usage of the main hosts on HEV ORFs codon usage, the similarity index analysis was carried out.
- Here, a_i represents the RSCU value for a specific codon among all the synonymous codons of the HEV ORF, whereas b_i is the RSCU value for the same codon of the host.
- D(A, B) shows the possible effect of the host overall codon usage on HEV ORFs with a value ranging from 0 to 1.0.
- Codon usage tables of the hosts were retrieved from the Codon Usage Database (http://www.kazusa.or.jp/codon/).
- Effect of mutation pressure and translational selection on the codon usage pattern of HEV ORFs.
- Further, analysis of the correlation between the GC contents at the first and second codon positions (GC12) and that at the third codon position (GC3) is useful to investigate the varying roles of mutational pressure and natural selection in shaping the codon usage bias of HEV genes.
- The GC-bias [G3/(G3 + C3)] at the third codon position of the four-codon amino acids (alanine, arginine, glycine, leucine, proline, serine, threonine and valine) of the entire genes was plotted against the AU-bias [A3/(A3 + U3)] at the same codon position of the same amino acids.
- The center of the plot, where both coordinates are 0.5, is the place where A = U and G = C (PR2), with no bias between the influence of mutation and selection rates (substitution rates) [38].
- The age of the most recent common ancestor (tMRCA) was estimated using the Bayesian Markov Chain Monte Carlo (MCMC) statistical framework implemented in the BEAST v1.10.4 package.
- Using a constant-size coalescent tree prior, a strict molecular clock model, which assumes a single evolutionary rate across all branches of the tree, was compared with relaxed uncorrelated clock models (lognormal and exponential), in which the rate of evolution is allowed to vary among branches.
- Information on the selected HEV sequences included in the study.
- Nucleotide composition of the HEV coding sequences.
- RSCU patterns of the HEV coding sequences, sorted by ORF and by genotype.
- Group membership prediction according to the discriminant function obtained in discriminant analysis based on the normalized codon adaptation index (N-CAI) of the HEV ORFs in relation to all the hosts.
- Analysis of the similarity index of the codon usage between HEV strains and its main hosts.
- All three HEV ORFs were analyzed together regardless of the genotype and the data were colored according to the ORF (A).
- A series of two-way ANOVAs was performed using the host as the first independent nominal variable and ORFs regardless of the genotype (A), ORF1 (B), ORF2 (C) or ORF3 (D) as the second independent nominal variable, followed by Bonferroni's post hoc test.
- Neutrality plots of the G + C content at the first and second codon positions (GC1,2s; P1,2) against that at the third codon position (GC3s; P3) were constructed for all three HEV ORFs together and for individual HEV ORFs.
- Detailed Bayesian phylogenetic maximum clade credibility (MCC) tree for 183 sequences of HEV ORF1 (852 nt of the 3′ end).
- The numbers on the tree represent the mean age of the most recent common ancestor (MRCA) at each node.
- RSCU: Relative synonymous codon usage.
- SB, JM, NB, and RS contributed to the interpretation of the results.
- All authors provided critical feedback and helped shape the final version of the manuscript.
- The funding body played no role in the design of the study, execution of the analyses, interpretation of the results, and writing of the manuscript.
- Consensus proposals for classification of the family Hepeviridae.
- Codon usage bias and the evolution of influenza a viruses.
- Translation-coupled violation of parity rule 2 in human genes is not the cause of heterogeneity of the DNA G+C content of third codon position
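
The RSCU definition quoted in the summary (observed codon frequency divided by the frequency expected under equal use of all synonymous codons) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function name `rscu`, the use of the standard genetic code (NCBI translation table 1), and the toy sequence are assumptions made for illustration only.

```python
from collections import Counter, defaultdict

# Standard genetic code in TCAG order (assumption: NCBI translation table 1).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

# Group sense codons by amino acid and keep only families with synonyms,
# which drops Met (ATG), Trp (TGG) and the three stop codons: 59 codons remain.
FAMILIES = defaultdict(list)
for codon, aa in CODON_TABLE.items():
    if aa != "*":
        FAMILIES[aa].append(codon)
SYNONYMOUS_FAMILIES = {aa: cs for aa, cs in FAMILIES.items() if len(cs) > 1}


def rscu(sequence):
    """Return {codon: RSCU} for every synonymous family observed in `sequence`."""
    seq = sequence.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    counts = Counter(seq[i:i + 3] for i in range(0, usable, 3))
    values = {}
    for codons in SYNONYMOUS_FAMILIES.values():
        total = sum(counts[c] for c in codons)
        if total == 0:
            continue  # amino acid absent: RSCU is undefined for this family
        for c in codons:
            # observed count / (family total / family size)
            values[c] = counts[c] * len(codons) / total
    return values


# Hypothetical toy input: four alanine codons, three GCU and one GCC.
example = rscu("GCUGCUGCUGCC")
```

In the toy input, alanine has four synonymous codons and four occurrences, so the expected count per codon is 1; a value above 1 marks a codon used more often than expected and a value below 1 marks one used less often.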
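
The similarity index D(A, B) mentioned in the summary is described there only through its terms a_i and b_i; the formula itself is missing from this extract. The sketch below assumes the form commonly used in codon usage studies, R(A, B) = Σ a_i·b_i / sqrt(Σ a_i² · Σ b_i²) and D(A, B) = (1 − R(A, B)) / 2, which may differ from what the authors actually computed. The function name `similarity_index` and the toy RSCU values are hypothetical.

```python
import math


def similarity_index(virus_rscu, host_rscu):
    """D(A, B) under the assumed cosine-based definition.

    Both arguments map codon -> RSCU value; only codons present in both
    mappings are compared.
    """
    codons = sorted(set(virus_rscu) & set(host_rscu))
    dot = sum(virus_rscu[c] * host_rscu[c] for c in codons)
    norm = math.sqrt(
        sum(virus_rscu[c] ** 2 for c in codons)
        * sum(host_rscu[c] ** 2 for c in codons)
    )
    if norm == 0:
        raise ValueError("at least one RSCU vector is all zeros")
    r = dot / norm        # cosine of the angle between the two RSCU vectors
    return (1 - r) / 2    # smaller D means more similar codon usage


# Hypothetical two-codon vectors, for illustration only.
same = similarity_index({"GCT": 1.5, "GCC": 0.5}, {"GCT": 1.5, "GCC": 0.5})
opposite = similarity_index({"GCT": 2.0, "GCC": 0.0}, {"GCT": 0.0, "GCC": 2.0})
```

Because the index is built on the whole RSCU vector, it compares overall codon usage patterns rather than individual codons, which matches the description in the summary.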
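
The neutrality-plot reasoning in the summary reads the slope of the regression of GC12 on GC3 as the relative contribution of mutation pressure, with the remainder attributed to natural selection (for example, a slope of 0.082 for ORF3 corresponds to 8.2%). A minimal sketch of that arithmetic follows; the function names and the synthetic data points are assumptions for illustration and are not taken from the study.

```python
def neutrality_slope(gc3, gc12):
    """Ordinary least-squares slope and intercept of GC12 regressed on GC3."""
    n = len(gc3)
    if n != len(gc12) or n < 2:
        raise ValueError("need two equally long series with at least 2 points")
    mean_x = sum(gc3) / n
    mean_y = sum(gc12) / n
    sxx = sum((x - mean_x) ** 2 for x in gc3)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(gc3, gc12))
    if sxx == 0:
        raise ValueError("GC3 values are all identical; slope is undefined")
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def pressure_split(slope):
    """Return (mutation pressure %, natural selection %) implied by the slope."""
    return slope * 100, (1 - slope) * 100


# Synthetic points lying exactly on a line with slope 0.082 (illustration only).
gc3_values = [0.4, 0.5, 0.6, 0.7]
gc12_values = [0.50 + 0.082 * (x - 0.4) for x in gc3_values]
slope, intercept = neutrality_slope(gc3_values, gc12_values)
mutation_pct, selection_pct = pressure_split(slope)
```

A slope near 1 would indicate that GC12 tracks GC3 closely (mutation pressure dominates), whereas a slope near 0, as reported for the HEV ORFs, points to selection as the main force.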
