
Comparison of different annotation tools for characterization of the complete chloroplast genome of Corylus avellana cv Tombul


Summary

- Due to the limited genetic information available for European hazelnut (Corylus avellana L.) and as part of a genome sequencing project, we analyzed the complete chloroplast genome of the cultivar ‘Tombul’ with multiple annotation tools.
- Results: Three different annotation strategies were tested, and the complete cp genome of C.
- Comparative genomics indicated that the cp genome sequences were relatively highly conserved in species belonging to the same order.
- Simple sequence repeat (SSR) analysis showed that there were 83 SSRs in the cp genome of cv Tombul.
- Conclusion: In this study, the complete cp genome of Corylus avellana cv Tombul, the most widely cultivated variety in Turkey, was obtained and annotated, and additionally phylogenetic relationships were predicted among Fagales species.
- 2019. Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/).
- Because of its conserved nature, the cp genome contributes to plant systematics and evolutionary studies [4–6].
- Due to the development of next-generation sequencing technology, an increasing number of WGS datasets are available for cp genome assembly.
- However, this latter method relies on the availability of a high-quality cp genome from a related species.
- Herein, we present the complete cp genome of Corylus avellana cv Tombul.
- The aim of the study was to compare different available annotation tools, develop an optimized pipeline for cp assembly and annotation from WGS sequences, and examine the cp genome structure, gene content and gene order of Turkish hazelnut.
- Therefore, we chose to generate a new annotation for one of the most commercially important Turkish hazelnut cultivars, ‘Tombul’.
- Size, gene content, order and organization of the hazelnut.
- avellana cp genome previously published in GenBank (Accession no:.
- part, starting from 161,667 bp, consisted of repeats of sequences from the rest of the Tombul chloroplast genome.
- Although a subset of reads matched these additional parts in two segments, the mapped read depth of these segments was approximately half of that of the rest of the cp genome.
- Moreover, BLAST alignment found that the additional part was 100% identical to two regions in the first 161 kb of the cv Tombul cp genome [38].
- These observations suggested that the extra 39 kb in our initial contig was an artefact of the NOVOPlasty assembly algorithm, in which the duplicated segments were incorporated twice, perhaps due to sequence variation at their boundaries.
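The two lines of evidence above (exact identity to the core sequence and roughly halved read depth) can be combined into a simple check. The following is a minimal sketch, not the authors' procedure: the function name `is_assembly_duplicate`, the per-base `depth` list, and the `max_ratio` cutoff of 0.6 are all illustrative assumptions, and reverse-complement matches are not considered.

```python
from statistics import mean

def is_assembly_duplicate(contig: str, depth: list[int], core_len: int,
                          segments: list[tuple[int, int]],
                          max_ratio: float = 0.6) -> bool:
    """Return True if every extra segment looks like a duplication artefact.

    contig    -- assembled sequence (core followed by the extra part)
    depth     -- mapped read depth per base of the contig
    core_len  -- length of the trusted core (e.g. the first 161,667 bp)
    segments  -- 0-based, end-exclusive coordinates of the extra segments
    max_ratio -- hypothetical cutoff for segment depth / core depth
    """
    core = contig[:core_len]
    core_depth = mean(depth[:core_len])
    for start, end in segments:
        # Evidence 1: the segment is an exact (forward-strand) copy of core sequence.
        if contig[start:end] not in core:
            return False
        # Evidence 2: the segment is covered at a clearly reduced depth.
        if mean(depth[start:end]) / core_depth > max_ratio:
            return False
    return True

# Toy example: a 13 bp core followed by a 6 bp copy covered at half depth.
toy_contig = "ACGTTGCAGGCTA" + "TGCAGG"
toy_depth = [100] * 13 + [50] * 6
print(is_assembly_duplicate(toy_contig, toy_depth, 13, [(13, 19)]))
```

In practice the identity test would come from a BLAST alignment and the depth from a read-mapping tool; the plain substring test here only stands in for the 100% identity result reported in the text.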
- In addition, we examined whether a single circular cp genome could be retrieved using a standard whole.
- For this test, trimmed WGS sequences were assembled using the ABySS assembler [39], and then the cv Tombul cp genome constructed by NOVOPlasty and the KX822768 cp genome were mapped to these contigs of the cv Tombul genome using BLAST.
- Therefore, it was concluded that using an assembler specialized for organellar genomes is advantageous for cp genome construction.
- Further analysis was carried out using the first 161,667 bp of the genome assembly obtained from NOVOPlasty, which also showed high similarity to the KX822768 cp genome.
- Fig. 1 The chloroplast genome map of Corylus avellana cv Tombul.
- The darker gray area in the inner circle denotes GC content while the lighter gray corresponds to the AT content of the genome.
- The overall GC content of the cv Tombul cp genome was 36.40%, and the GC contents of the LSC and SSC regions were 34.17 and 30.25%, respectively.
- The GC content of the IR region, at 42.37%, was much higher than that of the LSC and SSC regions, due to its relatively abundant GC-rich tRNA and rRNA genes.
- These agreed with each other for the majority of the content and order of genes [40–43].
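As an illustration of how such per-region GC percentages are computed, here is a minimal sketch. The function names, the toy sequence, and the region coordinates are assumptions for demonstration only; they are not the real LSC, IR, or SSC boundaries of the cv Tombul genome.

```python
def gc_content(seq: str) -> float:
    """Return the GC percentage of a DNA sequence, ignoring non-ACGT symbols."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    if gc + at == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    return 100.0 * gc / (gc + at)

def region_gc(genome: str, regions: dict[str, tuple[int, int]]) -> dict[str, float]:
    """GC percentage per named region; coordinates are 0-based and end-exclusive."""
    return {name: gc_content(genome[start:end])
            for name, (start, end) in regions.items()}

# Toy genome with an AT-rich "LSC", a GC-rich "IR" and an AT-rich "SSC".
toy_genome = "ATATATATAT" + "GCGCATGCGC" + "ATTTATTTAA"
toy_regions = {"LSC": (0, 10), "IR": (10, 20), "SSC": (20, 30)}
print(round(gc_content(toy_genome), 2))
print(region_gc(toy_genome, toy_regions))
```

With a real genome, the sequence would be read from a FASTA file and the region boundaries taken from the annotation.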
- Generally, genes were included in the final map when at least 2 of the tools gave matching predictions.
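The "at least 2 of the tools" rule can be expressed as a simple vote over the gene sets reported by each tool. This is a minimal sketch under the assumption that predictions are compared by gene name only; the study also compared gene locations and applied manual curation, which this does not capture. The function name `consensus_genes` and the toy gene lists are illustrative.

```python
from collections import Counter

def consensus_genes(predictions: dict[str, set[str]], min_support: int = 2) -> set[str]:
    """Return genes reported by at least `min_support` annotation tools."""
    counts = Counter(gene for genes in predictions.values() for gene in genes)
    return {gene for gene, n in counts.items() if n >= min_support}

# Toy predictions (hypothetical, not the study's actual tool output).
toy_predictions = {
    "GeSeq": {"psbA", "rbcL", "matK"},
    "DOGMA": {"psbA", "rbcL"},
    "cpGAVAS": {"psbA", "matK", "orfX"},
}
print(sorted(consensus_genes(toy_predictions)))
```

Genes supported by a single tool (here the made-up `orfX`) are left out and would need manual inspection.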
- A total of 125 predicted functional genes were encoded within the Corylus avellana cv Tombul cp genome.
- A ycf-like gene was also reported in the IRB region, one of the two IRs, by two annotation tools, DOGMA and GeSeq, but it was a truncated fragment of the ycf1 gene and was thus not included in the genome map.
- Of the 76 unique protein-coding genes, five genes (atpF, ndhA, ndhB, rpl2, and rpoC1) contained one intron, while two protein-coding genes (clpP and ycf3) contained two introns each.
- Several nucleotide alterations are required to provide functional start codons in a handful of the genes annotated in the present study (Additional file 1: Table S3).
- Comparing the results of the annotation tools, ten genes (atpF, clpP, ndhA, ndhB, ndhK, petA, rpl2, rpoC1, ycf3, ycf15) were erroneously reported twice as 2 gene fragments by DOGMA and GeSeq, whereas they were correctly reported as a single gene containing an intron by cpGAVAS (Additional file 1: Table S9).
- When the annotated genes were compared with those previously reported in other species’ chloroplast sequences, the GeSeq tool gave the most accurate results for gene locations, including the start and end points of the CDS.
- All of the genome and annotation information is shown in Fig.
- Based on a sequence similarity search of the whole genome, the C.
- In addition, Carpinus and Ostrya families also showed high similarity with the cv Tombul cp genome, with nearly 98.91 and 99.21% identity, respectively.
- The similarities and differences of the cp genome between C. avellana cv Tombul and other species, including representatives of the Malpighiales, Fabales and Brassicales, were determined by a global alignment program, mVISTA [48].
- avellana cv Tombul as a reference (Fig.
- Tombul had a similar cp genome size to the other species, which range from 152,217 bp to 161,303 bp (the Tombul cp genome size is 161,667 bp).
- Furthermore, a region was detected in the cv Tombul cp genome from ~68 to 69 kb that was conserved with KX822768 but with none of the other species presented in the global alignment.
- This region contained duplicates of the psbF, psbJ and psbL genes from the adjacent region, and an unprocessed petA gene.
- According to the MISA web tool, a total of 83 SSRs were identified in the cv Tombul cp genome [49].
- While most of the mononucleotides were composed of A/T (90.9.
- most of the dinucleotides were AT/.
- The complete cp genome sequences of 22 species from the order Fagales were obtained from the NCBI and used for phylogenetic analysis, including representatives of genera of Betulaceae, Fagaceae, and Juglandaceae.
- As chloroplast protein sequences showed high similarity among related species, the phylogenetic analysis was carried out using the whole cp genome sequences.
- This study reported a complete cp genome sequence of Corylus avellana cv Tombul, annotated by different available annotation tools.
- Initially, the de novo assembler NOVOPlasty was used to reconstitute the Tombul cp genome (Fig.
- The comparison of the contig with the KX822768 cp genome, published in GenBank, indicated that the last part of the sequence bp), was nearly identical to other segments of the Tombul chloroplast genome.
- Therefore, we considered the possibility that the cp genome of cv Tombul could be physically larger than the reported C. avellana cp genome.
- However, BLAST results indicated that this part consisted of two segments, each of which was 100% identical to a region in the first 161 kb of the cv Tombul cp genome (Additional file 1: Figure S2) [47].
- Furthermore, the mapped read depth of the duplicated segments was approximately half of that of the rest of the cp genome.
- Hence, we concluded that the additional 39 kb was an artefact of the NOVOPlasty assembly algorithm.
- Further analysis was carried out using the first 161,667 bp of the genome assembly.
- Comparison of methods for annotation of cp genome for cv Tombul.
- The cv Tombul cp genome presented similar characteristics to other angiosperm cp genomes.
- While the general characteristics of the cv Tombul cp genome are highly consistent with KX822768, a few differences were detected at the gene level.
- sequences were predicted for these genes in the cv Tombul cp genome.
- Furthermore, the genes accD, psbM, and trnI-GAT were not annotated in KX822768, but they were present in the cv Tombul cp genome.
- Lastly, psbF, psbJ and psbL genes were found twice in the cv Tombul cp genome (Additional file 1: Table S7).
- The length of the cv Tombul cp genome was found to be similar to other Corylus and Quercus species, but a difference was indicated with Populus and Juglans species [53–57].
- The annotation of the cv Tombul cp genome was carried out using three different tools, cpGAVAS, DOGMA and GeSeq [40–43].
- Because DOGMA is not suitable for defining the start and end of exons, its annotation needs manual editing; in addition, this tool does not support identification of the IR region.
- If a cp genome belonging to a closely related taxon is available, the GeSeq annotation tool is the most useful for the analysis.
- In other cases, annotation with both GeSeq and cpGAVAS, followed by comparison of the results from both tools, provides the most precise information about functional genes and locations with minimal configuration.
- Fig. 4 Flow chart describing the optimized bioinformatics pipeline for cp genome assembly.
- species, indicating that the cv Tombul cp genome contains largely the same coding genes, tRNAs and rRNAs.
- However, the length of the cp genome from cv Tombul differed slightly from the published sequence KX822768 in GenBank, from which it could be inferred that some genetic differences exist even between cultivars.
- avellana cv Tombul cp genome by using WGS sequences generated as part of a whole genome sequencing project.
- The cp genome of cv Tombul has a typical cp genome structure, and is highly similar to other cp genomes of the Betulaceae family.
- In the future, we are considering wider cp genome sampling of other cultivated varieties, to investigate whether cultivar-specific markers exist, and the development of molecular markers for deeper information about phylogeny.
- Relative positions were manually curated according to the reference genome, and the complete cp genome for the Tombul cultivar was finally acquired for further analysis.
- Then, the cv Tombul cp genome obtained from NOVOPlasty was aligned to the ABySS contigs using BLAST.
- The Tombul cp genome was annotated through three different online programs, GeSeq, CpGAVAS and DOGMA with default parameters [40–43].
- MEGA pairwise alignment was additionally used to confirm the genes among closely related taxa, and the gene locations were verified from cv Tombul cp genome sequences.
- The complete cp genome sequences of 22 species from Fagales were used for phylogenetic analysis, including representatives of genera from the Betulaceae, Fagaceae, and Juglandaceae.
- The bootstrap consensus tree inferred from 500 replicates was taken to represent the evolutionary history of the taxa analyzed.
- BLAST result of the cv Tombul chloroplast genome against Viridiplantae (best 100 hits).
- Simple sequence repeats within the cv Tombul chloroplast genome.
- Differences between cv Tombul and KX822768 cp genome published in GenBank.
- Comparative analysis of the complete chloroplast genome sequences in psammophytic Haloxylon species (Amaranthaceae).
- An assessment of the genetic diversity and population genetic structure concerning the Corylus heterophylla Fisch., grown in the Tieling district of Liaoning province, using SSR markers.
- ISSR variation in olive- tree cultivars from Morocco and other western countries of the Mediterranean Basin.
- Power and limitations of the chloroplast trnL (UAA) intron for plant DNA barcoding.
- Phylogeny and evolution of the Betulaceae as inferred from DNA sequences, morphology, and paleobotany.
- The complete chloroplast genome sequence of the endangered Chinese endemic tree Corylus fargesii.
- Comparative analysis of the complete chloroplast genomes of five Quercus species.
- Characterization of the complete chloroplast genomes of five Populus species from the western Sichuan plateau, southwest China: comparative and phylogenetic analyses.
- The complete chloroplast genome sequence of the wild Chinese chestnut (Castanea mollissima).
- The complete sequence of the Acacia ligulata chloroplast genome reveals a highly divergent clpP1 gene.
- The complete chloroplast genome sequence of the medicinal plant Salvia miltiorrhiza.
- A phylogenetic analysis of Hydrangeaceae based on sequences of the plastid gene matK and their combination with rbcL and morphological data.
- Estimation of the number of nucleotide substitutions in the control region of mitochondrial DNA in humans and chimpanzees
