
Genomic tools for durum wheat breeding: De novo assembly of Svevo transcriptome and SNP discovery in elite germplasm


Summary

- Background: The tetraploid durum wheat (Triticum turgidum L.
- A deeper characterisation of the molecular and functional diversity of the durum wheat transcriptome will be instrumental to more effectively harness its genetic diversity.
- Results: We report on the de novo transcriptome assembly of durum wheat cultivar ‘Svevo’.
- The presence of differential expression between the A- and B-homoeolog copies of the durum wheat tetraploid genome was ascertained by phase reconstruction of polymorphic sites based on the T..
- Conclusions: Our study updates and expands the de novo transcriptome reference assembly available for durum wheat.
- Out of 180,108 assembled transcripts, 13,636 were specific to the Svevo cultivar as compared to the only other reference transcriptome available for durum, thus contributing to the identification of the tetraploid wheat pan-transcriptome.
- 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0.
- Linussio 51, 33100 Udine, Italy Full list of author information is available at the end of the article.
- Durum wheat (Triticum turgidum L.
- For the above reasons, de novo assembly of the transcriptome is essential for the identification of candidate genes, the development of SNP markers and genomic analyses.
- Moreover, particularly important is the correct identification, hence separation, of the homoeolog sequences.
- [8] to obtain a high-quality transcriptome assembly of the durum wheat cultivar Kronos.
- An evaluation of the assembly showed that 96% of a benchmark full-length cDNA dataset [11] is assembled in a single contig.
- Herein we report a de novo assembly of the tetraploid wheat transcriptome of cultivar Svevo as a complement to the reference transcriptome from cv.
- A detailed summary of the de novo assemblies is given in Additional file 3: Table S3.
- In CLC, the use of the longest available k-mer size (k = 64 bp) provided the best performance in terms of contig length, with values close to the expected length distribution for transcripts/genes (Additional file 4: Figure S1).
- The contig number obtained from the analysis of pooled reads was higher than those obtained with any of the four plant combinations of tissues analysed separately, while maintaining an N50 value comparable to that obtained for the best organ-specific assembly, i.e.
- Gene completeness of the de novo assembly was estimated based on two independent samples of validated wheat genes using a procedure that included counting the number of contigs necessary to reconstruct each gene and then evaluating the corresponding percentage of reconstruction.
- To assess the complete reconstruction of the two homoeolog genes, a set of 58 T.
- Out of the remaining 626,888 SNPs that were classified as varietal-SNPs, 33,747 were single dose, locus-specific SNPs (5.38% of the varietal SNPs, with Mendelian diploid behaviour) and 497,783 were considered as varietal-hemi-SNPs, 95,358 of which were non-rare SNPs (i.e.
- aestivum SNPs and used to assay a worldwide panel of 288 durum wheat elite accessions (Table 2) [17].
- The allelic frequency of the polymorphic SNPs obtained from T.
- Homoeolog-specific expression in durum wheat.
- Out of the 34,879 transcripts in the reference set, 7040 successfully generated at least one phased block of sequences.
- A cluster analysis of expression levels confirmed that most of the variability is accounted for by the tissue type, then by homoeologs.
- Overall, 1113 genes (15.81%) showed higher expression of the A-genome homoeologs, while 695 genes (9.87%) showed significant overexpression of the B-genome homoeologs.
- 5), we observed a constant trend in favour of the A-genome homoeolog.
- 1%) with contrasting homoeolog-fold-changes across tissues, where none of them were differentially expressed for each of the three tissues.
- Based on a previous survey of diversity in a worldwide panel of elite durum wheat [19].
- The sampled tissues and NGS methodology of the herein reported Svevo.
- This apparently negative result is due to the more stringent assembly parameters used herein, which allowed for a reduced redundancy of the assembly and for the identification of exons and whole genes missing in the Kronos assembly.
- One of the most commonly used tools, Trinity [20], was tested as well.
- However, after a few tests it was abandoned, since the results obtained did not significantly improve the quality of the assembly and its resource requirements were impractical.
- A SNP was considered when at least seven out of the 13 varieties were sufficiently covered.
- aestivum SNPs assessed on the durum wheat panel.
- aestivum-derived SNPs within a panel of 288 durum wheat accessions as representative of the worldwide breeding germplasm.
- Within the scope of the current study, the assemblies built with the largest k-mer values were the best, with the most contiguous sequences.
- Coverage of at least one homoeolog was estimated at 84%, but the reconstruction of both copies of homoeolog genes of the tetraploid durum was more difficult, with an estimate of 40% of sequences based on a benchmark of 58 genes from whole genome assembly.
- Genes with overexpression of the A genome are shown in green, and those with higher expression in the B genome are shown in red.
- In hexaploid wheat, the use of the emerging single-molecule real-time (SMRT) sequencing technology (Pacific Biosciences) allowed for a massive sequencing of full-length non-chimeric reads, 74.6% of which corresponded to a complete open reading frame [28].
- This technology allowed for a detailed investigation of the transcriptome of developing grains and yielded a more complete picture of the gluten gene transcript, including the identification of many pseudogene transcripts and a clear discrimination between homoeologs and paralogs.
- One of the reasons for the relatively limited number of transcripts assigned with high confidence to a very specific function is a low coverage of the matched protein by the transcript.
- Nevertheless, the transcripts annotated in this work provide valuable information towards the identification of the expressed portion of the genome in Triticum.
- Transferability of the SNP panel.
- GWAS) and breeding applications such as marker-assisted selection in the cultivated durum wheat germplasm.
- Interestingly, the frequency of the rarest simple SNPs is 11.6% higher than the frequency of the corresponding class in the hemi-SNPs.
- This may reflect multiple causes including, among others, the stringent definition given in this paper to simple SNPs, namely either the presence of both homoeolog sequences in the de novo reference with assignment of reads only to the correct chromosome or deletion of one of the homoeolog chromosomes with a consequent lack of reads from the other chromosome.
- We assayed the SNP information content by adding a subset of 7940 tetraploid SNPs to the iSelect Illumina 90 K wheat assay [17], where a considerable portion of the functional T.
- aestivum set of SNPs was also expected, considering that after polyploidisation the germplasm pools of the two species, and particularly the elite germplasm cultivated nowadays, have undergone multiple events of population size reduction, drift and selection, and novel introduction [4, 37].
- durum SNP dataset was biased towards high MAF values because of the SNP selection process that was based on SNPs that were confirmed in at least three of the 13 reference cultivars.
- Mapping RNA-seq reads on the double reference allowed us to distinguish between (i) reads mapping to one of the two alternative transcripts (i.e.
- Although previously described precautions were taken, still 15.81% of genes showed higher expression of the A-genome homoeolog versus 9.87% of the B-genome homoeolog (Fig.
- If, on one side, this indicates that about as many as 10% of the genes are more expressed in the A-genome, on the other side, it may indicate a possible subfunctionalisation of a high number of genes.
- This study presents a de novo transcriptome assembly of Svevo, a modern elite durum wheat cultivar widely used in breeding programs.
- This transcriptome assembly expands the existing publicly available reference transcriptome of the durum wheat variety Kronos and contributes novel and more complete transcriptome information to the ongoing genomics studies in tetraploid durum wheat.
- with 78% of the reconstructed sequences being functionally annotated, including GO terms and PFAM domains.
- The RNA-seq data of the 13 elite durum wheat varieties provide a relevant number of novel T.
- Using a double-reference mapping procedure, we first investigated the homoeolog-specific expression in durum wheat and validated the method.
- This latter exercise suggested that homoeolog-specific expression may contribute to the diversity across varieties and further large-scale studies may provide a better understanding of the interplay of two subgenomes across germplasm.
- The Italian durum wheat (Triticum turgidum subsp..
- Svevo, released in 1996 (CIMMYT line/Zenit), was selected for the reconstruction of the durum wheat reference transcriptome, as it has been a quality and productivity reference variety of durum wheat in Italy for more than a decade.
- Thirteen elite varieties representing the diversity of the worldwide durum germplasm (Additional file 1: Table S1), including the selected reference cv.
- Trimmed paired reads from either single tissues or from the bulk of the four tissues were initially used to create de novo transcriptome assemblies using either CLC-Genomics Workbench v5.1 (CLC Bio, Aarhus, Denmark) or Velvet [45] paired with Oases [46].
- The completeness of the assemblies was assessed in terms of percentage of gene reconstruction.
- For each gene, the cumulative length of the matching hits divided by the gene length was computed and defined as “gene coverage” (Additional file 9: Table S6, Additional file 10: Table S7).
- A more general assessment was the comparison of the transcriptome with the plants dataset of BUSCO [47].
- BUSCO provides quantitative measures of the completeness of genomes and transcriptomes in terms of expected gene content.
- Genes that make up the BUSCO sets for each major lineage were selected from orthologous groups with genes present as single-copy orthologs in at least 90% of the species.
- Annotation of the reconstructed transcriptome was based on a comparative genomics approach.
- The quality-trimmed and contaminant-free reads of the 13 varieties were independently aligned versus the selected Svevo de novo transcriptome (minimum similarity 0.8, minimum aligned length 0.9) and SNPs called with CLC-Genomics Workbench v5.1 (window length 11, maximum gap and mismatch count 6, minimum central quality 20, minimum average quality 15, minimum coverage 8, minimum variant frequency 10%, sufficient variant count threshold 1000, required variant count threshold 4).
- Each row of the table, corresponding to a hypothetical SNP in at least one variety, was then classified, with another internally developed Perl script, as either locus-specific SNP, hemi-SNP, inter-homoeolog SNP,.
- Considering that in this work only inbred lines were used, a heterozygous call supposedly corresponds to a difference between two homoeolog chromosomes and not to a difference between the two alleles of the same chromosome.
- Genotypic data from a panel of 288 durum wheat accessions representative of the worldwide breeding germplasm were used for the SNP validation.
- In the durum wheat panel, the number of total SNPs, failed SNPs and polymorphic SNPs were counted for both hexaploid wheat (T.
- aestivum) and durum wheat.
- The MAF distribution was inspected for the polymorphic SNPs in both hexaploid and durum wheat.
- Illumina RNA-seq paired reads from leaf, root and grain tissues for each of the 13 durum varieties were mapped against the new reference set containing both homoeolog-derived sequences using BWA aligner with default parameters [57].
- To do this, for each of the 13 varieties, six seeds were grown in a paper-roll in a growth chamber at controlled light and temperature conditions [60].
- In order to maximize the chances of the gene-specificity of the assay, we chose for primer design differentially expressed transcripts that had no paralogs or tandem duplications in the reference bread wheat IWGSC Refseq v1.0 [61] and wild emmer wheat genomes [14].
- at least one of the primers of each assay was designed to include the 3′-end homoeolog SNPs.
- The expression of the transcripts was normalized using the housekeeping gene actin.
- Durum wheat accessions description and RNA-Seq data.
- Accession name, pedigree, accession feature and the constitutor for the 13 varieties used for RNA sequencing and SNP discovery are reported along with the associated sequencing yields for each of the three sampled tissues.
- Summary of RNA sequencing for durum wheat cv.
- Detailed report of the de novo assemblies..
- List of the 58 wheat genes from chromosome 3 with the relative accession numbers.
- Transcriptomes of the plant species used for BLASTx.
- Results are reported for each of the thirteen varieties.
- The dataset of SNPs among the 13 durum wheat cultivars analysed in the current study is available on request from the authors, subject to a material transfer agreement (MTA).
- Megabase level sequencing reveals contrasted organization and evolution patterns of the wheat gene and transposable element spaces.
- Genetics and genomics of the Triticeae.
- A chromosome-based draft sequence of the hexaploid bread wheat (Triticum aestivum) genome.
- A chromosome conformation capture ordered sequence of the barley genome.
- Use of mRNA-seq to discriminate contributions to the transcriptome from the constituent genomes of the polyploid crop species Brassica napus.
- Deep transcriptome sequencing provides new insights into the structural and functional organization of the wheat genome.
- Draft genome of the wheat A-genome progenitor Triticum urartu.
- Genome sequence of the progenitor of wheat A subgenome Triticum urartu.
- Genome sequence of the progenitor of the wheat D genome Aegilops tauschii.
- Extended and robust protein sequence annotation over conservative nonhierarchical clusters: The case study of the ABC transporters
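
The "gene coverage" measure defined in the summary above (cumulative length of the matching hits divided by the gene length) can be illustrated with a minimal Python sketch. The function names, the 1-based inclusive coordinate convention, and the merging of overlapping hits (so that a doubly matched region is counted once) are assumptions made for this illustration; the summary does not say how overlapping hits were handled, and this is not the authors' script.

```python
from typing import Iterable, List, Tuple

# A hit is a (start, end) interval on the benchmark gene.
# Assumption: coordinates are 1-based and inclusive.
Interval = Tuple[int, int]


def merge_intervals(hits: Iterable[Interval]) -> List[Interval]:
    """Merge overlapping or directly adjacent hit intervals."""
    merged: List[Interval] = []
    for start, end in sorted(hits):
        if merged and start <= merged[-1][1] + 1:
            # Overlaps (or touches) the previous interval: extend it.
            prev_start, prev_end = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end))
        else:
            merged.append((start, end))
    return merged


def gene_coverage(hits: Iterable[Interval], gene_length: int) -> float:
    """Fraction of the gene covered by contig hits.

    Assumption: overlapping hits are merged before summing, which keeps
    the value between 0 and 1.
    """
    if gene_length <= 0:
        raise ValueError("gene_length must be positive")
    covered = sum(end - start + 1 for start, end in merge_intervals(hits))
    return covered / gene_length


if __name__ == "__main__":
    # Hypothetical 1000-bp gene matched by three contig hits, two overlapping.
    example_hits = [(1, 400), (301, 700), (901, 1000)]
    print(gene_coverage(example_hits, 1000))
```

In this hypothetical example the first two hits merge into one 700-bp stretch, so 800 of 1000 bases are covered. Summing raw hit lengths without merging would follow the wording of the definition more literally but could yield values above 1 when contigs overlap.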
