
Study of spontaneous mutations in the transmission of poplar chloroplast genomes from mother to offspring


Summary

- Each of the three resulting chloroplast assemblies contained contigs covering >.
- However, only 1 of the 94 loci was a missense mutation, which was located in the exon region of rpoC1 encoding the β’ subunit of plastid-encoded RNA polymerase.
- The genotype of the loci in NL895 and its female parent (I69) was different from that of its male parent (I45).
- Because of advances in next-generation sequencing (NGS) in terms of time and cost, and the increasing number of available chloroplast genomes, whole-genome shotgun sequencing based on NGS technology is increasingly used to construct plant chloroplast genomes, such as those of Brassica rapa and Raphanus sativus [8, 9].
- The high-quality chloroplast genomes of the three poplar clones were constructed by a combination of eight.
- Feasibility evaluation of the reference-assisted strategy: In this study, we used a hybrid strategy combining reference-assisted and de novo assembly to isolate and construct a complete chloroplast genome.
- The applicability of the reference-assisted strategy to the assembly of Populus cpDNAs must be evaluated before it is used.
- The highly conserved features of the chloroplast genomes are well-known.
- trichocarpa cpDNA (95%) as a reference was slightly higher than those of the other five Populus cpDNAs as references.
- Assembly of the simulated reads.
- To calculate the effective cost of whole-genome shotgun sequencing, it is necessary to estimate the amounts of short reads required for the generation of a high-quality assembly of the chloroplast genome.
- To assess the effects of the two factors on the chloroplast genome assembly, 12 short-read datasets for all combinations of the three different read lengths (60, 80 and 100 bp) with four different read amounts: 10 k (10^4.
- The starts and ends of the aligned blocks are labeled with transparent red points.
- including 60 bp–100 k, 80 bp–100 k, 100 bp–100 k and 60 bp–1 M, were the nearest to the genome size of the reference cpDNA (P.
- The de novo assembly of each of the 12 read datasets was performed using Minia with five k-mer values of and 59.
- Time consumption and the quality of the de novo genome assembly are the two most important aspects of the cpDNA assembly.
- To determine which level of read amount was different from the others, pairwise comparisons of the assembly time among the four read amounts were performed using.
- The running times for the 10-k and 100-k pair-read datasets were significantly different from those of the 1-M and 10-M pair reads (Fig.
- However, a large proportion of contigs for each of the 13 assemblies were far <.
- The genome fraction rates of the 13 assemblies after.
- The x-axis and y-axis represent the assembly times (in seconds) of the simulated short reads before and after PCR duplicates were filtered from the raw data, respectively.
- b Boxplot of the running times for assembling the four simulated read datasets (10^4, 10^5, 10^6 and 10^7 pairs of short reads).
- The IRb region was missing in the resulting assemblies because of the nearly 100% sequence identity between IRa and IRb on the P.
- Proportion of the chloroplast reads.
- The ratio of cpDNA reads to whole-genome sequencing reads is another key factor that contributes to the estimation of the amount of whole-genome shotgun reads required for the construction of the Populus chloroplast.
- The chloroplast reads were isolated from all of the whole-genome short reads for the I69, I45 and NL895 clones, by aligning reads against the P.
- Each of the 47 real whole-genome shotgun sequencing datasets contained more than 10^5 (100 k) pairs of chloroplast reads.
- We employed four different parameter strategies to assemble a chloroplast genome from each of the 47 chloroplast read datasets, which were isolated from whole-genome sequencing reads from the three poplar clones I45, I69 and NL895.
- trichocarpa chloroplast genome covered by the assemblies of the simulated short reads under multiple k-mer values.
- The quadripartite structure of the chloroplast genome is composed of large single-copy (LSC) and small single-copy (SSC) regions separated by a pair of inverted repeats (IRa and IRb).
- The x-axis and y-axis represent the genome assemblies for the simulated read data and the locations of the reference genome covered by contigs from the genome assemblies, respectively.
- The cpDNA genome sizes of the three poplar clones (I45, I69 and NL895) were estimated from all 47 datasets based on the k-mer distribution before the de novo assembly of the poplar chloroplast genome.
- The genome size estimates of all 47 datasets were very close to 129 kbp, which was approximately equal to the combined length of the LSC, IRa and SSC regions of the Populus chloroplast genome.
- This was in accordance with the results of the previously performed simulated-read analysis.
- multivariate ANOVA to analyze each of the assemblies generated by the six assemblers.
- The running times of the six assemblers (p < 0.001), except SGA (p = 0.07), were significantly different for multiple values of the corresponding parameters.
- The assembly times of the four k-mer-based assemblers (ABySS, Minia, SOAPdenovo and Velvet) generally increased as the k-mer values decreased.
- Thus, the running times of short-read assemblies for constructing chloroplast genomes were significantly influenced mainly by three factors: data (read) volume, assembly tool and assembly parameters.
- Additionally, the running times for SGA were longer than those of the other seven assemblers, based on the results from multiple comparison tests.
- 100, which were selected as preliminary selection criteria on the basis of the previous simulated-read analysis.
- SGA was the least successful assembler, producing eligible assemblies from only 2 of the 47 datasets.
- For the assemblies of the five assemblers IDBA, Minia, SOAPdenovo, SGA and Velvet, the CoverRatio and totalSum showed no significant differences among the different parameters.
- However, the CoverRatio and totalSum of the ABySS assemblies with certain k-mers were relatively higher than those of the other k-mers.
- For the IDBA, Minia, SOAPdenovo, SGA and Velvet assemblers, the assembly with the greatest N50 value among those generated from the same dataset under different parameters was selected as the optimum cpDNA assembly.
- The N50 values of the Edena, Minia and Velvet assemblies were far less than those of the other five tools.
- The totalSum (total base) values of the Velvet assemblies were much greater than those of the other tools, being at least 225 kbp, which greatly deviated from the 157 kbp of the Populus reference chloroplast genome.
- The CoverRatio values of the SGA and SOAPdenovo assemblies were less than 80.6%.
- Thus, the Edena, Minia, SGA, SOAPdenovo and Velvet assemblies were inferior to those of the remaining three assemblers, ABySS, IDBA and SPAdes.
- The IR sequence was easily lost in the de novo assembly of the entire Populus chloroplast genome owing to the high similarity between IRa and IRb.
- The genome size estimates of the simulated data and real data in the study were approximately 129 kbp, covering the LSC + IRa + SSC regions, and almost equal to those of the de novo assemblies.
- To further improve the quality levels of the cpDNA assemblies from the three poplar clones, we adopted the strategy of merging multiple assemblies into one to reduce errors and to extend contig lengths.
- All three super contigs were aligned to 1–129,435 bp of the P.
- The acquisition of the contigs spanning the IRb region was undertaken in three steps.
- represents the correlation matrix of the 11 metrics used for assessing poplar chloroplast genome assembly.
- totalSum ’ (the total base of contigs.
- mean ’ (the mean length of contigs.
- In addition, the partial sequence of the IRb region in the I45 and I69 cpDNA was the complete reverse complement of its counterpart in the IRa region.
- Comparison of the chloroplast genomes.
- The vast majority of SNP loci (181) were located in the LSC and SSC regions of the cpDNAs, but only 30 SNPs occurred in the IRa and IRb regions.
- Only 12 of the 130 InDels were consistent among the three poplar clones and located in the pair of IR regions of the cpDNAs.
- All of the variants that were inconsistent among I45, I69 and NL895 were located in the LSC (93) and SSC (4) regions, but not in the IRa and IRb regions.
- The SNPs for each dataset of the three poplar clones I45, I69 and NL895 were called by read alignment to the P.
- predicted reference alleles of the three poplar clones were different.
- In total, 299 variant loci, approximately three-quarters of all the loci, were located in the LSC region of the poplar chloroplast genome.
- Of the 307 identical variants, 42 were situated in the IR pair (IRa and IRb) of the poplar chloroplast genome.
- None of the variant loci with different genotypes were located in the IRa and IRb regions.
- The overall GC content of the IRa and IRb regions (41.97%) in the poplar cpDNAs was higher than that of the LSC (34.47%) and SSC (30.54%) regions.
- Of the 307 variant loci having the same genotype among the three poplar clones, most were located in the non-coding regions, such as introns, and upstream and downstream sequences, of the poplar chloroplast.
- Only 65 loci with identical genotypes, 30 silent (synonymous) and 35 missense mutations, were located in the coding regions or exons of the genes.
- For example, nine of these loci were located in the exon of the RNA polymerase (rpo) gene family, consisting of rpoA, rpoB, rpoC1 and rpoC2, in the poplar chloroplast genome.
- The vast majority of the other 94 variations with differences in genotypes between the three poplar clones were located in the non-coding region of the poplar chloroplast genome.
- The genotypes of the loci identified by genome sequence comparison strategy were consistent with those identified by the read mapping strategy.
- Thus, the offspring cpDNAs were derived from the female parent in species of the genus Populus.
- A high-quality assembly of the chloroplast genome is essential for comparing chloroplast genomes between closely related species or individuals, especially for genetic relationships between parent and offspring or between full-sibs.
- The length of the reads generated on Illumina sequencing platforms did not appear to affect the assembly of the chloroplast genomes for the three poplar clones, and this was supported by the resulting assemblies of both the simulated Illumina-like reads and the real Illumina reads.
- The cpDNA ratios of the three poplar clones, NL895, I45 and I69, were and 8.82%, respectively.
- The cpDNA ratios of the Sanger sequencing data for two rice cultivars (PA64S and 93–11) are less than 2.3% [12].
- Compared with the other five assemblers, ABySS, IDBA and SPAdes provided relatively superior results in terms of the metrics used in the assessment of chloroplast genome assemblies.
- We applied the contig integrator CISA to merge de novo assemblies of multiple datasets for each of the clones.
- The quality levels of these cpDNAs were slightly improved, particularly in the accuracy and continuity of the resulting assemblies.
- trichocarpa cpDNA as a reference to extract almost all of the chloroplast reads from the whole-genome shotgun reads of the three clones.
- Nevertheless, the construction of the chloroplast genomes for the three poplar clones revealed that it was still difficult to construct high-quality poplar chloroplast genomes using only the isolated cpDNA reads from single-pass sequencing data.
- In the study, we used an entire chloroplast genome comparison to preliminarily determine the pattern of the spontaneous mutations in the chloroplast genome of a poplar hybrid F1 generation.
- The six poplar species belong to four sections of the genus Populus, sect.
- Mapping the short reads against cpDNAs for each of the six Populus species was performed using BWA (v0.7.12) and Bowtie (v1.1.1) with default parameters [23, 24].
- To evaluate the feasibility of the reference-assisted strategy for the poplar chloroplast genome assembly, we utilized the reads simulator wgsim (v0.3.0, https://.
- The duplicates of the simulated paired reads were identified and discarded with FastUniq (v1.1) [25].
- and NL895, which is a genotype from the F1 progeny of the interspecific hybrids between I69 (female parent) and I45 (male parent).
- The whole-genome shotgun sequencing of the three Populus clones was performed at the Chinese National Human Genome Center at Shanghai (http://www.chgc.sh.cn/).
- trichocarpa chloroplast genome using both Bowtie and BWA were screened and integrated as the cpDNA reads for the chloroplast genome assembly of the three poplar clones.
- First, a de novo assembly strategy was employed to make the initial assembly of the selected cpDNA reads.
- A visual inspection of the cpDNA assemblies was conducted on the integrative genomics viewer IGV (v .
- The quality of the constructed cpDNAs for the three clones was assessed by QUAST with the P.
- BLAST algorithm-based searches against annotated genes from six poplar reference cpDNAs were performed to detect potential genes in the cpDNAs of the three clones.
- SNPs and InDels (Insertions and Deletions) were detected directly from each of the 47 datasets by SAMtools (v1.2) and BCFtools (v .
- The annotation of the called variant loci was performed using SnpEff with the P.
- De novo assembly and characterization of the complete chloroplast genome of radish (Raphanus sativus L.
- Estimation of the spontaneous mutation rate per nucleotide site in a Drosophila melanogaster full-sib family
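
The summary compares assemblies by N50, totalSum (the total base of contigs) and mean (the mean length of contigs). As a minimal sketch of how these three metrics can be computed from a list of contig lengths, the following uses the standard N50 definition (the length of the contig at which the running sum of lengths, taken from longest to shortest, first reaches half of the total). The function name `contig_metrics` and the example contig lengths are hypothetical and are not taken from the study; the study itself used QUAST for assessment.

```python
def contig_metrics(lengths):
    """Return totalSum, mean and N50 for a list of contig lengths (in bp).

    Hypothetical helper for illustration; not the authors' implementation.
    """
    if not lengths:
        raise ValueError("at least one contig length is required")

    # Sort from longest to shortest so the running sum reaches 50% quickly.
    ordered = sorted(lengths, reverse=True)
    total_sum = sum(ordered)
    mean = total_sum / len(ordered)

    running = 0
    n50 = ordered[-1]
    for length in ordered:
        running += length
        # N50: first contig at which the running sum covers half of totalSum.
        if running * 2 >= total_sum:
            n50 = length
            break

    return {"totalSum": total_sum, "mean": mean, "N50": n50}


# Made-up contig lengths, chosen only to illustrate the calculation.
example_lengths = [85000, 27000, 17000, 400, 300]
metrics = contig_metrics(example_lengths)
print(metrics)
```

A totalSum far above the expected genome size, as reported for the Velvet assemblies, would show up here directly in the `totalSum` field, while many short contigs pull down both `mean` and `N50`.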

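Two statements in the summary lend themselves to a small worked example: that part of the IRb sequence was the complete reverse complement of its IRa counterpart, and that GC content differs between regions. The sketch below shows one way to check both properties on plain sequence strings. The function names and the toy sequences are assumptions made for illustration; they are not the study's data or tooling.

```python
# Translation table for complementing DNA bases (upper and lower case).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def is_inverted_repeat(ira, irb):
    """True if irb is exactly the reverse complement of ira (case-insensitive)."""
    return ira.upper() == reverse_complement(irb).upper()


def gc_content(seq):
    """Return the GC percentage, counting only unambiguous A/C/G/T bases."""
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    return 100.0 * (seq.count("G") + seq.count("C")) / counted


# Toy sequences standing in for IRa and IRb; real IR regions are far longer.
toy_ira = "ATGGCA"
toy_irb = reverse_complement(toy_ira)

print(is_inverted_repeat(toy_ira, toy_irb))
print(round(gc_content(toy_ira), 2))
```

Because IRa and IRb are reverse complements of each other, their GC content is identical by construction, which is consistent with the summary reporting a single GC value for the IR pair.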