
De novo assembly of the olive fruit fly (Bactrocera oleae) genome with linked-reads and long-read technologies minimizes gaps and provides exceptional Y chromosome assembly


Summary

- De novo assembly of the olive fruit fly (Bactrocera oleae) genome with linked-reads and long-read technologies.
- Full list of author information is available at the end of the article.
- We provide an extensive RNA-seq data set, and genome annotation, critical towards gaining an insight into the biology of the olive fruit fly.
- In addition, elucidation of Y-chromosome sequences will advance our understanding of the Y chromosome's organization, function and evolution and is poised to provide avenues for sterile insect technique approaches.
- More important insect genomes, like that of the malaria mosquito Anopheles gambiae, followed soon after [2].
- The olive fruit fly belongs to the Tephritidae family of insects, a family that contains some of the most important agricultural pests world-wide, such as the Mediterranean fruit fly (medfly, Ceratitis capitata), the oriental fruit fly (Bactrocera dorsalis), the Mexican fruit fly (Anastrepha ludens), the Australian Queensland fruit fly (Bactrocera tryoni) and others.
- Despite its economic importance in olive producing countries, several peculiarities of the olive fruit fly’s biology (e.g., difficulty in rearing, high natural homozygosity, lack of phenotypic mutations) made the development of classical genetics tools an impossible task.
- Another particularity of the olive fruit fly is the fact that it possesses a very small Y chromosome [30, 31], karyotypically appearing as the ~4 Mb dot chromosome IV of D.
- For example, 80% of the Drosophila melanogaster Y chromosome is made up of repeats [33].
- of them are characterized by the presence of small exons, gigantic introns, and very little conservation among species even of the same family [35].
- The M factor is the initial switch of the sex-determining cascade in tephritids, a switch that has been speculated to differ from the one used by the model dipteran Drosophila (for a review see [37]).
- The M factor has recently been identified in the medfly and a few other tephritids, including the olive fruit fly [38], but the details of the sex determination cascade remain unclear.
- Here, we describe the whole genome sequence of the olive fruit fly, generated as a hybrid assembly using the 10x Genomics linked-reads assembly as the backbone, followed by scaffolding and gap-filling with Illumina mate-pair reads and long reads from PacBio and ONT.
- This genome has a scaffold N50 of 4.69 Mb and an L50 of 30, making it one of the most contiguous Tephritidae genomes in the current NCBI genome catalogue.
- We also identified Y chromosome-specific scaffolds and present the first assembly of the B. oleae Y chromosome that will be instrumental in the elucidation of the regulation of the M factor and the structure and evolution of the entire Y chromosome.
- In order to generate a high-quality genome assembly of the olive fruit fly we undertook a multistep process that consisted of different sequencing and assembly approaches (Fig.
- The final steps involved scaffolding and gap-closing of the 10x assembly using mate-pair and long-reads and then finally polishing to generate the final assembly (GenBank accession GCA .
- ‘Demokritos’ strain of the olive fruit fly which has been a.
- capitata genome [44] that required inbreeding of the ISPRA strain for 20 generations which resulted in low heterozygosity (0.391.
- 45) and, probably, other reasons that have to do with the biology of the insect (e.g., strict monophagy of the larva).
- Fig. 1: Schematic of the method used to generate the different assemblies.
- N50 value is the scaffold/contig length at which half of the genome is contained in scaffolds/contigs at or above that length.
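The N50 definition above, together with the L50 count quoted earlier, can be sketched in a few lines. This is a minimal illustration, not the tool used by the authors; the function name `n50_l50` and the optional `genome_size` argument are assumptions introduced here. Passing an estimated genome size switches the target from half of the assembly length to half of the genome, which corresponds to NG50.

```python
def n50_l50(lengths, genome_size=None):
    """Return (N50, L50) for a list of scaffold or contig lengths.

    N50 is the length of the shortest sequence in the smallest set of
    longest sequences that together cover at least half of the total.
    L50 is the number of sequences in that set.
    If genome_size is given (hypothetical parameter), half of that value
    is the target instead, which yields NG50 and its sequence count.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    total = genome_size if genome_size is not None else sum(lengths)
    running = 0
    # Walk from the longest sequence downwards, accumulating length.
    for count, length in enumerate(sorted(lengths, reverse=True), start=1):
        running += length
        if 2 * running >= total:
            return length, count
    raise ValueError("sequences cover less than half of genome_size")


# Toy lengths, not data from the paper.
toy_lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]
print(n50_l50(toy_lengths))
print(n50_l50(toy_lengths, genome_size=100))
```

Sorting once and accumulating keeps the calculation linear after the sort, which is enough even for assemblies with many thousands of scaffolds.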
- Scaffolding and gap-closing of the linked-reads assembly: We explored the effectiveness of combining the 10x-only assembly with short-reads and long-reads.
- The short and long reads enabled scaffolding and gap-closing of the 10x-only assembly (see Supplementary Table S3 and Supplementary Figure S3 for a summary of the results).
- NG50 value is the scaffold/contig length at which half of the genome.
- Using the repeat masked version of the final assembly (GCA which was generated from male olive fruit fly DNA, male and female short Illumina reads (40X coverage of each) were independently mapped.
- When a primer pair resulted in the amplification of the expected size band with male genomic DNA only, we concluded that its corresponding scaffold was Y-specific.
- The olive fruit fly has well-characterized cytogenetic maps derived from polytene chromosomes [52], which enable the determination of the exact position of scaffolds containing specific markers.
- As part of the current work we generated 9 new markers (Supplementary Table S6) and mapped their position on salivary gland polytene chromosomes by in situ hybridization (Fig.
- The remaining 54 tags allowed the physical mapping of 36 contigs with a total length of 200 Mb, corresponding to 41% of the total genome size (Fig.
- The approximate location of the PCR primer on the scaffold/contig is shown in pink.
- Addition of the X and Y chromosome scaffolds totaling 6 and 4 Mb, respectively, that were identified using the CQ method brought the total percentage of the genome assigned to chromosomes to 43% (Fig.
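The two percentages quoted for chromosome assignment can be cross-checked with simple arithmetic. The sketch below only uses the figures stated in the text (200 Mb mapped at 41%, plus 6 Mb of X and 4 Mb of Y scaffolds); the function names are illustrative, and the implied genome size is a back-calculation, not a value reported here.

```python
def implied_genome_size_mb(mapped_mb, mapped_fraction):
    """Genome size implied by a mapped length and the fraction it represents."""
    if not 0 < mapped_fraction <= 1:
        raise ValueError("mapped_fraction must be in (0, 1]")
    return mapped_mb / mapped_fraction


def assigned_fraction(mapped_mb, extra_mb, genome_mb):
    """Fraction of the genome assigned after adding extra scaffolds."""
    return (mapped_mb + extra_mb) / genome_mb


# 200 Mb physically mapped is stated to be 41% of the genome.
genome_mb = implied_genome_size_mb(200.0, 0.41)
# Adding the X (6 Mb) and Y (4 Mb) scaffolds identified by the CQ method.
total = assigned_fraction(200.0, 6.0 + 4.0, genome_mb)
print(f"implied genome size: {genome_mb:.1f} Mb")
print(f"fraction assigned to chromosomes: {total:.4f}")
```

The result is about 0.43, consistent with the 43% quoted in the text.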
- Eukaryota, Arthropoda, Insecta, and Diptera and 98.1% of the genes surveyed were captured in the GCA assembly (Supplementary Figure S9).
- Most of the sequences identified (84.6%) had a length of 100 to 2500 bp.
- Notably, no sequences of the olive fruit fly symbiont Candidatus Erwinia dacicola were identified, which confirms previous reports that this symbiont was lost during laboratory domestication and artificial rearing of this insect pest species [60].
- PASTEC classification of the repeat library (Table 2) showed that Class II TEs were most numerous of all repeat elements (45.
- of the B.
- capitata, TEs constitute 18% of the genome [44].
- In terms of genome coverage, Class II DNA transposons accounted for 16.15% of the genome while Class I retrotransposons accounted for 10% of the genome.
- oleae TEs down to superfamily level using TEannot (Supplementary Table S8) but only 5% of the genome was annotated.
- We performed extensive RNA sequencing of the olive fruit fly.
- RNA-seq data were collected from these tissues and stages since they were used to address other important questions of the B.
- The completeness of the assembly was evaluated by querying Arthropoda, Insecta, and Diptera Benchmarking Universal Single-Copy Orthologs (BUSCOs), of which 99, 98.4, and 94.8%, respectively, are present as complete (Supplementary Table S11), suggesting that the transcriptome captured most genes.
- A more comprehensive protein-coding gene-prediction pipeline, JAMg [78], was used to derive a more complete transcriptome of the olive fruit fly, integrating the RNA-seq datasets as a source of evidence.
- To determine the completeness of the JAMg transcriptome, Diptera BUSCOs were searched.
- Of the 2799 BUSCOs were captured.
- Out of the 16,455 genes had significant hits.
- We used Prot-SpaM [83] to infer pairwise distances of the 19 species.
- A total of 1395 orthogroups were identified that contain a single protein from each of the 6 species and another 7286 orthogroups that had one or more proteins from each species.
- Whole proteomes were used to infer pairwise distances of the 19 species using Prot-SpaM [84].
- See Supplementary Table S16 for sources of the proteomes used.
- We provide a list of the enriched processes in each stage in Supplementary Table S14.
- is one of the most highly contiguous Tephritidae assemblies in the NCBI catalogue (see Supplementary Figure S18 for some comparisons).
- We were able to achieve this because the laboratory strain of the olive fruit fly used for genome sequencing has low heterozygosity due to (i) a low level of natural polymorphism, (ii) a significant bottleneck during colonization, and (iii) a long period of laboratory rearing (over 45 years) without any admixture.
- Gene expression (transcripts per million) was calculated for each of the stages.
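Transcripts per million (TPM) normalizes read counts first by transcript length and then by the library-wide sum of those length-normalized rates. The sketch below shows the standard formula on toy numbers; the text does not say which software produced the TPM values, so the function name and the use of plain transcript length (rather than an effective length) are assumptions.

```python
def transcripts_per_million(counts, lengths_bp):
    """Convert raw read counts to TPM.

    counts      reads assigned to each gene or transcript
    lengths_bp  length of each gene or transcript in base pairs
                (assumption: plain length, not an effective length)
    """
    if len(counts) != len(lengths_bp):
        raise ValueError("counts and lengths_bp must have the same length")
    if any(length <= 0 for length in lengths_bp):
        raise ValueError("lengths must be positive")
    # Step 1: reads per base for every transcript.
    rates = [count / length for count, length in zip(counts, lengths_bp)]
    total_rate = sum(rates)
    if total_rate == 0:
        raise ValueError("no reads assigned to any transcript")
    # Step 2: rescale so that the values of one sample sum to one million.
    return [rate / total_rate * 1_000_000 for rate in rates]


# Toy example, not data from the paper.
tpm = transcripts_per_million([100, 200, 300], [1000, 2000, 1000])
print([round(value) for value in tpm])
```

Because every sample sums to one million, TPM values for the same gene can be compared across stages as relative abundances within each library.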
- Coordinates of the individual genes on the first 2 principal components (circle of correlation) are shown as black dots.
- The assembly presented here will enormously boost the understanding of the olive fruit fly’s biology and genome.
- The highly repetitive nature of the Y chromosome makes it the most challenging to assemble in genome sequencing efforts.
- Gene expression (transcripts per million) was calculated for each of the 4 metamorphotic stages.
- Breeding of the insects.
- To generate an ONT-based assembly, ONT sequence reads were used for de novo genome assembly of the olive fruit fly using Canu [121].
- of the olive fruit fly to reference genomes of Wolbachia, Spiroplasma, and Cardinium using MIRA v4.0 and bowtie2.
- These sequences were used as a custom BLAST database in order to identify bacterial sequences that have been filtered into the assembly of the B.
- We aligned 40x coverage of male and female reads to a hard-masked version of the assembly and, for each set, we calculated the depth at each base for all scaffolds.
- The recombinant plasmid DNA was finally isolated with the use of the Promega Wizard Plus Minipreps DNA Purification System according to the supplier’s instructions.
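The chromosome quotient (CQ) idea behind this step is a ratio of female to male read depth per scaffold: with equal coverage of each sex, Y-linked sequence receives almost no female reads (CQ near 0), autosomal sequence gives a ratio near 1, and X-linked sequence gives a ratio near 2. The following is a minimal sketch under stated assumptions; the function names, the cutoffs `y_max=0.3` and `x_min=1.6`, and the toy depths are illustrative and are not taken from the paper.

```python
from statistics import mean


def chromosome_quotient(female_depths, male_depths):
    """Mean female depth divided by mean male depth for one scaffold.

    Both inputs are per-base depths over the unmasked bases of a scaffold.
    Returns None when the scaffold has no male coverage.
    """
    male_mean = mean(male_depths)
    if male_mean == 0:
        return None
    return mean(female_depths) / male_mean


def classify_scaffold(cq, y_max=0.3, x_min=1.6):
    """Label a scaffold from its CQ value (cutoffs are hypothetical)."""
    if cq is None:
        return "unassigned"
    if cq <= y_max:
        return "Y-candidate"
    if cq >= x_min:
        return "X-candidate"
    return "autosomal-like"


# Toy per-base depths (female, male); not data from the paper.
scaffolds = {
    "scaffold_A": ([0, 0, 1, 0], [20, 22, 19, 21]),
    "scaffold_B": ([40, 42, 38, 40], [20, 21, 19, 20]),
    "scaffold_C": ([40, 40], [40, 40]),
}
for name, (female, male) in scaffolds.items():
    cq = chromosome_quotient(female, male)
    print(name, round(cq, 3), classify_scaffold(cq))
```

A depth ratio only nominates candidates; as described in the text, Y-specificity was then confirmed by PCR amplification from male genomic DNA only.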
- Principal component analysis and hierarchical clustering: Gene expression (transcripts per million, TPM) was calculated for each of the 4 metamorphotic stages.
- Clusters of Genes in clusters that peak at either of the 4 metamorphotic stages.
- Comparison of assembly quality of the 3 main assemblies.
- Comparison of assembly quality of the 3 Y chromosome assemblies.
- Sequences of the primers used for the validation of Y chromosome-specific scaffolds.
- Distribution of the B.
- Summary of the Trinity transcriptome generated from sequencing all the tissues in.
- Assessment of the completeness of the Trinity de novo transcriptome assembly.
- Schematic of the method used to generate the main assembly reported.
- Schematic of the PiRATE pipeline.
- N50: Scaffold/contig length at which 50% of the total genome length is contained in scaffolds/contigs of that size or longer when all scaffolds/.
- All authors reviewed and approved the final version of the manuscript.
- Part of the sequencing cost of this research was also supported by the “ARISTEIA” (MIS-524938) Action of the “Operational Programme Education and Lifelong Learning.
- Further support was provided by the two postgraduate programs of the Department of Biochemistry and Biotechnology of the University of Thessaly.
- This research is co-financed by Greece and the European Union (European Social Fund-ESF) through the Operational Programme «Human Resources Development, Education and Lifelong Learning» in the context of the project.
- The genome sequence of the malaria mosquito Anopheles gambiae.
- Genome assembly and annotation of the Trichoplusia ni Tni-FNL insect cell line enabled by long-read technologies.
- Analysis of the olive fruit fly Bactrocera oleae transcriptome and phylogenetic classification of the major detoxification gene families.
- The molecular biology of the olive fly comes of age.
- Interchromosomal duplications on the Bactrocera oleae Y chromosome imply a distinct evolutionary origin of the sex chromosomes compared to Drosophila.
- Identification of the sex-determining region of the Ceratitis capitata Y chromosome by deletion mapping.
- The whole genome sequence of the Mediterranean fruit fly, Ceratitis capitata (Wiedemann), reveals insights into the biology and adaptive evolution of a highly invasive pest species.
- The Bactrocera oleae genome: localization of nine genes on the polytene chromosomes of the olive fruit fly (Diptera: Tephritidae).
- Acetobacter tropicalis is a major symbiont of the olive fruit fly (Bactrocera oleae).
- The transposable elements of the Drosophila melanogaster euchromatin: a genomics perspective.
- Assembly of the Tc1 and mariner transposition initiation complexes depends on the origins of their transposase DNA binding domains.
- The mitochondrial genome of the olive fly Bactrocera oleae: two haplotypes from distant geographical locations.
- Embryonic development of the olive fruit fly, Bactrocera oleae Rossi (Diptera: Tephritidae), in vivo.
- Comprehensive mapping of long-range interactions reveals folding principles of the human genome.
- Dovetail Genomics: Overview of the Dovetail™ De Novo Assembly Process [https://dovetailgenomics.com/ga_tech_overview/].
- Cytological evidence on the phylogeny and classification of the Diptera.
- Identification of Y-chromosome scaffolds of the Queensland fruit fly reveals a duplicated gyf gene paralogue common to many Bactrocera pest species.
- Heterochromatin-Enriched Assemblies Reveal the Sequence and Organization of the Drosophila
