
Enhancement of de novo sequencing, assembly and annotation of the Mongolian gerbil genome with transcriptome sequencing and assembly from several different tissues



- Background: The Mongolian gerbil (Meriones unguiculatus) has historically been used as a model organism for the auditory and visual systems, stroke/ischemia, epilepsy and aging-related research since 1935, when laboratory gerbils were separated from their wild counterparts.
- Results: The genome was sequenced using Illumina HiSeq 2000 and after assembly resulted in a final genome size of 2.54 Gbp with contig and scaffold N50 values of 31.4 Kbp and 500.0 Kbp, respectively.
- Based on the k-mer estimated genome size of 2.48 Gbp, the assembly appears to be complete.
- The genome annotation was supported by transcriptome data that identified 31,769 (>2000 bp) predicted protein-coding genes across 27 tissue samples.
- A BUSCO search of 3023 mammalian groups resulted in 86% of curated single copy orthologs present among predicted genes, indicating a high level of completeness of the genome.
- Conclusions: We report the first de novo assembly of the Mongolian gerbil genome enhanced by assembly of transcriptome data from several tissues.
- Sequencing of this genome and transcriptome increases the utility of the gerbil as a model organism, opening the availability of now widely used genetic tools.
- The Mongolian gerbil is a small rodent that is native to Mongolia, southern Russia, and northern China.
- Laboratory gerbils used as model organisms originated from 20 founders captured in Mongolia in 1935 [1].
- Gerbils have been used as model organisms for sensory systems (visual and auditory) and pathologies (aging, epilepsy, irritable bowel syndrome and stroke/ischemia).
- The gerbil’s.
- In addition to the auditory system, the gerbil has also been used as a model for the visual system because gerbils are diurnal and therefore have more cone receptors than mice or rats, making them a closer model to the human visual system [3].
- The gerbil has also been used as a model for aging due to its ease of handling, prevalence of tumors, and experimental stroke manipulability [1, 4].
- Interestingly, the gerbil has been used as a model for stroke and ischemia because of variations in the blood supply to the brain arising from an anatomical region known as the “Circle of Willis” [5].
- Open Access (2019). This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
- The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated..
- In addition, the gerbil is a model for epileptic activity as a result of its natural minor and major seizure propensity when exposed to novel stimuli [6, 7].
- Lastly, the gerbil has been used as a model for inflammatory bowel disease, colitis, and gastritis due to the similarity in the pathology of these diseases between humans and gerbils [8, 9].
- Despite its usefulness as a model for all these systems and medical conditions, the utility of the gerbil as a model organism has been limited by the lack of a sequenced genome to manipulate.
- Here we describe a de novo assembly and annotation of the Mongolian gerbil genome and transcriptome.
- Recently, a separate group sequenced the gerbil genome; however, our work is further supported by comparisons with an in-depth transcriptome analysis, which was not performed by the previous group [10].
- RNA-seq data were produced from 27 tissues that were used in the genome annotation and deposited in the China National GeneBank CNSA repository under the project CNP0000340 and NCBI Bioproject # SRP198569, SRA887264, PRJNA543000.
- The genome annotation data are available through Figshare, https://figshare.com/articles/Mongolian_gerbil_genome_annotation/9978788.
- These data provide a draft genome sequence to facilitate the continued use of the Mongolian gerbil as a model organism and to help broaden the genetic rodent models available to researchers.
- Insert library sequencing generated a total of 322.13 Gb in raw data, from which a total of 287.4 Gb of ‘clean’.
- Genome assembly.
- The gerbil genome was estimated to be approximately 2.48 Gbp using a k-mer-based approach.
- The final assembly had a total length of 2.54 Gb and comprised 31,769 scaffolds assembled from 114,522 contigs.
- Given the genome size estimate of 2.48 Gbp, genome coverage by the final assembly was likely complete and is consistent with the previously published gerbil genome, which had a total length of 2.62 Gbp [10].
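As a rough illustration of the k-mer-based estimate and the completeness comparison described above, the following minimal sketch uses the common approximation that genome size is the total number of counted k-mers divided by the depth at the main peak of the k-mer frequency histogram. The k-mer count and peak depth below are hypothetical values chosen only to reproduce a 2.48 Gbp estimate; they are not taken from the article, and the authors' exact procedure may differ.

```python
def estimate_genome_size(total_kmers: int, peak_depth: float) -> float:
    """Approximate genome size (bp) as total k-mers / main-peak k-mer depth."""
    if peak_depth <= 0:
        raise ValueError("peak_depth must be positive")
    return total_kmers / peak_depth


def assembly_to_estimate_ratio(assembly_bp: float, estimated_bp: float) -> float:
    """Ratio of assembled length to the k-mer-estimated genome size."""
    return assembly_bp / estimated_bp


# Hypothetical inputs (assumptions for illustration, not from the article).
hypothetical_total_kmers = 62_000_000_000
hypothetical_peak_depth = 25.0
estimated = estimate_genome_size(hypothetical_total_kmers, hypothetical_peak_depth)

# Values reported in the text: 2.54 Gbp assembly, 2.48 Gbp k-mer estimate.
ratio = assembly_to_estimate_ratio(2.54e9, 2.48e9)
print(f"estimated size: {estimated / 1e9:.2f} Gbp")
print(f"assembly / estimate: {ratio:.3f}")
```

A ratio close to 1 is what the text relies on when it describes the assembly as likely complete.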
- Completeness of the genome assembly was confirmed by successful mapping of the RNA-seq assembly back to the genome, showing that 98% of the RNA-seq sequences can be mapped to the genome with >.
- In addition, 91% of the RNA-seq sequences can be mapped to the genome with >.
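The contig and scaffold N50 values reported for this assembly follow a standard definition: N50 is the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly length. The sketch below shows that calculation on a small made-up list of lengths; the function name and the example lengths are illustrative assumptions, not data from the article.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of contig or scaffold lengths."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one length is required")
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first length at which the running sum reaches half is the N50.
        if running >= half_total:
            return length
    return ordered[-1]  # Unreachable for positive lengths; kept for safety.


# Made-up lengths for illustration only.
example_lengths = [2, 2, 2, 3, 3, 4, 8, 8]
print(n50(example_lengths))
```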
- Gene expression data were produced to aid in the genome annotation process.
- Transcriptome sequencing from the 27 tissues generated 131,845 sequences with a total length of bp.
- The RNA-seq assembly resulted in 19,737 protein-coding genes with a total length of 29.4 Mbp, which is available in the China National GeneBank CNSA repository, Accession ID:.
- The transcriptome data were also used to support the annotation and gene predictions as outlined below in the methods section (Tables 5 and 6).
- Genome annotation.
- Repeat element identification approaches classified a total length of 1016.7 Mbp of the M. unguiculatus genome as repetitive, accounting for 40.0% of the entire genome assembly.
- A total of 22,998 protein-coding genes were predicted from the genome and transcriptome with an average transcript length of 23,846.58 bp.
- There was an average of 7.76 exons per gene, with an average exon length of 197.9 bp and an average intron length of 3300.83 bp (Table 5).
- Table 1 Global statistics of the Mongolian gerbil genome (columns: Type, Length (Kb), Percentage of the genome).
- The 22,998 protein-coding genes were aligned to several protein databases, along with the RNA sequences, to identify their possible function, which resulted in 20,760 protein-coding genes that had a functional annotation, or 90.3% of the total gene set (Table 6).
- Annotation data are available through Figshare, https://figshare.com/articles/Mongolian_gerbil_genome_annotation/9978788.
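The percentages quoted in this section can be checked directly from the reported counts. The short sketch below recomputes the repeat fraction (1016.7 Mbp of a 2.54 Gbp assembly) and the functionally annotated fraction (20,760 of 22,998 genes); the helper name `percent` is an illustrative assumption, and the inputs are the figures stated in the text.

```python
def percent(part: float, whole: float) -> float:
    """Return part as a percentage of whole."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return 100.0 * part / whole


# Repetitive sequence: 1016.7 Mbp out of a 2.54 Gbp assembly.
repeat_pct = percent(1016.7e6, 2.54e9)

# Functionally annotated genes: 20,760 out of 22,998 predicted genes.
annotated_pct = percent(20_760, 22_998)

print(f"repetitive: {repeat_pct:.1f}%")
print(f"functionally annotated: {annotated_pct:.1f}%")
```

Both values agree with the rounded figures given in the text (40.0% and 90.3%).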
- In this study, we show a complete sequencing, assembly, and annotation of the Mongolian gerbil genome and transcriptome.
- This is not the first paper to sequence the Mongolian gerbil; however, our results are consistent with the previously published assembly (a genome size of 2.62 Gbp compared with our 2.54 Gbp) [10] and are further enhanced by transcriptomic analysis.
- The gerbil genome consists of 40% repetitive sequences, which is consistent with the mouse [11] and rat [12] genomes.
- In addition to measuring standard assembly quality metrics, genome assembly and annotation quality were further assessed by comparison with closely related species, gene family construction, evaluation of housekeeping genes, and Benchmarking Universal Single-Copy Orthologs (BUSCO) search.
- Of the 3023 mammalian groups searched through BUSCO, 86% complete BUSCO groups were detected in the final gene set.
- The presence of 86% complete mammalian BUSCO gene groups suggests a high level of completeness of this gerbil genome assembly.
- A BUSCO search was also performed for the gerbil transcriptome data, resulting in detection of 82% complete BUSCO groups in the final transcriptome dataset (Table 4).
- The average CDS length in the gerbil genome was 1535 bp, similar to mouse (1465 bp) and rat (1337 bp) (Table 5).
- The gerbil genome contained an average of 7.76 exons per gene that were on average 197.9 bp in length, similar to mouse (8.02 exons per gene averaging 182.61 bp in length) and rat (7.42 exons per gene averaging.
- Table 3 Genome annotation comparisons with other model organisms (protein-coding genes; M. unguiculatus, Mongolian gerbil).
- The average intron length in the gerbil genome was 3300.83 bp, similar to the 3632.46 bp in mouse and 3455.8 bp in rat (Table 5).
- Based on the results from the quality metrics described above, we are confident of the quality of the data for this assembly of the gerbil genome and transcriptome.
- In summary, we report a fully annotated Mongolian gerbil genome sequence assembly enhanced by transcriptome data from several different gerbils and tissues.
- The gerbil genome and transcriptome add to the availability of alternative rodent models that may be better models for diseases than rats or mice.
- Additionally, the gerbil is an interesting comparative rodent model to mouse and rat since it has many traits in common, but also differs in seizure susceptibility, low-frequency hearing, cone visual processing, stroke/ischemia susceptibility, gut disorders and aging.
- Sequencing of the gerbil genome and transcriptome opens these areas to molecular manipulation in the gerbil and therefore better models for specific disease states.
- High-quality reads were used for genome assembly using the SOAPdenovo (version 2.04) package.
- The tissues were collected after the animals were euthanized with isoflurane (followed by decapitation) and stored on
- Table 4 Completeness of gerbil genome and transcriptome.
- Table 5 General statistics of predicted protein-coding genes.
- Quality of the RNA assembly was assessed by filtering RNA-seq reads using SOAPnuke (v1.5.2 parameters: “-l 10 -q 0.1 -p 50 -n 0.05 -t 5,5,5,5”) followed by mapping of clean reads to the assembled genome using HISAT2 (v2.0.4) and StringTie (v1.3.0).
- The initial assembled transcripts were then filtered using CD-HIT (v4.6.1) with a sequence identity threshold of 0.9, followed by a homology search (human, rat, mouse proteins) and TransDecoder (v2.0.1) open reading frame (ORF) prediction.
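To make the idea of open reading frame prediction concrete, here is a deliberately simplified sketch that scans the three forward reading frames of a transcript for the longest stretch running from an ATG start codon to the first in-frame stop codon. This is a toy stand-in written for illustration only: it is not TransDecoder's algorithm (which also applies minimum-length and coding-likelihood criteria and considers both strands), and the function name and example sequence are assumptions.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_forward_orf(sequence: str) -> str:
    """Return the longest ATG-to-stop ORF in the three forward frames."""
    seq = sequence.upper()
    best = ""
    for frame in range(3):
        start = None  # Index of the current open ATG, if any.
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                orf = seq[start:i + 3]  # Include the stop codon.
                if len(orf) > len(best):
                    best = orf
                start = None
    return best


# Made-up transcript fragment for illustration only.
example = "CCATGAAATTTTAGGG"
print(longest_forward_orf(example))
```

ORFs that run off the end of the sequence without a stop codon are ignored in this sketch, which is one of several simplifications compared with a production tool.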
- Genomic repeat elements of the genome assembly were also identified and annotated using RepeatMasker (v4.0.5, RRID:SCR) and the RepBase library (v20.04) [15].
- after repetitive sequences in the genome were masked using known repeat information detected by RepeatMasker and.
- Homology searching was performed using protein data from Homo sapiens (human), Mus musculus (mouse), and Rattus norvegicus (rat) from Ensembl (v80) aligned to the masked genome using BLAT.
- Using the InterProScan results, we obtained the annotations of the gene products from the Gene Ontology database.
- Genome assembly and annotation quality were further assessed by comparison with closely related species, gene family construction, evaluation of housekeeping genes, and Benchmarking Universal Single-Copy Orthologs (BUSCO) search.
- elieis/HKG/) and extracted corresponding protein sequences to align to the gerbil genome using blastp (v.2.2.26).
- RNA-seq: High-throughput messenger RNA sequencing.
- Table 6 Functional annotation of the final gene set.
- SC, YF, YZ, WX, HW, XL, and XX performed the analysis and annotation of the genome and transcriptome.
- The funding body played no role in the design of the study and collection, analysis, and interpretation of data and in writing the manuscript.
- Genome annotation results are available at the China National GeneBank CNSA repository, Accession id: CNP0000340, and supporting materials, which include transcripts and genome assembly, are available under the same project (available upon acceptance of the manuscript).
- Genome annotation, https://figshare.com/articles/Mongolian_gerbil_genome_annotation/9978788.
- The gerbil: a unique model for research on aging.
- Hearing sensitivity of the mongolian gerbil, Meriones unguiculatis.
- Cones in the retina of the Mongolian gerbil, Meriones unguiculatus: an immunocytochemical and electrophysiological study.
- The Mongolian gerbil in aging research.
- The Mongolian gerbil in experimental epilepsy.
- The Mongolian gerbil as a model for inflammatory bowel disease.
- De novo sequencing and initial annotation of the Mongolian gerbil (Meriones unguiculatus) genome.
- Genome sequence of the Brown Norway rat yields insights into mammalian evolution
