
Correction for both common and rare cell types in blood is important to identify genes that correlate with age



- Whole blood gene expression signatures have been associated with aging and have been used to gain information on its biological mechanisms, which are still not fully understood.
- As a result, previously observed associations between gene expression levels and aging might be driven by cell type composition rather than intracellular aging mechanisms.
- Both models were applied to whole blood gene expression data from 3165 individuals belonging to the general population (age range of 18 – 81 years).
- The new model fit the data better and identified fewer genes associated with aging (625, compared to the 2808 of the initial model).
- 18% of the 2808 genes identified by the initial model were found using both models, indicating that the other previously reported genes could be proxies for less abundant cell types.
- In particular, functional enrichment of the genes identified by the new model highlighted pathways and GO terms specifically associated with platelet activity.
- Conclusions: We conclude that gene expression analyses in blood strongly benefit from correction for both common and rare blood cell types, and recommend using blood-cell count estimates as standard covariates when studying whole blood gene expression.
- Aging, defined as a time-dependent process characterized by physical and cognitive decline, is one of the main risk factors for autoimmune diseases, neurodegenerative diseases, cancer and diabetes [1, 2].
- To better understand this process on a molecular level, changes in gene expression during aging have been previously studied in whole blood [3, 4].
- Since the proportions of these cell populations vary with age [6–9], it is necessary to correct for cell counts when using gene expression from blood.
- Indeed, uncorrected gene expression data from whole blood has been shown before to be biased by the gene expression pattern of the most abundant cell type at the moment of sampling [10].
- Improved cell correction is necessary to identify cell-independent gene expression patterns.
- We performed an association analysis of gene expression changes with age using data from four Dutch cohorts (Table S1).
- Gene expression was related to age and selected covariates depending on the regression model applied (Initial or Extended).
- Genes significantly associated with age were retrieved by applying Bonferroni correction ( P and gene lists obtained were compared to establish the efficiency of the models and analyzed to get insights on the process of aging.
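
The Bonferroni step can be illustrated with a minimal sketch. The family-wise level `alpha = 0.05`, the function name and the toy p-values are assumptions for illustration only; the threshold used in the study is not reproduced in this summary.

```python
def bonferroni_significant(p_values, alpha=0.05):
    """Return the genes whose p-value passes the Bonferroni-adjusted threshold.

    p_values: dict mapping gene name -> p-value of the age coefficient.
    alpha:    family-wise error level (assumed value, not taken from the source).
    """
    n_tests = len(p_values)
    if n_tests == 0:
        return []
    threshold = alpha / n_tests  # Bonferroni: divide alpha by the number of tests
    return sorted(gene for gene, p in p_values.items() if p < threshold)


# Toy p-values for four hypothetical genes; the threshold is 0.05 / 4 = 0.0125.
example = {"GENE_A": 1e-6, "GENE_B": 0.02, "GENE_C": 0.0001, "GENE_D": 0.5}
significant = bonferroni_significant(example)
```

Applying the same function to the p-values of each model would yield two gene lists whose overlap can then be compared.
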
- This decrease was expected, as many of the results from the IM may have been driven by the composition of less prominent cell types that were included in our EM.
- indicates that correcting for cell populations identifies common whole blood gene expression patterns.
- To this end, we analyzed the mean squared errors (MSE), the distribution of gene expression residuals and their homoscedasticity after applying the IM and EM.
- As expected, MSE values of the regressions decreased when applying the EM (total EM median MSE value: 0.267, total IM median MSE value: 0.334) (Fig.
- We observed that the absolute correlations were smallest in the EM (EM median value: 9 × 10⁻³.
- Adding cell counts clearly improves the prediction of gene expression values.
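
A minimal sketch of this kind of model comparison on synthetic data, assuming ordinary least squares and using NumPy only. The variable names, effect sizes and the choice of one common and one rare cell-type covariate are hypothetical; the actual covariates of the IM and EM are not listed in this summary.

```python
import numpy as np


def ols_residuals(y, covariates):
    """Fit y ~ intercept + covariates by least squares and return the residuals."""
    X = np.column_stack([np.ones(len(y)), covariates])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ beta


def spearman_rho(a, b):
    """Spearman correlation as the Pearson correlation of ranks (assumes no ties)."""
    rank_a = np.argsort(np.argsort(a))
    rank_b = np.argsort(np.argsort(b))
    return float(np.corrcoef(rank_a, rank_b)[0, 1])


# Synthetic data (hypothetical): one gene whose expression tracks a rare cell type
# whose proportion drifts with age.
rng = np.random.default_rng(0)
n = 500
age = rng.uniform(18, 81, n)
common_cells = rng.normal(0.6, 0.05, n)
rare_cells = 0.01 + 0.0005 * age + rng.normal(0, 0.01, n)
expression = 2.0 * common_cells + 20.0 * rare_cells + rng.normal(0, 0.3, n)

# "Initial" model: age + common cell type. "Extended" model: adds the rare cell type.
res_initial = ols_residuals(expression, np.column_stack([age, common_cells]))
res_extended = ols_residuals(
    expression, np.column_stack([age, common_cells, rare_cells])
)

mse_initial = float(np.mean(res_initial ** 2))
mse_extended = float(np.mean(res_extended ** 2))
abs_rho_initial = abs(spearman_rho(res_initial, age))
abs_rho_extended = abs(spearman_rho(res_extended, age))
```

Because the extended model nests the initial one, its in-sample least-squares MSE cannot be larger, so the residual distribution and the residual-age correlation are useful complementary checks.
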
- Single-cell RNA-seq data reveals the contribution of cell types to gene expression during aging.
- Every cell type has its own gene expression pattern, so the composition of blood cells influences the total gene expression observed in whole blood RNA-seq data.
- As shown in the t-SNE plots (Fig. 3A and B), aging-related genes retrieved through the IM tend to be expressed in specific parts of the t-SNE plot that match cell types, while EM genes maintain a lower and more stable expression across cell types from donors with a wide age range (Wilcoxon test, P Fig.
- In addition, we observed that the mean expression range for the EM genes was always larger, highlighting a higher gene expression variation (mean expression.
- Fig. 2 Gene expression residuals decrease with the EM.
- Distributions of gene expression residuals are shown for all genes in (C) and for the shared genes significantly associated with aging in (D), after applying the IM and EM.
- Homoscedasticity was evaluated by correlating gene expression residuals from every model with age, and the absolute Spearman ⍴ values obtained after meta-analysis are reported for all genes (E) and the shared genes significantly associated with aging (F).
- b) The mean expression value of aging-related genes from the initial model (IM, left) and the extended model (EM, right) is plotted in the single cells.
- c) The age distribution of the scRNA-seq donors.
- In d), the distributions of the mean gene expression for IM- and EM-related genes across every cell in the t-SNE plots from b) are reported.
- e) The distributions of the coefficient of variation are presented for both the IM and EM.
- As 82% of the EM genes were also present in the IM list, we expected comparable functional enrichments.
- For example, changes in GO biological processes ascribable to the regulation of gene expression were downregulated (e.g.
- Fig. 4 Heatmap of gene expression residual correlations for EM upregulated aging-related genes.
- Heatmap showing upregulated EM aging-related genes clustered based on the pairwise correlations of their gene expression residuals.
- the gene expression residuals.
- While most clusters did not show a clear enrichment, cluster 1 of the upregulated EM aging-related genes (Fig.
- The EM enrichment result of platelet activity was independent of the measured number of platelets.
- The correlations between gene expression levels of the genes from cluster 1 that contribute to the enrichment and measured platelet levels are highly significant (Fig. S7A), but they disappear when we compare the residual gene expression from the EM with such platelet counts (Fig.
- In particular, all the EM-related genes identified as aging genes in GenAge were found to be a subset of the IM-related genes, with the exception of EMD coding for emerin.
- This result highlights once more the filtering properties of the EM, and further suggests its ability to make aging information stand out.
- This extended cell correction enabled us to calibrate gene expression according to the number of blood cells and extract an aging gene expression pattern that was less influenced by cell quantity compared.
- The rationale behind the method we propose is that both variations in organismal cell composition and gene expression influence the processes of aging and diseases, and that cell correction makes it possible to filter out the expression of specific cell biomarkers while aiming at retaining those gene expression patterns that capture the main and shared aging processes in the whole tissue.
- The EM outperformed the old model, IM, when analyzing the MSE, normality of residuals and homoscedasticity, highlighting that an increased cell correction results in a more accurate gene expression estimation during aging.
- For this purpose, we calculated per cell type the mean gene expression of both IM and EM genes using scRNA-seq data from ~ 25,000 blood mononuclear cells of 45 donors [20].
- The EM aging-related genes had lower mean gene expression levels, fewer cell type specific marker genes and those markers that were present were less abundantly expressed (Fig.
- Although many of the EM genes were also identified using the IM, the enrichments were often not overlapping, suggesting an increased precision in evaluating the relation between gene expression and age.
- We clustered the EM genes based on gene expression residuals and again found the strongest enrichment in the upregulation of platelet activity.
- We also show that there is no residual relationship between platelet counts and gene expression after correction (Fig.
- in the previous study [3]: PF4 not tested, PPBP nominally significant).
- Although these results may arise from differences in sample sizes or models used, this observation, coupled with the fact that older individuals have higher levels of PF4 and PPBP protein in their plasma, indicates that platelets become more active with age, as reflected both in gene expression levels and protein abundance in plasma [22].
- and other gene expression association studies.
- In summary, we hypothesize that the platelet enrichment observed in the EM aging-related genes represents one of the molecular signatures of aging.
- None of the cohorts use disease as a selection criterion.
- LL participants are all from the three northern provinces of the Netherlands, LLS includes the offspring and partners of long-lived individuals, NTR studies twins and their relatives, and RS participants are all over 45 years old.
- All cohorts followed similar protocols for genotyping and gene expression as part of the BIOS Consortium, an initiative of the Biobanking and Biomolecular Resources Research Infrastructure - The Netherlands [35].
- Gene expression.
- Gene expression data was obtained using the same protocol across all studies, as previously described [36].
- 1%) SNPs in the Genome of the Netherlands [37].
- Prior to normalization, population outliers were removed based on a plot of the first two principal components, calculated on non-imputed genotypes.
- The first step in the normalization procedure was the application of the trimmed mean of M-values normalization method [38].
- Next, we removed genes with no variance, log2-transformed the expression matrix and Z-transformed by centering and scaling of the genes, following a previously published protocol described in detail in the online cookbook [39].
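
The last three normalization steps can be sketched as follows. This is an illustration under stated assumptions, not the cookbook's code: the TMM step is not reproduced, the matrix is taken to be genes × samples, and the pseudocount added before the log (to avoid log2 of zero) is a choice made here, not a detail from the source.

```python
import numpy as np


def log2_and_z_transform(expr, pseudocount=1.0):
    """Drop zero-variance genes, log2-transform, then center and scale each gene.

    expr: array of shape (genes, samples) holding normalized expression values.
    Returns the transformed matrix and a boolean mask of the genes that were kept.
    """
    expr = np.asarray(expr, dtype=float)
    keep = expr.var(axis=1) > 0                # genes with no variance are removed
    logged = np.log2(expr[keep] + pseudocount)  # pseudocount is an assumption
    centered = logged - logged.mean(axis=1, keepdims=True)
    z = centered / logged.std(axis=1, ddof=1, keepdims=True)
    return z, keep


# Toy matrix: the second gene is constant and is therefore dropped.
toy = [[1, 3, 7, 15],
       [5, 5, 5, 5],
       [0, 1, 3, 0]]
z_scores, kept_mask = log2_and_z_transform(toy)
```

After this transformation every retained gene has mean 0 and unit sample standard deviation across samples, which puts regression coefficients for different genes on a comparable scale.
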
- with y being the gene expression level of a given gene, i indexing the cohort samples, age (x_i1) in years at the time of blood sampling, and the remaining variables being the other covariates, including cell counts (for a total of p predictors).
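
The variable description above is consistent with a standard multiple linear regression fitted separately for each gene. The form below is a reconstruction under that assumption; the coefficient symbols \beta and the error term \varepsilon are notation introduced here and are not taken from the source.

```latex
y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \dots + \beta_p x_{ip} + \varepsilon_i
```

Under this reading, \beta_1 is the age effect tested for each gene, and the IM and EM would differ only in which covariates x_{i2}, \dots, x_{ip} are included.
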
- Both the IM and EM were tested on 19,932 genes that showed expression in blood of at least 0.5 counts per million in at least 1% of the samples [43].
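
The expression filter described above can be sketched as follows. The function and parameter names are hypothetical, and counts per million are computed here from raw library sizes, which may differ from the study's exact procedure.

```python
import numpy as np


def expressed_gene_mask(counts, min_cpm=0.5, min_sample_fraction=0.01):
    """Flag genes with at least `min_cpm` counts per million in at least
    `min_sample_fraction` of the samples.

    counts: array of shape (genes, samples) with raw read counts.
    """
    counts = np.asarray(counts, dtype=float)
    library_sizes = counts.sum(axis=0)            # total reads per sample
    cpm = counts / library_sizes * 1e6            # counts per million
    fraction_passing = (cpm >= min_cpm).mean(axis=1)
    return fraction_passing >= min_sample_fraction


# Toy data: every sample has a library size of exactly 4,000,000 reads.
toy_counts = [
    [0, 0, 0, 0],                                  # never expressed
    [3, 0, 0, 0],                                  # 0.75 CPM in one sample
    [1, 1, 1, 1],                                  # 0.25 CPM everywhere
    [3_999_996, 3_999_999, 3_999_999, 3_999_999],  # highly expressed
]
mask = expressed_gene_mask(toy_counts)
```

With only four toy samples, the 1% requirement reduces to "at least one sample"; in a cohort of thousands it requires several dozen samples to pass.
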
- Evaluation of the regression models.
- To evaluate the performance of the regression models, we used gene expression residuals and investigated MSE values, distribution of residuals and homoscedasticity..
- Regarding homoscedasticity, meta-analysis was conducted on cohort-related, gene-specific Spearman ρ values (rho values) obtained by correlating age with the gene expression residuals, calculated from the application of the IM and EM.
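
The summary does not state which meta-analysis method was used to combine the cohort-level ρ values. One common option, shown here purely as an assumption, is a fixed-effect combination of Fisher z-transformed correlations weighted by n − 3. The cohort sizes and correlations below are made up for illustration.

```python
import math


def combine_correlations(rhos, sample_sizes):
    """Combine per-cohort correlations with a weighted mean in Fisher z space.

    This is one conventional approach, not necessarily the one used in the study.
    """
    weights = [n - 3 for n in sample_sizes]          # inverse-variance weights
    z_values = [math.atanh(r) for r in rhos]         # Fisher z-transform
    z_mean = sum(w * z for w, z in zip(weights, z_values)) / sum(weights)
    return math.tanh(z_mean)                         # back-transform to a correlation


# Hypothetical per-cohort values for a single gene (four cohorts).
cohort_rhos = [0.10, 0.05, -0.02, 0.08]
cohort_sizes = [1000, 600, 800, 700]
combined_rho = combine_correlations(cohort_rhos, cohort_sizes)
abs_combined_rho = abs(combined_rho)
```

The absolute value of the combined correlation is the quantity that would be compared between the IM and EM for each gene.
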
- Collection and normalization of the data have been described previously [20].
- Within these cell types, we calculated the mean expression of the genes significantly associated with aging identified by the IM and the EM, and represented their expression in t-SNE plots.
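
A minimal sketch of the per-cell-type averaging, assuming a cells × genes expression matrix and one cell-type label per cell. The function name, the toy labels and the gene indices are hypothetical; the study's own single-cell tooling is not reproduced here.

```python
import numpy as np


def mean_expression_per_cell_type(expr, cell_types, gene_idx):
    """Average a gene set within each cell, then average those values per cell type.

    expr:       array of shape (cells, genes) with normalized expression.
    cell_types: sequence with one cell-type label per cell.
    gene_idx:   column indices of the gene set (e.g. IM or EM aging genes).
    """
    expr = np.asarray(expr, dtype=float)
    cell_types = np.asarray(cell_types)
    per_cell = expr[:, gene_idx].mean(axis=1)     # one value per cell
    return {
        str(ct): float(per_cell[cell_types == ct].mean())
        for ct in np.unique(cell_types)
    }


# Toy example: three cells, three genes, gene set = first two genes.
toy_expr = [[1, 2, 3],
            [3, 4, 5],
            [0, 0, 6]]
toy_labels = ["T", "T", "B"]
per_type = mean_expression_per_cell_type(toy_expr, toy_labels, [0, 1])
```

The per-cell values (`per_cell` inside the function) are what would be drawn on a t-SNE embedding, while the per-type means summarize how cell-type-specific a gene set is.
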
- We then identified genes that we considered markers for each of the 11 cell types using the function `FindMarkers`.
- The codes of each subtype used in the EM are reported.
- Mean squared errors, Pearson correlation coefficient (r) and Spearman correlation coefficient (rho) of residual gene expression with age.
- Overlap of IM- and EM-related genes with the known aging-related genes in the GenAge database.
- A Pearson correlation of the Z-scores associated with both significant and not significant IM and EM genes is shown.
- B) Gene expression residuals decrease with the EM.
- Homoscedasticity was evaluated by correlating gene expression residuals from every model with age, and the absolute Spearman values obtained after meta-analysis are reported for all genes minus the shared genes significantly associated with aging.
- Mean expression levels of cell type marker genes among aging-related genes identified in the Initial Model (IM, left) and in the Extended Model (EM, right) are plotted.
- Downregulated EM aging-related genes were clustered based on the correlations of gene expression residuals and highly correlating clusters were identified and highlighted with a yellow border.
- Correlations between (residual) gene expression levels of the genes from platelet-related cluster 1 and measured platelet levels.
- A) Spearman correlations between gene expression levels and measured platelets, B) Spearman correlations between gene expression residuals from the extended model and measured platelets.
- DIB, MAI, PES have contributed to the acquisition of the data and revised the manuscript.
- HJW contributed to the design and interpretation of the work and substantively revised it.
- revision of the work.
- Information on the members of the BIOS consortium, their role in the consortium and their affiliation is described as follows:.
- Raw RNA-seq data of the cohorts analyzed in this study can be obtained from the European Genome-Phenome Archive (study accession EGAS dataset accession EGAD https://www.ebi.ac.uk/ega - contact person:.
- Written informed consent was obtained previously from all participants of the LL, LLS, NTR and RS biobanks in accordance with the ethical and institutional regulations.
- The LL study was approved by the Medical Ethics committee of the University Medical Centre Groningen (METc UMCG) document number METC UMCG LLDEEP: M .
- The study protocol for LLS was approved by the Medical Ethical committee of the Leiden University Medical Center (METC-LDD) before the start of the study [12].
- The NTR study protocol was approved by the Central Ethics Committee on Research Involving Human Subjects of the VU University Medical Center (CCMO), Amsterdam, an Institutional Review Board certified by the US Office of Human Research Protections (IRB number IRB2991 under Federal-wide Assurance-3703).
- The RS was approved by the Medical Ethics Committee of the Erasmus MC (Erasmus MC MERC, registration number MEC 02.1015) and by the Dutch Ministry of Health, Welfare and Sport (Population Screening Act WBO, license number PG) [14].
- Therefore, no additional WMO approval of the study was required.
- Whole blood gene expression associated with clinical biological age.
- Cell-type specific gene expression profiles of leukocytes in human peripheral blood.
- Exploratory differential gene expression analysis in microarray experiments with no or limited replication.
- Whole-genome sequence variation, population structure and demographic history of the Dutch population.
- Spatial reconstruction of single-cell gene expression data
