As the light-sensing part of the visual system, the human retina is composed of five classes of neurons: photoreceptors, horizontal cells, amacrine cells, bipolar cells, and retinal ganglion cells. Each class can be further divided into subgroups whose abundance varies over three orders of magnitude. Therefore, to capture all cell types in the retina and generate a complete single-cell reference atlas, it is essential to scale up from currently published single-cell profiling studies to improve sensitivity. In addition, to gain a better understanding of gene regulation at the single-cell level, it is important to include sufficient scATAC-seq data in the reference. To fill this gap, we performed snRNA-seq and snATAC-seq on retinas from healthy donors. To further increase the size of the dataset, we then collected and incorporated publicly available datasets. All data underwent a unified preprocessing pipeline and data integration. Multiple integration methods were benchmarked with scIB, and scVI was chosen. To harness the power of multiomics, snATAC-seq datasets were also preprocessed, and scGlue was used to generate co-embeddings between snRNA-seq and snATAC-seq cells. To facilitate public use of the reference, we employ CELLxGENE and UCSC Cell Browser for visualization. By combining previously published and newly generated datasets, we generated a single-cell atlas of the human retina composed of 2.5 million single cells from 48 donors. As a result, over 90 distinct cell types are identified based on transcriptomic profiles, with the rarest cell type accounting for about 0.01% of the cell population. In addition, open chromatin profiles have been generated for over 400K nuclei via single-nucleus ATAC-seq, allowing systematic characterization of cis-regulatory elements for individual cell types.
Integrative analysis reveals intriguing differences in the transcriptome, chromatin landscape, and gene regulatory network among cell classes, subgroups, and types. In addition, changes in cell proportion, gene expression, and chromatin openness have been observed between genders and with age. Accessible through interactive browsers, this study represents the most comprehensive reference cell atlas of the human retina to date. As part of the Human Cell Atlas project, this resource lays the foundation for further research into retinal biology and disease.
Datasets (6)
snRNA-seq of human retina - all cells (3,177,310 cells)
snRNA-seq of human retina - retinal ganglion cell subset (399,605 cells)
snRNA-seq of human retina - bipolar cell subset (691,008 cells)
snRNA-seq of human retina - amacrine cell subset (571,579 cells)
Single cell atlas of the human retina - Bipolar cell subclass - scRNA-seq (72,788 cells)
Single cell atlas of the human retina - all cells - scRNA-seq (265,767 cells)
Tonsils are constantly exposed to antigens through the upper respiratory tract, making them a valuable secondary lymphoid organ (SLO) for studying the interaction between innate and adaptive immune cells during germinal center (GC) development, which is crucial for building adaptive immunity. This reference includes 377,963 cells (10x Genomics 3' v3) from 17 healthy human donors spanning various age groups, including children, young adults, and older adults.
The annotation provided in this first version comprises 42 categories that provide a stable set of categories for classifying single-cell transcriptomes of SLOs, useful for tools like Azimuth (see external URLs). In the next version, we will add a more detailed classification, encompassing all cell types and states identified in the tonsil atlas. A validation cohort was included to confirm the presence and accuracy of each annotation at this latter level, using criteria such as cell neighborhood preservation, conservation of bona fide marker genes, and annotation confidence derived from the KNN classifier. For a full description and interpretation of this validation cohort, please refer to the final section of the manuscript.
The tonsil atlas is a FAIR resource—findable, accessible, interoperable, and reusable. The raw data has been deposited in ArrayExpress and can be remapped and reanalyzed by following the instructions provided in the TonsilAtlas and TonsilAtlasCAP GitHub repositories. The resulting expression matrices and Seurat objects are available on Zenodo, and the data can be accessed, explored, interpreted, and reused via the HCATonsilAtlas Bioconductor package and the Azimuth web interface. Additionally, we provide a detailed glossary of the marker genes, rationales, and publications used to annotate each cell type and state. The external URLs section contains links to all these resources.
Datasets (1)
An Atlas of Cells in the Human Tonsil (377,963 cells)
The term fibroblast encompasses different stromal cell subpopulations. Single-cell technologies now enable granular definition of cell states, and shared fibroblast states across different human tissues in health and disease have been reported. However, a comprehensive analysis of fibroblast states across different disease contexts in a single organ has not been performed.
Here we utilise single-cell RNA sequencing from >300 000 human fibroblast cells in skin from health and 23 diseases to define fibroblast subpopulations. We identified 6 fibroblast subclusters in health and resolved their location in human skin tissue at single-cell resolution. In disease, we identified the presence of 3 novel states (F6: Myofibroblast, F6: Myofibroblast chemokine-high, and F7: Myofibroblast fascia-like) and polarisation of two cell states observed in healthy skin from disease (F1: Superficial regenerative and F3 CCL19+ Immune-interacting ADAMDEC1+/CXCL9+).
Datasets (1)
Skin fibroblasts - Pan-disease fibroblast atlas in skin fibrosis, inflammation, and cancer (337,376 cells)
Crohn’s disease is an inflammatory bowel disease (IBD) commonly treated through anti-TNF blockade. However, most patients still relapse and inevitably progress. Comprehensive single-cell RNA-sequencing (scRNA-seq) atlases have largely sampled patients with established treatment-refractory IBD, limiting our understanding of which cell types, subsets, and states at diagnosis anticipate disease severity and response to treatment. Here, through combining clinical, flow cytometry, histology, and scRNA-seq methods, we profile diagnostic human biopsies from the terminal ileum of treatment-naïve pediatric patients with Crohn’s disease (pediCD; n=14), matched repeat biopsies (pediCD-treated; n=8) and from non-inflamed pediatric controls with functional gastrointestinal disorders (FGID; n=13). To resolve and annotate epithelial, stromal, and immune cell states among the 201,883 baseline single-cell transcriptomes, we develop a principled and unbiased tiered clustering approach, ARBOL. Through flow cytometry and scRNA-seq, we observe that treatment-naïve pediCD and FGID have similar broad cell type composition. However, through high-resolution scRNA-seq analysis and microscopy, we identify significant differences in cell subsets and states that arise during pediCD relative to FGID. By closely linking our scRNA-seq analysis with clinical meta-data, we resolve a vector of T cell, innate lymphocyte, myeloid, and epithelial cell states in treatment-naïve pediCD (pediCD-TIME) samples which can distinguish patients along the trajectory of disease severity and anti-TNF response. By using ARBOL with integration, we position repeat on-treatment biopsies from our patients between treatment-naïve pediCD and on-treatment adult CD. We identify that anti-TNF treatment pushes the pediatric cellular ecosystem towards an adult, more treatment-refractory state. 
Our study jointly leverages a treatment-naïve cohort, high-resolution principled scRNA-seq data analysis, and clinical outcomes to understand which baseline cell states may predict Crohn’s disease trajectory.
Datasets (1)
Concerted changes in the pediatric single-cell intestinal ecosystem before and after anti-TNF blockade (197,281 cells)
The human immune system displays remarkable variation between individuals, leading to differences in susceptibility to autoimmune disease. We present single cell RNA sequence data from 1,267,768 peripheral blood mononuclear cells from 982 healthy human subjects. For 14 cell types, we identified 26,597 independent cis-expression quantitative trait loci (eQTLs) and 990 trans-eQTL, with the majority showing cell type specific effects on gene expression. We subsequently show how eQTLs have dynamic allelic effects in B cells transitioning from naïve to memory states and demonstrate how commonly segregating alleles lead to inter-individual variation in immune function. Finally, utilizing a Mendelian randomization approach, we identify the causal route by which 305 risk loci contribute to autoimmune disease at the cellular level. This work brings together genetic epidemiology with scRNA-seq to uncover drivers of inter-individual variation in the immune system.
Datasets (1)
Single-cell eQTL mapping identifies cell type specific genetic control of autoimmune disease (1,248,980 cells)
Lack of diversity and proportionate representation in genomics datasets and databases contributes to inequity in healthcare outcomes globally. The relationships of human diversity with biological and biomedical phenotypes are pervasive, yet remain understudied, particularly in a single-cell genomics context. Here we present the Asian Immune Diversity Atlas (AIDA), a multi-national single-cell RNA-sequencing (scRNA-seq) healthy reference atlas of human immune cells. AIDA comprises 1,265,624 circulating immune cells from 619 healthy donors and 6 controls, spanning 7 population groups across 5 countries. AIDA is one of the largest healthy blood datasets in terms of number of cells, and also the most diverse in terms of number of population groups. Though population groups are frequently compared at the continental level, we identified a pervasive impact of sub-continental diversity on cellular and molecular properties of immune cells. These included cell populations and genes implicated in disease risk and pathogenesis as well as those relevant for diagnostics. We identified numerous examples where the effects of age and sex were modulated by self-reported ethnicity. We also detected age, sex, and self-reported ethnicity differences at the resolution of cell neighbourhoods, highlighting finer-grained distinctions than were apparent at cell-type level. We discovered functional genetic variants influencing cell type-specific gene expression, including context-dependent effects, which were under-represented in analyses of non-Asian population groups, and which helped contextualise disease-associated variants. We validated our findings using multiple independent datasets and cohorts. AIDA provides fundamental insights into the relationships of human diversity with immune cell phenotypes, enables analyses of multi-ancestry disease datasets, and facilitates the development of precision medicine efforts in Asia and beyond.
Datasets (1)
AIDA Phase 1 Data Freeze v2: Chinese, Indian, Japanese, Korean, Malay, and Thai donors in Japan, Singapore, South Korea, Thailand, and India (1,265,624 cells)
Analysis of human blood immune cells provides insights into the coordinated response to viral infections such as severe acute respiratory syndrome coronavirus 2, which causes coronavirus disease 2019 (COVID-19). We performed single-cell transcriptome, surface proteome and T and B lymphocyte antigen receptor analyses of over 780,000 peripheral blood mononuclear cells from a cross-sectional cohort of 130 patients with varying severities of COVID-19. We identified expansion of nonclassical monocytes expressing complement transcripts (CD16+C1QA/B/C+) that sequester platelets and were predicted to replenish the alveolar macrophage pool in COVID-19. Early, uncommitted CD34+ hematopoietic stem/progenitor cells were primed toward megakaryopoiesis, accompanied by expanded megakaryocyte-committed progenitors and increased platelet activation. Clonally expanded CD8+ T cells and an increased ratio of CD8+ effector T cells to effector memory T cells characterized severe disease, while circulating follicular helper T cells accompanied mild disease. We observed a relative loss of IgA2 in symptomatic disease despite an overall expansion of plasmablasts and plasma cells. Our study highlights the coordinated immune response that contributes to COVID-19 pathogenesis and reveals discrete cellular components that can be targeted for therapy.
Datasets (1)
Single-cell multi-omics analysis of the immune response in COVID-19 (647,366 cells)
The function of a cell is defined by its intrinsic characteristics and its niche: the tissue microenvironment in which it dwells. Here we combine single-cell and spatial transcriptomics data to discover cellular niches within eight regions of the human heart. We map cells to microanatomical locations and integrate knowledge-based and unsupervised structural annotations. We also profile the cells of the human cardiac conduction system. The results revealed their distinctive repertoire of ion channels, G-protein-coupled receptors (GPCRs) and regulatory networks, and implicated FOXP2 in the pacemaker phenotype. We show that the sinoatrial node is compartmentalized, with a core of pacemaker cells, fibroblasts and glial cells supporting glutamatergic signalling. Using a custom CellPhoneDB.org module, we identify trans-synaptic pacemaker cell interactions with glia. We introduce a druggable target prediction tool, drug2cell, which leverages single-cell profiles and drug–target interactions to provide mechanistic insights into the chronotropic effects of drugs, including GLP-1 analogues. In the epicardium, we show enrichment of both IgG+ and IgA+ plasma cells forming immune niches that may contribute to infection defence. Overall, we provide new clarity to cardiac electro-anatomy and immunology, and our suite of computational approaches can be applied to other tissues and organs.
Datasets (1)
Combined single cell and single nuclei RNA-Seq data - Heart Global (704,296 cells)
Treatment of severe COVID-19 is currently limited by clinical heterogeneity and incomplete understanding of potentially druggable immune mediators of disease. To advance this, we present a comprehensive multi-omic blood atlas in patients with varying COVID-19 severity and compare with influenza, sepsis and healthy volunteers. We identify immune signatures and correlates of host response. Hallmarks of disease severity revealed cells, their inflammatory mediators and networks as potential therapeutic targets, including progenitor cells and specific myeloid and lymphocyte subsets, features of the immune repertoire, acute phase response, metabolism and coagulation. Persisting immune activation involving AP-1/p38MAPK was a specific feature of COVID-19. The plasma proteome enabled sub-phenotyping into patient clusters, predictive of severity and outcome. Tensor and matrix decomposition of the overall dataset revealed feature groupings linked with disease severity and specificity. Our systems-based integrative approach and blood atlas will inform future drug development, clinical trial design and personalised medicine approaches for COVID-19. The complete raw and processed CITE-seq datasets are available at the European Genome-phenome Archive (EGA) and Zenodo respectively. Here a more limited version of the gene expression data is presented for the purpose of online visualisation and exploration of the dataset. Please note that features have been automatically filtered for compatibility with the Cellxgene Data Portal (ADT features have been removed). For further analysis it is recommended to use the unfiltered datasets from the EGA or Zenodo (where processed datasets are also available in anndata format).
Datasets (1)
COMBAT project: single cell gene expression data from COVID-19, sepsis and flu patient PBMCs (836,148 cells)
The human brain is subdivided into distinct anatomical structures, including the neocortex, which in turn encompasses dozens of distinct specialized cortical areas. Early morphogenetic gradients are known to establish early brain regions and cortical areas, but how early patterns result in finer and more discrete spatial differences remains poorly understood. Here we use single-cell RNA sequencing to profile ten major brain structures and six neocortical areas during peak neurogenesis and early gliogenesis. Within the neocortex, we find that early in the second trimester, a large number of genes are differentially expressed across distinct cortical areas in all cell types, including radial glia, the neural progenitors of the cortex. However, the abundance of areal transcriptomic signatures increases as radial glia differentiate into intermediate progenitor cells and ultimately give rise to excitatory neurons. Using an automated, multiplexed single-molecule fluorescent in situ hybridization approach, we find that laminar gene-expression patterns are highly dynamic across cortical regions. Together, our data suggest that early cortical areal patterning is defined by strong, mutually exclusive frontal and occipital gene-expression signatures, with resulting gradients giving rise to the specification of areas between these two poles throughout successive developmental timepoints.
Datasets (1)
Second Trimester Human Developing Brain Regions and Cortical Areas (457,965 cells)
It is not fully understood why COVID-19 is typically milder in children. Here, to examine the differences between children and adults in their response to SARS-CoV-2 infection, we analysed paediatric and adult patients with COVID-19 as well as healthy control individuals (total n = 93) using single-cell multi-omic profiling of matched nasal, tracheal, bronchial and blood samples. In the airways of healthy paediatric individuals, we observed cells that were already in an interferon-activated state, which after SARS-CoV-2 infection was further induced especially in airway immune cells. We postulate that higher paediatric innate interferon responses restrict viral replication and disease progression. The systemic response in children was characterized by increases in naive lymphocytes and a depletion of natural killer cells, whereas, in adults, cytotoxic T cells and interferon-stimulated subpopulations were significantly increased. We provide evidence that dendritic cells initiate interferon signalling in early infection, and identify epithelial cell states associated with COVID-19 and age. Our matching nasal and blood data show a strong interferon response in the airways with the induction of systemic interferon-stimulated populations, which were substantially reduced in paediatric patients. Together, we provide several mechanisms that explain the milder clinical syndrome observed in children.
Datasets (1)
Local and systemic responses to SARS-CoV-2 infection in children and adults - PBMCs (422,220 cells)
Systemic lupus erythematosus (SLE) is a heterogeneous autoimmune disease. Knowledge of circulating immune cell types and states associated with SLE remains incomplete. We profiled over 1.2 million PBMCs (162 cases, 99 controls) with multiplexed single-cell RNA-sequencing (mux-seq). Cases exhibited elevated expression of type-1 interferon-stimulated genes (ISG) in monocytes, reduction of naïve CD4+ T cells that correlated with monocyte ISG expression, and expansion of repertoire-restricted cytotoxic GZMH+ CD8+ T cells. Cell-type-specific expression features predicted case-control status and stratified patients into two molecular subtypes. We integrated dense genotyping data to map cell-type-specific cis-eQTLs and link SLE-associated variants to cell-type-specific expression. These results demonstrate mux-seq as a systematic approach to characterize cellular composition, identify transcriptional signatures, and annotate genetic variants associated with SLE.
Datasets (1)
Multiplexed scRNA-seq of 1.2 million PBMCs from adult lupus samples (1,263,676 cells)
Organs are composed of diverse cell types that traverse transient states during organogenesis. To interrogate this diversity during human development, we generate a single-cell transcriptome atlas from multiple developing endodermal organs of the respiratory and gastrointestinal tract. We illuminate cell states, transcription factors, and organ-specific epithelial stem cell and mesenchyme interactions across lineages. We implement the atlas as a high-dimensional search space to benchmark human pluripotent stem cell (hPSC)-derived intestinal organoids (HIOs) under multiple culture conditions. We show that HIOs recapitulate reference cell states and use HIOs to reconstruct the molecular dynamics of intestinal epithelium and mesenchyme emergence. We show that the mesenchyme-derived niche cue NRG1 enhances intestinal stem cell maturation in vitro and that the homeobox transcription factor CDX2 is required for regionalization of intestinal epithelium and mesenchyme in humans. This work combines cell atlases and organoid technologies to understand how human organ development is orchestrated.
Datasets (2)
Charting human development using a multi-endodermal organ atlas and organoid models - Adult duodenum (5,200 cells)
Charting human development using a multi-endodermal organ atlas and organoid models - Developing Human Atlas (155,232 cells)
Tumor cells may share some patterns of gene expression with their cell of origin, providing clues into the differentiation state and origin of cancer. Here, we study the differentiation state and cellular origin of 1300 childhood and adult kidney tumors. Using single cell mRNA reference maps of normal tissues, we quantify reference “cellular signals” in each tumor. Quantifying global differentiation, we find that childhood tumors exhibit fetal cellular signals, replacing the presumption of “fetalness” with a quantitative measure of immaturity. By contrast, in adult cancers our assessment refutes the suggestion of dedifferentiation towards a fetal state in most cases. We find an intimate connection between developmental mesenchymal populations and childhood renal tumors. We demonstrate the diagnostic potential of our approach with a case study of a cryptic renal tumor. Our findings provide a cellular definition of human renal tumors through an approach that is broadly applicable to human cancer.
Datasets (2)
Single cell derived mRNA signals across human kidney tumors - Wilms tumor cells (4,636 cells)
Single cell derived mRNA signals across human kidney tumors - fetal kidney cells (27,203 cells)
A comprehensive cellular anatomy of normal human prostate is essential for solving the cellular origins of benign prostatic hyperplasia and prostate cancer. The tools used to analyze the contribution of individual cell types are not robust. We provide a cellular atlas of the young adult human prostate and prostatic urethra using an iterative process of single-cell RNA sequencing (scRNA-seq) and flow cytometry on ∼98,000 cells taken from different anatomical regions. Immunohistochemistry with newly derived cell type-specific markers revealed the distribution of each epithelial and stromal cell type on whole mounts, revising our understanding of zonal anatomy. Based on discovered cell surface markers, flow cytometry antibody panels were designed to improve the purification of each cell type, with each gate confirmed by scRNA-seq. The molecular classification, anatomical distribution, and purification tools for each cell type in the human prostate create a powerful resource for experimental design in human prostate disease.
Datasets (3)
A Cellular Anatomy of the Normal Adult Human Prostate and Prostatic Urethra - Human Epithelial Cells (24,544 cells)
A Cellular Anatomy of the Normal Adult Human Prostate and Prostatic Urethra - Human Fibromuscular Stromal Cells (2,113 cells)
A Cellular Anatomy of the Normal Adult Human Prostate and Prostatic Urethra - All Human Cells (28,702 cells)