These are indeed bona fide late (terminally differentiated) enterocytes due to expression of genes from the APO family (PMID: 31753849).
Rather than "early," we suggest categorizing by position along the crypt-villus axis. Based on ANPEP expression, we propose the designation "low-to-mid-villus enterocyte" (PMID: 28798045).
For consistency, if there is interest in distinguishing immature goblet cells from fully mature goblet cells, this cluster could be split. Cluster 6 contains mature goblet cells expressing the marker TFF1. Cluster 11 lacks this marker.
Consider sub-clustering them by FER1L6 (left) and SELENOM (right) expression.
It is unclear why transcriptionally distant clusters were uniformly annotated as “Secretory Goblet cells,” particularly given the large size of this population (~34,000 cells). These clusters segregate by tissue of origin, with clusters 6 and 11 derived predominantly from the ileum and cluster 12 from the colon/rectum. Differential expression analysis between clusters 6 and 12 reveals consistent region-specific transcriptional programs: cluster 12 is enriched for WFDC2, MUC5B, RARRES2, and RETNLB (PMID: 35176508; PMID: 30814735), whereas REG4 and SPINK4 are enriched in cluster 6. These differences suggest stable regional specialization within the goblet cell lineage rather than heterogeneity within a single pooled population. Analogous to the established distinction between enterocytes and colonocytes in absorptive lineages, these data support consideration of region-specific annotation of secretory populations (e.g., small intestinal versus colonic goblet cells). Further assessment of secretory lineage regulators (e.g., ATOH1, HES4) and cell-cycle state would help clarify whether these clusters represent regionally specified goblet subtypes rather than differences driven by differentiation or proliferative status.
Why are there two clusters with the same annotation that lie very far from each other?
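The statement that clusters segregate by tissue of origin can be checked with a simple cross-tabulation of cluster labels against tissue labels. The following is a minimal sketch in plain Python, assuming per-cell cluster and tissue labels are available as two parallel lists. The function name `tissue_fractions` is hypothetical, and the labels below are invented toy values, not atlas data.

```python
from collections import Counter, defaultdict


def tissue_fractions(clusters, tissues):
    """Return {cluster: {tissue: fraction of that cluster's cells}}."""
    counts = defaultdict(Counter)
    for cluster, tissue in zip(clusters, tissues):
        counts[cluster][tissue] += 1
    return {
        cluster: {t: n / sum(c.values()) for t, n in c.items()}
        for cluster, c in counts.items()
    }


# Toy per-cell labels (hypothetical, for illustration only).
clusters = ["6", "6", "6", "6", "12", "12", "12", "12"]
tissues = ["ileum", "ileum", "ileum", "colon", "colon", "colon", "rectum", "ileum"]

fractions = tissue_fractions(clusters, tissues)
# Tissue contributing the largest share of cells to each cluster.
dominant = {c: max(f, key=f.get) for c, f in fractions.items()}
print(fractions)
print(dominant)
```

A cluster dominated by a single tissue in such a table would support a region-specific label; a mixed cluster would argue for keeping a pooled annotation.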
We recommend merging both goblet cell subclusters. While studies suggest functional heterogeneity within the goblet cell population, it is unclear whether these subpopulations can be distinguished at the transcript level. We also recommend removing the "secretory" prefix — the accepted term is simply "goblet cells."
Considering the previous feedback, the tissue source of the goblet cells is important (splitting between ileum and colon); in our data, expression of genes such as ZG16 confirms the goblet cell phenotype.
This cluster includes goblet cells from the small intestine and large intestine (split in the UMAP) and mature goblet cells with high expression of TFF1. I agree with Ángela Sanzo and would also split clusters 6 and 11, but would also consider splitting cluster 12 (from Leiden lineage 1) and indicating the origin of these cells (small intestine or large intestine).
In our experience, and as seen in other papers (for instance Kong et al.), the epithelium separates by tissue origin (ileum vs colon), so I would take this into account in the analysis. Given that, I think the secretory goblet cells from the ileum separate in the Leiden clustering into cluster 6, which might be the more mature goblet cells (expression of TFF1 and FER1L6), and cluster 11, which would then be the remaining secretory goblet cells found in the ileum.
BEST2, KLK3/15, MUC1/2+ --> Secretory goblet cells. See e.g. here: https://www.nature.com/articles/s41586-021-03852-1?utm_source=chatgpt.com
Within enterocytes, there is a subpopulation highly expressing MUC2, CLCA1, and SPINK4; consider sub-clustering these cells and exploring other marker genes.
Small suggestion: for consistency with the other enterocyte annotations, I'd re-annotate the cluster "Enterocyte" as "Enterocyte intermediate".
APOB, APOA4, APOC3, SLC* genes suggest enterocyte lineage (Glinka, Wickham and Nadalin et al., 2026, Journal of Pathology)
the sub-cluster closer to secretory goblet cells shows high KCNMA1, FCGBP, and FER1L6 expression, a signature shared with some goblet cells; probably better to split.
The annotation "Enterocytes Early" seems appropriate. We have a similar population enriched for GSTA1 and GSTA2 in our duodenal dataset and we attributed a similar annotation to them.
generally agree, but the cells very close to the secretory goblet cells that still belong to the "early enterocytes" cluster also seem to be quite high for goblet cell markers. MUC2hi, SPINK4, AGR2... Split cluster?
While the “Absorptive TA” designation is conceptually appealing, the size of this population is likely overestimated in the current annotation. Transit-amplifying (TA) cells, defined by high proliferative signatures (e.g., MKI67, PCNA, TOP2A), are largely confined to clusters 9 and 13 at Leiden_Lineage_L1 resolution, with some contributions from clusters 0 and 1. At Leiden_Lineage_L2 resolution, the following subclusters align with TA/Absorptive TA identity: Subcluster 1_0: REG1A, OLFM4, DUT, PCNA (proliferative progenitor signature). Subcluster 1_3: REG1A, OLFM4, MKI67 (proliferative progenitor signature). Subcluster 0_2 (colorectum): LEFTY1, DUT, TUBB, PCNA. Subclusters that should not be annotated as TA/Absorptive TA: Subcluster 1_2: expresses progenitor markers (OLFM4, ADH1C) and early absorptive markers (GSTA1), but lacks canonical proliferation markers (MKI67, PCNA, TOP2A). Since TA cells are highly proliferative by definition, and 1_2 seems to precede Enterocyte Early, this subcluster, along with 1_5 (REG1A, OLFM4, LCN2), could be annotated as Enterocyte Progenitors (noting that "progenitor" is less strict than TA, as it does not require the same high level of proliferation). Subcluster 0_0: these cells are halfway-differentiated colonocytes (CA1, CA2). My suggestion would be to merge them with the Colonocyte cluster and annotate them as Colonocyte Intermediate (per prior feedback).
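The rule used above (a subcluster qualifies as TA only if it shows proliferation markers) can be written down as a small check. This is a minimal sketch, not the method used for the atlas: the function `suggest_label`, the threshold, the minimum marker count, and the toy expression values are all assumptions made for illustration, and the subcluster names `sub_a` and `sub_b` are placeholders rather than real atlas subclusters.

```python
# Proliferation markers named in the comment above.
PROLIFERATION_MARKERS = ("MKI67", "PCNA", "TOP2A")


def suggest_label(mean_expr, threshold=1.0, min_markers=1):
    """Suggest 'TA' only if enough proliferation markers reach the threshold.

    mean_expr: dict mapping gene name to mean expression in the subcluster.
    threshold and min_markers are illustrative assumptions, not atlas settings.
    """
    n_high = sum(
        1 for gene in PROLIFERATION_MARKERS
        if mean_expr.get(gene, 0.0) >= threshold
    )
    return "TA" if n_high >= min_markers else "Progenitor"


# Toy mean-expression values (hypothetical, for illustration only).
toy = {
    "sub_a": {"OLFM4": 2.5, "PCNA": 1.8, "MKI67": 1.2},
    "sub_b": {"OLFM4": 2.1, "GSTA1": 1.5, "PCNA": 0.2},
}
labels = {name: suggest_label(expr) for name, expr in toy.items()}
print(labels)
```

Making the threshold explicit also gives a place to record the cell-cycle genes used as the rationale for the name.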
We agree that there is evidence supporting both TA and absorptive lineage identity. Any cluster designated as TA should express cell cycle genes (e.g., MKI67, CDK1), and these should be included in the rationale for naming. Given that the secretory TA designation is itself under debate, designating this cluster as absorptive TA also warrants careful consideration.
there is a small cluster within secretory goblet cells and colonocytes highly expressing DUOX2; this population consistently appears and likely plays a role in innate immune defense against pathogens, which would be worth further exploration.
This annotation, as "Colonocyte late," is appropriate since this cluster is enriched for AQP8. I have Visium and Xenium data demonstrating this population is located at the top of colonic crypts. For reference: https://doi.org/10.21203/rs.3.rs-7535904/v1
We agree with designating late differentiation; we suggest adding AQP8 to the cluster name (PMID 41115527).
High expression of AQP8, indicating top-crypt location and late differentiation
As also noted by Tatiana, these two clusters should be merged. By definition, all TA cells should display markers of proliferation (e.g. MKI67, TOP2A, PCNA).
We recommend merging with the general TA cluster. By definition, all TA cells are cycling; separating these populations into two clusters is not functionally meaningful.
In our dataset (PMID 41115527), LEFTY1 was associated with an early progenitor population. There is no functional reason to designate these cells as colonocytes; we recommend merging this cluster with the progenitor population.
As also noted by Tatiana, these two clusters should be merged. By definition, all TA cells should display markers of proliferation (e.g. MKI67, TOP2A, PCNA).
We recommend merging with the general TA cluster. By definition, all TA cells are cycling; separating these populations into two clusters is not functionally meaningful.
These are indeed intestinal stem cells (i.e. enrichment for LGR5, SMOC2 and ASCL2). However, as recommended by others, the annotation should be refined to Intestinal Stem Cells instead of Epithelial Stem Cells (LGR5+).
We agree with this designation and recommend re-labeling as "intestinal stem cells" (the canonical term)
Agree with changing the name to Intestinal Stem Cells. In addition, this cluster is clearly separated in two in the UMAP: small intestinal ISC and colonic ISC. Might it be worth splitting this cluster in two?
The usual nomenclature is Intestinal Stem Cells (ISCs) rather than epithelial. Also, I think the "(LGR5+)" can be removed, as key marker genes are not included in most other cell type names.
Clear! LGR5+, OLFM4, SMOC2...
There is a small subcluster (24_4) of cells from several places in the gut. The duodenum cells and strangely some of the ileum cells show expression of Brunner's gland markers including MUC6, TFF2, PGC, AQP5, and some TFF3. The caecum cells from the same subcluster show lower expression of these markers in addition to Goblet markers such as MUC2. A fraction of the ileum 24_4 cells actually show high expression of the same Brunner's gland markers. Could this be an un-described glandular cell in the ileum?
The “Secretory TA” annotation is not broadly adopted in the literature, but these cells are clearly engaged with the secretory lineage (e.g., HES6, TFF3, etc), while also expressing canonical proliferation markers (MKI67, TOP2A, PCNA) typical of transit-amplifying cells. Although non-standard, this annotation could be arbitrarily employed here to capture their dual proliferative and secretory commitment.
A better designation is needed based on current understanding of secretory differentiation. There are no data to support that secretory progenitors undergo transit amplification. We suggest relabeling this cluster as "secretory progenitors."
These cells express some colonocyte markers such as CA2 and CA1, but still lack AQP8 (a late colonocyte marker). In my opinion, these cells are in a state between Colonocytes Early and Late. Therefore, for consistency, they could be annotated as Colonocytes Intermediate.
These cells are colonocytes but lack additional distinguishing features. This cluster appears "smeared" over the UMAP, and a fraction may correspond to what we refer to as FABP1+ colonocytes (PMID 41115527).
Small suggestion: instead of "Colonocytes BEST4", the more commonly used annotation is "BEST4+ Colonocytes".
We agree with re-naming to BEST4+ colonocytes
These are well-known BEST4+ colonocytes (BEST4, OTOP2). I would just rename them to put the gene before the name.
As mentioned by others, the term secretory here is redundant. The annotation could be simplified to "Tuft Cells".
We agree with this designation and recommend removing the "secretory" prefix.
Tuft cells are secretory by definition. I would just annotate them as "Tuft cells" instead of "Secretory Tuft cells". The same would apply to Goblet cells and Paneth cells.
Expression of TRPM5, SH2D6.... (Garrido-Trigo A)
We recommend removing this cluster. These cells should be excluded during QC prior to publication.
The cells are definitely expressing epithelial gene markers (e.g. MUC) but the clustering is strange.
I think there are some epithelial cells (positive for EPCAM, MUC2...), but part of the cluster consists of mesenchymal cells, which explains the markers; these cells need to be filtered out during QC.
These cells are a subset of Colonocytes Late, since they are enriched for AQP8 (PMID: 35176508). I'd re-annotate them as LAMA3+ Colonocytes, if you'd like to highlight enrichment for this laminin subunit. Alternatively, they could be merged with Colonocytes Late.
When defining enterocytes along their pseudotime trajectory from stem cells to differentiated states, we identified LAMA3, LAMC2, and ARL14 as being expressed in relatively less differentiated enterocytes.
Laminin is not typically expressed by colonocytes unless there is underlying pathology. For the annotation phase of a healthy gut atlas, this cluster should not be included as a distinct cell type; it likely represents contamination from IBD or cancer datasets and should be excluded from initial clustering.
Markers LAMA3 and EPHA2 (Garrido-Trigo A)
Small suggestion: instead of "Enterocytes BEST4", the more commonly used annotation is "BEST4+ Enterocytes".
These are BEST4+ enterocytes (BEST4, OTOP2). I would just reverse the name order for consistency (Hickey, J.W. et al.).
This annotation requires review. Follicle-associated epithelial (FAE) cells is a term often used interchangeably with microfold (M) cells, which this cluster does not represent. Instead, these cells mark an early stage of BEST4+ colonocytes, as evidenced by highly variable genes including BEST4, OTOP2, SPIB, LYZ, HES4, CA7, and NOTCH2. Stem/pluripotency-associated markers LEFTY1 and OLFM4 suggest an early differentiation stage. For consistency with other annotations, I’d recommend that these cells be annotated as "BEST4+ Colonocytes Early". This subset aligns with descriptions in PMID: 30814735 and PMID: 35176508.
I am uncertain about this annotation because the canonical marker genes used (CCL20, GP2, and RANK) are the same as those typically used to define microfold (M) cells. Furthermore, the additional marker evidence, including CCL20, IL15, CCL25, and ANPEP, is insufficient to confidently label these cells as follicle-associated (FA) enterocytes, as these genes generally mark immune-adapted epithelial cells within the FA epithelium, which includes M cells. Differential expression analysis comparing these FA enterocytes to background cells shows strong enrichment of SPIB (Log2FC = 5.4), further supporting the interpretation that these cells represent either FA epithelium or early/differentiating M-cell precursors. This interpretation is reinforced by the expression of HES4, a Notch target gene indicative of a transitional or immature epithelial state, consistent with an early M-cell state. Additionally, examining the cell-cycle states of these FA enterocytes and M cells could provide further evidence on whether these cells are indeed in an early M-cell state.
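For readers less used to log fold changes, the SPIB value quoted above can be put on a linear scale. The sketch below assumes the common definition of log2 fold change as the log2 ratio of mean expression in the target group to mean expression in the background; the differential expression tool actually used for the atlas may compute it differently. The function name `log2_fold_change` and the pseudocount are illustrative assumptions.

```python
import math


def log2_fold_change(mean_target, mean_background, pseudocount=1e-9):
    """log2 ratio of two group means; the pseudocount avoids division by zero."""
    return math.log2((mean_target + pseudocount) / (mean_background + pseudocount))


# Converting a log2 fold change back to a linear fold change.
linear_fold = 2 ** 5.4
print(round(linear_fold, 1))

# Sanity check on toy means: an 8-fold difference is a log2FC of about 3.
print(round(log2_fold_change(8.0, 1.0), 3))
```

Under that definition, a Log2FC of 5.4 corresponds to roughly 42-fold higher mean expression in the cluster than in the background.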
What are these cells? Why are they called doublets? They express a lot of markers and it's not clear what type of cells they are.
I strongly recommend splitting this apparent doublet cluster into two distinct populations. The presence of canonical Paneth cell markers (e.g., DEFA5, DEFA6, PRSS2; PMID: 35176508) clearly indicates a bona fide Paneth cell population. It seems that the Paneth cells within this "doublet" cluster represent the most mature state of this population. They should then be merged with the current “Secretory Paneth cells” cluster. It would be unfortunate to overlook this subset in the Gut Cell Atlas. Additionally, the cluster “Secretory Paneth cells” should be renamed “Paneth cells”, as "secretory" in this context is redundant. The remaining cells within this doublet cluster exhibit a clear B cell signature, characterized by expression of MS4A1 (CD20), CD37, and additional lymphoid markers. These B cells are potentially from the intraepithelial lymphocyte (IEL) compartment. The B cells should be excluded from the epithelial lineage.
We recommend removing this cluster. Doublets are a technical artifact and should be computationally excluded during quality control rather than retained as a distinct cluster in a published cell atlas.
This should be removed as part of QC as it has the appearance of doublets - expressing epithelial, immune and EEC genes
Mixed cell profile. Some epithelial cells some B cells?
These are indeed Enterochromaffin cells. However, I'd suggest a small simplification of the annotation: Instead of EEC Enterochromaffin Cells, either Enterochromaffin Cells or its abbreviation (ECC) would suffice.
I think these are different cell types, as they should cluster closely to the Goblet cells cluster.
We recommend merging both goblet cell subclusters. While studies suggest functional heterogeneity within the goblet cell population, it is unclear whether these subpopulations can be distinguished at the transcript level. We also recommend removing the "secretory" prefix — the accepted term is simply "goblet cells."
High expression of TFF1, indicating mature state of goblet cells. I would change "Secretory Goblet Cells Mature" to "Goblet cells mature" and indicate whether this cluster is from small intestine or large intestine samples
The current annotation of this cluster is appropriate and well supported by its marker gene signature. GP2 is a canonical marker of this population and has most recently been proposed as a marker of terminally differentiated M cells. In the same study, ICAM2 was reported as an additional and more general M cell marker (Spoelstra et al., Nature, 2026; PMID: 41372409).
TNFAIP2+, CCL23+, SOX8+, GP2+ Clear M cells. See also here as reference. https://www.nature.com/articles/s41586-025-09829-8
A minor annotation simplification could be applied: replacing "EEC N Cells" with "N-type EEC" or even "N-EEC". The "C" already stands for cells.
I agree with separating EEC N cells and Enterochromaffin cells. They correspond to cluster 21 in Leiden_Lineage_L1, but only N cells (small intestine) express NTS. This highlights a major comment in the epithelial subset: it might be helpful to split small intestine and large intestine samples
A minor annotation simplification could be applied: replacing "EEC L Cells (PYY+)" with "L-type EEC" or even "L-EEC". The "C" already stands for cells.
These are indeed bona fide Paneth Cells, as they express the canonical markers DEFA5, DEFA6. However, as mentioned by others, "Secretory" is a redundant term and should be removed.
We recommend removing the "secretory" prefix. The accepted term is simply "Paneth cells."
marker genes DEFA5, DEFA6, REG3A and PLA2G2A (Oliver AJ https://www.nature.com/articles/s41586-024-07571-1)
The current annotation of this cluster is appropriate and supported by the marker gene signature (PMID: 32407674).
These immune cells should be removed from here.
We recommend removing this cluster from the epithelial atlas. Monocytes are immune cells and do not belong in an annotation framework focused on the intestinal epithelium. They should be represented in a separate immune compartment atlas.
Agree, these are circulating neutrophils (FCGR3B, S100A8...); they may appear in single-cell data from highly vascularized tissues of healthy donors (they only come from 'normal' samples). Nonetheless, they do not belong here but in the myeloid compartment.
I think these might be more like circulating neutrophils? Definitely myeloid immune cells. CSF3R, FCGR3B, S100A12, SELL... Either way, these cells should probably be removed from an epithelial cell object.
These are indeed enteroendocrine cells (EEC). However, they do not express high levels of peptide hormones and could represent an intermediate state. They lack early EEC transcription factors (i.e. DLL1, NEUROG3), and seem to be somewhat enriched for transcription factors linked to EEC intermediate/late states (e.g. RFX3, RFX2, ETV1). They seem to express low levels of NTS, GCG and PYY, but are overall negative for other enteroendocrine hormones. For reference: PMID: 30712869.
"Enteroendocrine cells" refers to a broader cell type resolution. These cells seem to share expression of genes such as NTS with the EEC N cell cluster. I would suggest merging them.
A minor annotation simplification could be applied: replacing "EEC S Cells (SCT+)" with "S-type EEC" or even "S-EEC". The "C" already stands for cells.
These immune cells should be removed from here.
better to remove plasma cells from epithelial cluster?
Plasma cells should cluster away from the epithelial lineage. It makes more sense to me if they cluster closely to monocytes. Maybe they are not real plasma cells? Also, if the samples are from biopsies/resections, it makes sense to still have some immune cells (the biopsies can still be "contaminated" by these cells). If they are organoids, they can be removed, especially the monocytes, as there are few cells.
We recommend removing this cluster from the epithelial atlas for the same reason as monocytes. Plasma cells are immune cells and their inclusion here likely reflects contamination of the epithelial fraction. They should be represented within the appropriate immune compartment.
Agree with this label. These cells should be removed from this object, as they are not epithelial cells but contamination.