Identifying inflammatory bowel disease subtypes: a comprehensive exploration of transcriptomic data and machine learning-based approaches

N. Saini, A. Acharjee

2025 · DOI: 10.1177/17562848251362391
Therapeutic Advances in Gastroenterology

TLDR

These findings support a stratified approach to IBD diagnosis and therapy, enabling more personalised disease management strategies.

Abstract

Background: Inflammatory bowel disease (IBD), encompassing Crohn’s disease (CD) and ulcerative colitis (UC), is a heterogeneous condition characterised by chronic gastrointestinal inflammation and dysregulated immune responses. Despite advances in transcriptomic analysis and machine learning (ML), consistent molecular subtyping across datasets remains a challenge. There is a critical need for robust subtypes that reflect disease heterogeneity and correlate with clinical outcomes.

Objectives: Unlike prior studies that focused on either UC or CD, or that relied on small datasets, this study analyses a large-scale RNA sequencing (RNA-seq) dataset to identify transcriptomic subtypes in both UC and CD.

Design: We analysed RNA-seq data from four prospective cross-sectional cohorts from Gene Expression Omnibus: GSE193677, GSE186507, GSE137344 and GSE235236.

Methods: We analysed RNA-seq data from inflamed and non-inflamed intestinal biopsies of 2490 adult IBD patients. K-means clustering was applied independently to UC and CD samples to identify transcriptomic clusters. Gene set enrichment and network analyses were used to explore the molecular characteristics of each cluster. Associations with clinical metadata, including disease severity and anatomical involvement, were assessed using Chi-square and analysis of variance tests.

Results: K-means clustering revealed three distinct transcriptomic subtypes in both UC and CD. In UC, Cluster 1 was enriched for RNA processing and DNA repair genes; Cluster 2 highlighted autophagy, stress responses and upregulation of ATG13, VPS37C and DVL2; Cluster 3 emphasised cytoskeletal organisation (SRF, SRC and ABL1). In CD, Cluster 1 featured cytoskeletal remodelling and suppressed protein synthesis (CFL1, F11R and RAD23A), while Cluster 2 upregulated stress and translation pathways. Cluster 3 again prioritised cytoskeletal structure over metabolic activity. Cluster 3 in both conditions was significantly associated with moderate-to-severe endoscopic activity; Cluster 1 was enriched in inactive or mild disease.

Conclusion: We report three transcriptomic subtypes in UC and CD, each with distinct molecular signatures and clinical relevance. These findings support a stratified approach to IBD diagnosis and therapy, enabling more personalised disease management strategies.
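The two statistical steps named in the Methods, k-means clustering applied separately per diagnosis and a Chi-square test of cluster against a clinical category, can be illustrated with a minimal standard-library sketch. This is not the authors' pipeline: the two-feature "expression profiles", the contingency counts, the deterministic farthest-point initialisation and all function names are assumptions made up for illustration, and the choice of k = 3 simply mirrors the three subtypes reported above.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_point_init(points, k):
    """Deterministic seeding (an assumption, not the paper's method):
    start at the first point, then repeatedly add the point that is
    farthest from its nearest already-chosen centroid."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        )
    return centroids


def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm: alternate assignment and centroid updates."""
    centroids = farthest_point_init(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each sample joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points
        ]
        if new_labels == labels:  # assignments stable, so converged
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centroids


def chi_square_statistic(table):
    """Pearson Chi-square statistic and degrees of freedom for a
    contingency table (rows: clusters, columns: clinical categories).
    Assumes every row and column total is non-zero."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, dof


# Hypothetical toy profiles (two features per sample), one set per diagnosis.
cohorts = {
    "UC": [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2), (5.0, 5.1), (5.2, 4.9),
           (4.9, 5.0), (10.0, 0.1), (9.8, 0.0), (10.1, 0.2)],
    "CD": [(1.0, 1.1), (1.2, 1.0), (1.1, 1.2), (6.0, 6.1), (6.2, 5.9),
           (5.9, 6.0), (11.0, 1.1), (10.8, 1.0), (11.1, 1.2)],
}

# Cluster each diagnosis independently, as described in the Methods.
results = {name: kmeans(points, k=3)[0] for name, points in cohorts.items()}
for name, labels in results.items():
    print(name, labels)

# Made-up counts: rows are clusters 1-3, columns are
# inactive/mild vs moderate-to-severe endoscopic activity.
severity_table = [[30, 10], [20, 20], [10, 30]]
stat, dof = chi_square_statistic(severity_table)
print(f"chi-square = {stat:.1f}, dof = {dof}")
```

The sketch stops at the test statistic. Obtaining a p-value needs the Chi-square distribution, which in practice would come from a statistics library such as SciPy (`scipy.stats.chi2_contingency`). A real analysis of RNA-seq data would also involve normalisation, feature selection and a principled choice of k, none of which are shown here.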
