Deep Learning for Multi-Class Gastrointestinal Endoscopy: A Survey of Recent Advances and Reliability Challenges
DOI:
https://doi.org/10.29304/jqcsm.2026.18.22677
Keywords:
Gastrointestinal endoscopy, Deep learning, Multi-class Classification, Reliability
Abstract
Deep learning has become a cornerstone of computer-aided diagnosis (CAD) for gastrointestinal (GI) diseases from endoscopic imagery, enabling automated recognition of complex and subtle visual patterns that often challenge clinical interpretation. Although recent years have seen rapid growth in deep learning-based approaches to multi-class GI disease classification, the literature remains highly fragmented across heterogeneous datasets, architectural choices, and evaluation protocols. This fragmentation is particularly problematic in realistic multi-class settings, where severe class imbalance, fine-grained inter-class similarity, and distribution shift substantially undermine clinical reliability. Unlike prior surveys that primarily emphasize architectural performance or accuracy-centric comparisons, this work provides a reliability-aware analytical review of recent deep learning methods for multi-class GI disease classification from endoscopic images. We critically examine how contemporary studies address, or overlook, key reliability dimensions, including class imbalance handling, patient-level data separation, macro-level evaluation metrics, and robustness under distribution shift. Furthermore, we identify recurring methodological limitations that may lead to overly optimistic performance reporting while offering limited translational value in real clinical environments. By organizing recent advances within a unified reliability-focused taxonomy, this survey highlights unresolved challenges and emerging research directions toward dependable and clinically deployable GI CAD systems. The analysis aims to support researchers and practitioners in designing evaluation pipelines and modeling strategies that move beyond optimizing accuracy toward trustworthy decision support for endoscopic diagnosis.
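Two of the reliability dimensions named in the abstract, macro-level evaluation and patient-level data separation, can be illustrated with a minimal sketch. The labels, patient IDs, and helper names (`accuracy`, `macro_recall`, `patient_level_split`) are hypothetical and chosen only for illustration; they are not taken from the surveyed studies or from any particular library.

```python
from collections import defaultdict


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_recall(y_true, y_pred):
    """Unweighted mean of per-class recall, so every class counts equally."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        totals[t] += 1
        hits[t] += int(t == p)
    return sum(hits[c] / totals[c] for c in totals) / len(totals)


def patient_level_split(patient_ids, test_patients):
    """Return (train_idx, test_idx) so that no patient appears in both sets."""
    held_out = set(test_patients)
    train_idx = [i for i, pid in enumerate(patient_ids) if pid not in held_out]
    test_idx = [i for i, pid in enumerate(patient_ids) if pid in held_out]
    return train_idx, test_idx


# Hypothetical imbalanced toy set: 8 "normal" frames, 1 "polyp", 1 "ulcer".
y_true = ["normal"] * 8 + ["polyp", "ulcer"]
# A degenerate classifier that always predicts the majority class.
y_pred = ["normal"] * 10

acc = accuracy(y_true, y_pred)           # dominated by the majority class
macro_rec = macro_recall(y_true, y_pred)  # penalizes the two missed classes

# Hypothetical patient IDs, one per frame; patient "p3" is held out for testing.
patient_ids = ["p1", "p1", "p2", "p3", "p3", "p2"]
train_idx, test_idx = patient_level_split(patient_ids, ["p3"])

print(f"accuracy={acc:.3f}, macro recall={macro_rec:.3f}")
print("train frames:", train_idx, "test frames:", test_idx)
```

In this toy case, the always-majority classifier reaches 0.8 accuracy but only about 0.333 macro recall, which is the kind of gap that accuracy-centric reporting can hide. Splitting by patient rather than by frame keeps near-duplicate frames from the same examination out of both sets at once.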
License
Copyright (c) 2026 Asaad Kadhim Abd Tamimi, Zena H.Khalil, Ali Mohsin Aljuboori

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.








