A Comparative Analysis of Transformer Architectures for Automated Lung Cancer Detection in CT Images

Authors

  • Baris Okmen Department of Computer Engineering, Faculty of Engineering, Igdir University, 76000, Igdir, Turkey. https://orcid.org/0009-0004-2204-1927 Author
  • Yiğitcan Cakmak Department of Computer Engineering, Faculty of Engineering, Igdir University, 76000, Igdir, Turkey. https://orcid.org/0009-0008-7227-9182 Author
  • Ishak Pacal Department of Computer Engineering, Faculty of Engineering, Igdir University, 76000, Igdir, Turkey. https://orcid.org/0000-0001-6670-2169 Author

DOI:

https://doi.org/10.59543/jidmis.v3i.17135

Keywords:

Lung Cancer Classification; Computer-Aided Diagnosis (CAD); Deep Learning; Transformer Architectures; Medical Image Analysis.

Abstract

Early-stage detection of lung cancer is widely recognized as a critical determinant of therapeutic efficacy and patient survival. Conventional diagnostic workflows, however, are frequently constrained by their labour-intensive nature and susceptibility to interpretive errors, which positions artificial intelligence (AI) as a transformative technology in medical imaging. This study presents a comparative analysis of four prominent vision transformer architectures (Swin-Base, ViT-Base, DeiT-Base, and BEiT-Base), evaluating their performance in the automated classification of lung cancer from computed tomography (CT) scans. The empirical validation was performed on the open-access IQ-OTH/NCCD dataset, a corpus of 1,097 images distributed across benign (n=120), malignant (n=561), and normal (n=416) classes. Model performance was quantified using the established metrics of accuracy, precision, recall, and F1-score. The Swin-Base model, which relies on a hierarchical design and a shifted-window mechanism, performed best, attaining an accuracy of 98.80% and an F1-score of 97.52%. Its counterparts achieved commendable accuracies, namely ViT-Base (95.18%), DeiT-Base (96.39%), and BEiT-Base (95.78%), but did not match the performance of Swin-Base. Notably, this leading performance was achieved with greater computational efficiency: Swin-Base required fewer GFLOPs than the competing models.
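The four metrics named in the abstract can be illustrated with a minimal Python sketch that derives them from a three-class confusion matrix. The abstract does not state how the per-class scores were averaged, so macro-averaging is an assumption here. The function name `classification_metrics` and the matrix `example_cm` are hypothetical, and the counts are made up for illustration; they are not results from the study.

```python
from typing import Dict, List

# Class order follows the abstract: benign, malignant, normal.
CLASSES = ["benign", "malignant", "normal"]


def classification_metrics(cm: List[List[int]]) -> Dict[str, float]:
    """Compute accuracy and macro-averaged precision, recall, and F1.

    cm[i][j] is the number of samples whose true class is i and whose
    predicted class is j. Macro-averaging is an assumption of this sketch.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    correct = sum(cm[k][k] for k in range(n))

    precisions, recalls, f1s = [], [], []
    for k in range(n):
        tp = cm[k][k]
        predicted_k = sum(cm[i][k] for i in range(n))  # column sum
        actual_k = sum(cm[k])                          # row sum
        p = tp / predicted_k if predicted_k else 0.0
        r = tp / actual_k if actual_k else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)

    return {
        "accuracy": correct / total if total else 0.0,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Hypothetical confusion matrix (illustrative counts only, not from the paper).
example_cm = [
    [8, 1, 1],   # true benign
    [0, 20, 0],  # true malignant
    [1, 0, 9],   # true normal
]

metrics = classification_metrics(example_cm)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Because the benign class is much smaller than the other two (120 of 1,097 images), macro-averaged scores weight it equally with the larger classes, which is one plausible reason an F1-score can sit below the overall accuracy.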

Published

2026-01-15

How to Cite

Baris Okmen, Yiğitcan Cakmak, & Ishak Pacal. (2026). A Comparative Analysis of Transformer Architectures for Automated Lung Cancer Detection in CT Images. Journal of Intelligent Decision Making and Information Science, 3, 528-539. https://doi.org/10.59543/jidmis.v3i.17135

Section

Articles