A Comparative Analysis of Transformer Architectures for Automated Lung Cancer Detection in CT Images

Baris Okmen; Yiğitcan Cakmak; Ishak Pacal

Authors

Baris Okmen, Department of Computer Engineering, Faculty of Engineering, Igdir University, 76000, Igdir, Turkey https://orcid.org/0009-0004-2204-1927
Yiğitcan Cakmak, Department of Computer Engineering, Faculty of Engineering, Igdir University, 76000, Igdir, Turkey https://orcid.org/0009-0008-7227-9182
Ishak Pacal, Department of Computer Engineering, Faculty of Engineering, Igdir University, 76000, Igdir, Turkey https://orcid.org/0000-0001-6670-2169

Keywords:

Lung Cancer Classification; Computer-Aided Diagnosis (CAD); Deep Learning; Transformer Architectures; Medical Image Analysis

Abstract

Early-stage lung cancer detection is widely recognized as a critical determinant of therapeutic efficacy and patient survival. Conventional diagnostic workflows, however, are frequently constrained by their labour-intensive nature and susceptibility to interpretive error, which positions artificial intelligence (AI) as a transformative technology in medical imaging. This research conducts a comparative analysis of four prominent vision transformer (ViT) architectures, namely Swin-Base, ViT-Base, DeiT-Base, and BEiT-Base, evaluating their performance in the automated classification of lung cancer from computed tomography (CT) scans. The empirical validation was performed on the open-access IQ-OTH/NCCD dataset, a corpus of 1,097 images distributed across benign (n=120), malignant (n=561), and normal (n=416) classes. Model performance was quantified using accuracy, precision, recall, and F1-score. The Swin-Base model, which uses a hierarchical design and a shifted-window mechanism, performed best, attaining an accuracy of 98.80% and an F1-score of 97.52%. Its counterparts also achieved strong accuracies, namely ViT-Base (95.18%), DeiT-Base (96.39%), and BEiT-Base (95.78%), but did not match Swin-Base. Notably, this leading performance was achieved with greater computational efficiency, requiring fewer GFLOPs than the competing models.
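As an illustration of the four reported metrics, the following minimal Python sketch computes accuracy and macro-averaged precision, recall, and F1-score from a three-class confusion matrix. The abstract does not state which averaging scheme the authors used, so macro averaging is an assumption here. The function name `macro_metrics` and the example matrix are hypothetical and are not taken from the paper's results.

```python
from typing import Dict, List


def macro_metrics(confusion: List[List[int]]) -> Dict[str, float]:
    """Compute accuracy and macro-averaged precision, recall, and F1.

    confusion[i][j] is the number of samples whose true class is i
    and whose predicted class is j (rows = true, columns = predicted).
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[k][k] for k in range(n))

    precisions, recalls, f1s = [], [], []
    for k in range(n):
        tp = confusion[k][k]
        predicted_k = sum(confusion[i][k] for i in range(n))  # column sum
        actual_k = sum(confusion[k])                          # row sum
        p = tp / predicted_k if predicted_k else 0.0
        r = tp / actual_k if actual_k else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)

    return {
        "accuracy": correct / total if total else 0.0,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Hypothetical confusion matrix (NOT the paper's data).
# Class order assumed for illustration: benign, malignant, normal.
example = [
    [8, 1, 1],
    [0, 20, 0],
    [1, 0, 19],
]
metrics = macro_metrics(example)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Under macro averaging, each class contributes equally regardless of its size, so errors on a small class such as benign (n=120) lower the F1-score more than they lower accuracy. This is one plausible reason, though not one confirmed by the abstract, for an F1-score that sits below the accuracy.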

DOI: https://doi.org/10.59543/jidmis.v3i.17135
