A new model based on multi-axis vision transformer for chondromalacia patella diagnosis in magnetic resonance scans


Demirel S., Demirtaş O., Ordu S. K., Kazcı Ö., Akkaya H. E., YILDIZ O.

Physical and Engineering Sciences in Medicine, 2026 (SCI-Expanded, Scopus)

  • Publication Type: Article / Full Article
  • Publication Date: 2026
  • DOI Number: 10.1007/s13246-026-01707-5
  • Journal Name: Physical and Engineering Sciences in Medicine
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus
  • Keywords: Chondromalacia patella, Convolutional neural networks, Deep learning, Medical image classification, Multi-axis vision transformer, Swin transformer, Vision transformer
  • Gazi University Affiliated: Yes

Abstract

Chondromalacia patella (CMP), a degenerative disease of the patellofemoral joint cartilage, often results in anterior knee discomfort and functional disability. Accurate and timely diagnosis is essential for choosing the best course of therapy and halting disease progression. In this work, we propose a transformer-based deep learning architecture for classifying CMP from magnetic resonance imaging (MRI) data. We assessed three transformer-based designs, Multi-Axis Vision Transformer (MaxViT), Vision Transformer (ViT), and Swin Transformer, alongside three convolutional neural network (CNN) based models: Google Network (GoogLeNet), Residual Network 18 (ResNet18), and Mobile Network v2 (MobileNetV2). Each model was evaluated on its ability to differentiate CMP cases from normal cases. On the testing dataset, MaxViT outperformed all other models, with an accuracy of 0.9817, precision of 0.9821, recall of 0.9817, and F1-score of 0.9818.
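
For readers less familiar with the four reported metrics, the following is a minimal sketch of how accuracy, precision, recall, and F1-score can be computed from a confusion matrix. It assumes support-weighted averaging across the two classes. The abstract does not state which averaging scheme was used, although the reported recall being equal to the accuracy is consistent with it, since support-weighted recall always equals accuracy. The function name `weighted_metrics` and the example counts are hypothetical and are not taken from the paper.

```python
from typing import Dict, List


def weighted_metrics(confusion: List[List[int]]) -> Dict[str, float]:
    """Compute accuracy and support-weighted precision, recall, and F1.

    confusion[i][j] counts samples whose true class is i and whose
    predicted class is j. Weighted averaging is an assumption here.
    """
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(n_classes))

    precision = recall = f1 = 0.0
    for c in range(n_classes):
        tp = confusion[c][c]
        support = sum(confusion[c])  # samples whose true class is c
        predicted = sum(confusion[r][c] for r in range(n_classes))  # samples predicted as c
        p = tp / predicted if predicted else 0.0
        r = tp / support if support else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        weight = support / total  # class share of the dataset
        precision += weight * p
        recall += weight * r
        f1 += weight * f

    return {
        "accuracy": correct / total,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical counts for illustration only; NOT data from the paper.
example = [
    [48, 2],  # true normal: 48 predicted normal, 2 predicted CMP
    [1, 49],  # true CMP: 1 predicted normal, 49 predicted CMP
]
metrics = weighted_metrics(example)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Under macro averaging (an unweighted mean over classes) recall and accuracy would generally differ on an imbalanced test set, so the averaging scheme matters when comparing these figures with other studies.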