Physical and Engineering Sciences in Medicine, 2026 (SCI-Expanded, Scopus)
Chondromalacia patella (CMP), a degenerative disease of the patellofemoral joint cartilage, often results in anterior knee discomfort and functional disability. Accurate and timely diagnosis is essential for choosing the best course of therapy and halting disease progression. In this work, we present a transformer-based deep learning architecture for classifying chondromalacia patella from magnetic resonance imaging (MRI) data. We assessed transformer-based models, including Multi-Axis Vision Transformer (MaxViT), Vision Transformer (ViT), and Swin Transformer, alongside convolutional neural network (CNN) based models, namely GoogLeNet, ResNet18, and MobileNetV2. We evaluated the models’ ability to differentiate between chondromalacia patella cases and normal cases. On the test set, MaxViT outperformed all other models, achieving an accuracy of 0.9817, precision of 0.9821, recall of 0.9817, and F1-score of 0.9818.
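As an illustration of how the four reported metrics relate to one another, the following is a minimal sketch that computes accuracy and support-weighted precision, recall, and F1-score from a confusion matrix. The abstract does not state which averaging scheme was used, so weighted averaging is an assumption here; the function name `weighted_metrics` and the example counts are hypothetical and are not taken from the study.

```python
def weighted_metrics(confusion):
    """Return (accuracy, precision, recall, f1) with support-weighted averaging.

    confusion[i][j] = number of samples of true class i predicted as class j.
    Weighted averaging is an assumption; the abstract does not specify it.
    """
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(n_classes))
    accuracy = correct / total

    precision = recall = f1 = 0.0
    for c in range(n_classes):
        tp = confusion[c][c]
        support = sum(confusion[c])                    # samples whose true class is c
        predicted = sum(row[c] for row in confusion)   # samples predicted as class c
        p = tp / predicted if predicted else 0.0
        r = tp / support if support else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        weight = support / total                       # class share of the test set
        precision += weight * p
        recall += weight * r
        f1 += weight * f
    return accuracy, precision, recall, f1


# Hypothetical counts, NOT from the paper: rows = true class, columns = predicted.
example = [
    [95, 5],   # normal
    [3, 97],   # chondromalacia patella
]
acc, prec, rec, f1 = weighted_metrics(example)
print(f"accuracy={acc:.4f} precision={prec:.4f} recall={rec:.4f} f1={f1:.4f}")
```

Under support-weighted averaging, recall always equals accuracy, because each class's recall is weighted by its share of the test set. The reported recall (0.9817) matching the reported accuracy (0.9817) is consistent with this scheme, although it does not confirm that the authors used it.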