7th International Congress on Human-Computer Interaction, Optimization and Robotic Applications, ICHORA 2025, Ankara, Türkiye, 23-24 May 2025 (Full-Text Paper)
Object detection has become a core component of many computer vision applications, including autonomous driving, image recognition, unmanned surveillance, healthcare, and other industrial uses. This study compares recent members of the YOLO family (YOLOv8, YOLOv9, YOLOv10 and YOLO11 in small, medium and large variants), SSD300-VGG16 and EfficientDet (D0 to D7) object detection models in terms of Frames Per Second (FPS), inference time, precision, recall, F1-score, mean Average Precision (mAP) and Intersection over Union (IoU) on the MS COCO 2017 dataset. Each model is tested with the PyTorch framework on identical hardware and software setups. The results show that the YOLO models, especially YOLOv8s and YOLOv10s, are the best choices for real-time applications. SSD300 performs well in terms of speed but lags far behind its competitors on the accuracy metrics. The EfficientDet models offer a balance between accuracy and speed, making them suitable for applications that need both. This comparative analysis identifies the advantages and disadvantages of the different object detection models, allowing researchers and practitioners to select the most suitable model for application-specific requirements. By considering the trade-off between real-time operation and scenarios requiring high accuracy, the study provides insights to support the development of object detection applications in both academic and industrial fields.
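To make the compared metrics concrete, the following is a minimal sketch of how IoU, precision, recall, F1-score and FPS are commonly computed. It is not the evaluation code used in the study. The function names are hypothetical, boxes are assumed to be in `(x1, y1, x2, y2)` corner format, and FPS is assumed to be the reciprocal of the per-image inference time with a batch size of one.

```python
from typing import Tuple

# Assumed box format: (x1, y1, x2, y2) corners, with x2 >= x1 and y2 >= y1.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over Union of two axis-aligned boxes."""
    # Corners of the overlapping rectangle (may be empty).
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def precision_recall_f1(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Precision, recall and F1 from true/false positive and false negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def fps_from_inference_time(seconds_per_image: float) -> float:
    """Frames per second, assuming one image per forward pass."""
    return 1.0 / seconds_per_image
```

A detection is typically counted as a true positive when its IoU with a ground-truth box of the same class exceeds a chosen threshold. mAP then averages precision over recall levels and classes, and for COCO-style evaluation over several IoU thresholds as well.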