Evaluation of Cutting-Edge Object Detection Architectures on Multi-Object and Single-Object Datasets

dc.contributor.author Parlak, Cevahir
dc.date.accessioned 2026-05-12T15:03:58Z
dc.date.available 2026-05-12T15:03:58Z
dc.date.issued 2026
dc.description.abstract This study evaluates the performance of cutting-edge object detection models, namely YOLO12X, Mask R-CNN, RT-DETR-X, and RF-DETR-Large, on the Open Images (multi-object) and LaSOT (single-object) datasets. Current state-of-the-art applications rely on CNN-based and Transformer-based object detection models. CNN-based models can use one-pass (YOLO family) or two-pass (R-CNN family) implementations. One-pass models can be faster but tend to be less accurate than two-pass models. Transformer-based models can use Detection Transformers or Vision Transformers; they are gaining popularity, and their performance surpasses that of CNN-based models. This study compares YOLO12X and Mask R-CNN from the CNN-based family with the transformer-based RT-DETR-X and RF-DETR-Large architectures in terms of accuracy and time on the Open Images and LaSOT datasets. All models are the largest available variants and are pretrained on the COCO dataset. Transformer-based models incorporate special types of self-attention and offer significant improvements in both accuracy and speed. The experimental results demonstrate that attention- and transformer-based models perform better than traditional CNN-based object detectors, and that YOLO12X is the fastest method by a wide margin. On the LaSOT dataset, RT-DETR-X achieves 0.8804 IoU, 0.7047 F1-score, 0.6597 mAP@0.5, and 28.64 fps, whereas YOLO12X achieves 0.8572 IoU, 0.6657 F1-score, 0.5357 mAP@0.5, and 49.78 fps. en_US
dc.identifier.doi 10.34248/bsengineering.1736319
dc.identifier.issn 2619-8991
dc.identifier.uri https://hdl.handle.net/123456789/1508
dc.identifier.uri https://search.trdizin.gov.tr/en/yayin/detay/1382587
dc.language.iso en
dc.relation.ispartof Black Sea Journal of Engineering and Science
dc.rights info:eu-repo/semantics/openAccess
dc.subject Computer Science, Software Engineering
dc.subject Computer Science, Artificial Intelligence
dc.title Evaluation of Cutting-Edge Object Detection Architectures on Multi-Object and Single-Object Datasets en_US
dc.type Article
dspace.entity.type Publication
gdc.author.id 0000-0002-5500-7379
gdc.author.institutional Parlak, Cevahir
gdc.description.department
gdc.description.departmenttemp [(PhD), Cevahir Parlak] Fenerbahçe University, Faculty of Engineering and Natural Sciences, Department of Computer Engineering, İstanbul, Türkiye
gdc.description.endpage 294
gdc.description.issue 1
gdc.description.publicationcategory Article - International Refereed Journal - Institutional Faculty Member
gdc.description.startpage 287
gdc.description.volume 9
gdc.identifier.trdizinid 1382587
gdc.index.type TR-Dizin
gdc.virtual.author Parlak, Cevahir
relation.isAuthorOfPublication d57697e4-e3d0-4c57-bb5c-4b92ceb92d1a
relation.isAuthorOfPublication.latestForDiscovery d57697e4-e3d0-4c57-bb5c-4b92ceb92d1a
