Evaluation of Cutting-Edge Object Detection Architectures on Multi-Object and Single-Object Datasets
| dc.contributor.author | Parlak, Cevahir | |
| dc.date.accessioned | 2026-05-12T15:03:58Z | |
| dc.date.available | 2026-05-12T15:03:58Z | |
| dc.date.issued | 2026 | |
| dc.description.abstract | This study evaluates four state-of-the-art object detection models, YOLO12X, Mask R-CNN, RT-DETR-X, and RF-DETR-Large, on the Open Images (multi-object) and LaSOT (single-object) datasets. Current object detection applications rely mainly on CNN-based and Transformer-based models. CNN-based models use either a one-pass (YOLO family) or a two-pass (R-CNN family) design. One-pass detectors can be faster but tend to be less accurate than two-pass detectors. Transformer-based models can use Detection Transformers or Vision Transformers; they are gaining popularity, and their performance surpasses that of CNN-based models. This study compares YOLO12X and Mask R-CNN from the CNN-based family with the Transformer-based RT-DETR-X and RF-DETR-Large in terms of accuracy and runtime on the Open Images and LaSOT datasets. All models are the largest available variants and are pretrained on the COCO dataset. The Transformer-based models incorporate specialized forms of self-attention and offer significant improvements in both accuracy and speed. The experimental results demonstrate that attention- and Transformer-based models outperform traditional CNN-based object detectors, and that YOLO12X is the fastest method by a wide margin. On the LaSOT dataset, RT-DETR-X achieves 0.8804 IoU, 0.7047 F1-score, 0.6597 mAP@0.5, and 28.64 fps, whereas YOLO12X achieves 0.8572 IoU, 0.6657 F1-score, 0.5357 mAP@0.5, and 49.78 fps. | en_US |
| dc.identifier.doi | 10.34248/bsengineering.1736319 | |
| dc.identifier.issn | 2619-8991 | |
| dc.identifier.uri | https://hdl.handle.net/123456789/1508 | |
| dc.identifier.uri | https://search.trdizin.gov.tr/en/yayin/detay/1382587 | |
| dc.language.iso | en | |
| dc.relation.ispartof | Black Sea Journal of Engineering and Science | |
| dc.rights | info:eu-repo/semantics/openAccess | |
| dc.subject | Computer Science, Software Engineering | |
| dc.subject | Computer Science, Artificial Intelligence | |
| dc.title | Evaluation of Cutting-Edge Object Detection Architectures on Multi-Object and Single-Object Datasets | en_US |
| dc.type | Article | |
| dspace.entity.type | Publication | |
| gdc.author.id | 0000-0002-5500-7379 | |
| gdc.author.institutional | Parlak, Cevahir | |
| gdc.description.department | ||
| gdc.description.departmenttemp | [(phd), Cevahir Parlak] Fenerbahçe University, Faculty of Engineering and Natural Sciences, Department of Computer Engineering, Istanbul, Türkiye | |
| gdc.description.endpage | 294 | |
| gdc.description.issue | 1 | |
| gdc.description.publicationcategory | Article - International Refereed Journal - Institutional Faculty Member | |
| gdc.description.startpage | 287 | |
| gdc.description.volume | 9 | |
| gdc.identifier.trdizinid | 1382587 | |
| gdc.index.type | TR-Dizin | |
| gdc.virtual.author | Parlak, Cevahir | |
| relation.isAuthorOfPublication | d57697e4-e3d0-4c57-bb5c-4b92ceb92d1a | |
| relation.isAuthorOfPublication.latestForDiscovery | d57697e4-e3d0-4c57-bb5c-4b92ceb92d1a |
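
The abstract reports IoU and F1-score figures for each detector. As a rough illustration of how such metrics are commonly computed for axis-aligned bounding boxes, the sketch below may help. It is not the article's evaluation code: the function names `iou` and `f1_at_iou`, the `(x_min, y_min, x_max, y_max)` box layout, the 0.5 threshold default, and the greedy one-to-one matching are all assumptions made for this example.

```python
from typing import List, Tuple

# Assumed box layout: (x_min, y_min, x_max, y_max)
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over Union of two axis-aligned boxes."""
    # Corners of the overlapping rectangle (may be empty)
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def f1_at_iou(preds: List[Box], truths: List[Box], thr: float = 0.5) -> float:
    """F1-score where a prediction counts as a true positive if it
    overlaps a not-yet-matched ground-truth box with IoU >= thr.
    Greedy matching is an illustrative simplification."""
    unmatched = list(truths)
    tp = 0
    for p in preds:
        # Pick the remaining ground-truth box with the highest overlap
        best = max(unmatched, key=lambda t: iou(p, t), default=None)
        if best is not None and iou(p, best) >= thr:
            tp += 1
            unmatched.remove(best)
    fp = len(preds) - tp   # predictions with no matching ground truth
    fn = len(unmatched)    # ground-truth boxes never matched
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0
```

For a single-object dataset such as LaSOT, `truths` would hold one box per frame, so each frame yields at most one true positive. Frames-per-second figures are a separate timing measurement and are not covered by this sketch.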
