A Data-Efficient Machine Learning Approach for Breast Ultrasound Lesion Classification Integrating Image-Derived Features and Sonographic Descriptors
| dc.contributor.author | Karacor, Adil Gursel | |
| dc.contributor.author | Sahin, Sevim | |
| dc.date.accessioned | 2026-05-12T14:56:18Z | |
| dc.date.available | 2026-05-12T14:56:18Z | |
| dc.date.issued | 2026 | |
| dc.description.abstract | Background/Objectives: Breast ultrasound is widely used for the diagnostic evaluation of breast lesions; however, reliable lesion characterization remains challenging due to substantial image heterogeneity and the limited size of most clinically available datasets. These constraints reduce the generalizability of end-to-end deep learning approaches in routine practice. The objective of this study was to evaluate a data-efficient diagnostic framework that integrates image-derived features with clinical sonographic descriptors to improve breast ultrasound lesion classification in small cohorts. Methods: Ultrasound images from the publicly available BrEaST-Lesions dataset were processed using a pretrained convolutional neural network to extract compact image feature representations from full images, lesion masks, and cropped tumor regions. These features were combined with manually recorded sonographic descriptors after label encoding to form a unified tabular dataset. Gradient-boosted tree models were trained using descriptor-only and fused feature sets with fivefold stratified cross-validation and evaluated on an independent external hold-out test set. Results: Using sonographic descriptors alone, the best-performing model (LightGBM) achieved an external validation accuracy of 0.88, with an area under the receiver operating characteristic curve (AUC) of 0.95. Incorporation of image-derived features improved diagnostic performance on the external test set, yielding an accuracy of 0.88, an AUC of 0.96, and a sensitivity of 1.00 for malignant lesion detection. The fused framework demonstrated more stable generalization than descriptor-only models, particularly for malignant cases. Conclusions: Combining image-derived features with clinical sonographic descriptors within a tabular learning framework provides a robust and data-efficient approach for breast ultrasound-based lesion classification. This strategy supports diagnostic decision-making in small ultrasound datasets and represents a clinically realistic alternative when large-scale deep learning models are impractical. | |
| dc.identifier.doi | 10.3390/diagnostics16050664 | |
| dc.identifier.issn | 2075-4418 | |
| dc.identifier.scopus | 2-s2.0-105032560379 | |
| dc.identifier.uri | https://hdl.handle.net/123456789/1483 | |
| dc.identifier.uri | https://doi.org/10.3390/diagnostics16050664 | |
| dc.language.iso | en | |
| dc.publisher | MDPI | |
| dc.relation.ispartof | Diagnostics | |
| dc.rights | info:eu-repo/semantics/openAccess | |
| dc.subject | Breast Ultrasound | |
| dc.subject | Feature Fusion | |
| dc.subject | Sonographic Descriptors | |
| dc.subject | Lesion Classification | |
| dc.subject | Small Datasets | |
| dc.subject | Diagnostic Decision Support | |
| dc.title | A Data-Efficient Machine Learning Approach for Breast Ultrasound Lesion Classification Integrating Image-Derived Features and Sonographic Descriptors | en_US |
| dc.type | Article | |
| dspace.entity.type | Publication | |
| gdc.author.scopusid | 16417519900 | |
| gdc.author.scopusid | 60173985900 | |
| gdc.description.department | ||
| gdc.description.departmenttemp | [Karacor, Adil Gursel] Fenerbahce Univ, Fac Engn & Nat Sci, Dept Ind Engn, TR-34758 Istanbul, Turkiye; [Sahin, Sevim] Fenerbahce Univ, Fac Engn & Nat Sci, Dept Elect & Elect Engn, TR-34758 Istanbul, Turkiye | |
| gdc.description.issue | 5 | |
| gdc.description.publicationcategory | Article - International Refereed Journal - Institutional Faculty Member | |
| gdc.description.volume | 16 | |
| gdc.description.woscitationindex | Science Citation Index Expanded | |
| gdc.identifier.pmid | 41827939 | |
| gdc.identifier.wos | WOS:001713918500001 | |
| gdc.index.type | PubMed | |
| gdc.index.type | Scopus | |
| gdc.index.type | WoS | |
| gdc.virtual.author | Şahin, Sevim | |
| gdc.virtual.author | Karaçor, Adil Gürsel | |
| relation.isAuthorOfPublication | 137b9c99-3632-425b-a3e8-9dace6596145 | |
| relation.isAuthorOfPublication | 1dca77e3-d77c-4f1c-b940-947f84ac7f05 | |
| relation.isAuthorOfPublication.latestForDiscovery | 137b9c99-3632-425b-a3e8-9dace6596145 |
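
The abstract describes label-encoding the sonographic descriptors, joining them with image-derived features into one tabular dataset, and evaluating with fivefold stratified cross-validation. The following is a minimal standard-library sketch of those two data-preparation steps only. It is not the authors' code: the descriptor names (`shape`, `margin`), the two-value feature vectors, and all toy values are hypothetical placeholders, and the CNN feature extraction and gradient-boosted model training are omitted.

```python
import random
from collections import defaultdict


def label_encode(values):
    """Map each distinct category to an integer code (alphabetical order)."""
    mapping = {v: i for i, v in enumerate(sorted(set(values)))}
    return [mapping[v] for v in values], mapping


def fuse(image_features, descriptor_columns):
    """Append label-encoded descriptors to each lesion's image feature vector."""
    encoded = [label_encode(col)[0] for col in descriptor_columns.values()]
    return [list(feats) + [col[i] for col in encoded]
            for i, feats in enumerate(image_features)]


def stratified_folds(labels, k=5, seed=0):
    """Assign sample indices to k folds so each fold keeps similar class ratios."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal each class's samples round-robin across the folds.
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    return folds


# Hypothetical toy data: 10 lesions, 2 image features each, 2 descriptors.
image_features = [[0.1 * i, 1.0 - 0.1 * i] for i in range(10)]
descriptors = {
    "shape": ["oval", "irregular"] * 5,                # assumed descriptor name
    "margin": ["circumscribed", "not circumscribed"] * 5,  # assumed descriptor name
}
labels = [0, 1] * 5  # 0 = benign, 1 = malignant (toy labels)

rows = fuse(image_features, descriptors)
folds = stratified_folds(labels, k=5)
```

Each entry of `folds` can serve as the validation indices for one round, with the remaining indices used for training. Integer label encoding suits tree-based models such as those named in the abstract, because trees split on thresholds and do not assume the codes are evenly spaced.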
