Purpose: The prevalence of type 2 diabetes mellitus (T2DM) has been increasing steadily over the years. We aim to predict the occurrence of T2DM within 5 years from mammography images using two different methods and to compare their performance.

Approach: We examined 312 samples, comprising 110 positive cases (developed T2DM after 5 years) and 202 negative cases (did not develop T2DM). The first method is a radiomics-based approach that combines radiomics features with machine learning (ML) algorithms. The entire breast region was chosen as the region of interest, and a binary breast image was created from it. We then extracted 668 radiomics features and analyzed them using various ML algorithms. In the second method, a convolutional neural network (CNN) with a modified ResNet architecture and various kernel sizes was applied to the raw mammography images for the prediction task. Nested, stratified five-fold cross-validation was performed for both methods to compute accuracy, sensitivity, specificity, and area under the receiver operating characteristic curve (AUROC). Hyperparameter tuning was also performed to improve the models' performance and reliability.

Results: In the radiomics approach, the light gradient boosting model achieved 68.9% accuracy, 30.7% sensitivity, 89.5% specificity, and an AUROC of 0.63. The CNN method achieved an AUROC of 0.58 over 20 epochs.

Conclusion: Radiomics outperformed the CNN by 0.05 in terms of AUROC. This may be due to the more straightforward interpretability and clinical relevance of predefined radiomics features compared with the complex, abstract features learned by CNNs.
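The nested, stratified five-fold cross-validation mentioned above can be illustrated with a minimal sketch. This is not the authors' code: the function names `stratified_folds` and `nested_cv_splits`, the round-robin fold assignment, and the random seeds are all assumptions made for illustration. The sketch only generates index splits; model fitting and hyperparameter search would happen inside the loops.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k=5, seed=0):
    """Split sample indices into k folds that preserve the class ratio.

    Hypothetical helper for illustration, not taken from the paper.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal indices round-robin so every fold gets a near-equal share
        # of each class.
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    return folds


def nested_cv_splits(labels, k=5, seed=0):
    """Yield (inner_train, inner_val, outer_test) index lists.

    The outer loop holds out one fold for testing; the inner loop
    re-splits the remaining samples for hyperparameter tuning, so the
    outer test fold never influences model selection.
    """
    outer = stratified_folds(labels, k, seed)
    for i, test_idx in enumerate(outer):
        train_idx = [j for f, fold in enumerate(outer) if f != i for j in fold]
        train_labels = [labels[j] for j in train_idx]
        inner = stratified_folds(train_labels, k, seed + 1 + i)
        for val_fold in inner:
            val_positions = set(val_fold)
            inner_val = [train_idx[p] for p in val_fold]
            inner_train = [
                train_idx[p]
                for p in range(len(train_idx))
                if p not in val_positions
            ]
            yield inner_train, inner_val, test_idx


# Class counts from the abstract: 110 positive and 202 negative cases.
labels = [1] * 110 + [0] * 202
outer_folds = stratified_folds(labels)
splits = list(nested_cv_splits(labels))
```

With the class counts reported in the abstract, each outer fold in this sketch contains 22 positive cases and 40 or 41 negative cases, and the generator yields 25 inner/outer combinations.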
The incidence of type 2 diabetes mellitus (T2DM) has been increasing over the years. T2DM is a common lifestyle-related disease, and predicting its occurrence five years in advance could help patients change their lifestyle early and thereby prevent T2DM. We investigate the feasibility of using radiomics features from screening mammography images to predict the occurrence of T2DM, which could support prevention of the disease. This study used 110 positive samples (developed T2DM after five years) and 202 negative samples (did not develop T2DM after five years). The whole breast region was selected as the region of interest (ROI) from which radiomics features were extracted. A mask was created from every image using a modified threshold value (obtained by Otsu's binarization method) to produce a binary image of the breast. A total of 668 radiomics features were then extracted and analyzed using machine learning algorithms implemented in Python: Random Forest (RF), Gradient Boosting Classifier (GBC), and Light Gradient Boosting Model (LGBM), chosen for their strong classification and prediction performance. Five-fold cross-validation was carried out; accuracy, sensitivity, specificity, and AUC were calculated for each algorithm, and hyperparameter tuning was performed to improve model performance. RF and GBC produced good accuracy (approximately 70% or higher) but low sensitivity. LGBM's accuracy was almost 70%, and it had the highest sensitivity (43.9%) with a decent specificity (74.4%).
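As a rough illustration of the masking step, the sketch below computes an Otsu threshold from an 8-bit intensity histogram and uses it to binarize an image. It is only a sketch under stated assumptions: the abstract does not say how the threshold was modified, so the `offset` parameter is hypothetical, as are the function names `otsu_threshold` and `breast_mask` and the tiny example image.

```python
def otsu_threshold(pixels, levels=256):
    """Return the threshold t that maximizes between-class variance.

    Pixels with intensity <= t are treated as background and pixels
    with intensity > t as foreground.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w_bg = 0      # number of background pixels so far
    sum_bg = 0    # sum of background intensities so far
    for t in range(levels):
        w_bg += hist[t]
        sum_bg += t * hist[t]
        if w_bg == 0:
            continue  # no background pixels yet
        w_fg = total - w_bg
        if w_fg == 0:
            break     # no foreground pixels left
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        var_between = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def breast_mask(image, offset=0):
    """Binarize a 2D list of 8-bit intensities into a 0/1 mask.

    `offset` is a hypothetical stand-in for the unspecified
    modification of the Otsu threshold.
    """
    flat = [p for row in image for p in row]
    t = otsu_threshold(flat) + offset
    return [[1 if p > t else 0 for p in row] for row in image]


# Toy image: dark background on the left, brighter tissue on the right.
image = [
    [0, 0, 120, 130],
    [0, 5, 140, 150],
]
threshold = otsu_threshold([p for row in image for p in row])
mask = breast_mask(image)
```

In practice the resulting mask would be passed, together with the original image, to a radiomics feature extractor; that step is omitted here because the abstract does not name the extraction tool.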