KEYWORDS: Electrocardiography, Education and training, Performance modeling, Data modeling, Feature extraction, Reflection, Heart, Signal processing, Signal detection, Diagnostics
Background: Cardiovascular disease is one of the leading causes of death worldwide. Electrocardiogram (ECG) signals play a crucial role in diagnosing various heart conditions, including arrhythmias and myocardial infarctions. A reliable and efficient method for quickly identifying abnormal heartbeats is needed to aid early diagnosis and treatment. Methods: The study used the MIT-BIH arrhythmia database, which includes 48 groups of two-lead ECG signals. High-dimensional features were extracted from the ECG signals using the tsfresh package in Python. Feature selection was performed using variance analysis, Spearman correlation, mRMR, and LASSO methods. Logistic regression models were then constructed to predict abnormal heartbeats. Results: The final model included 10 key features and demonstrated high diagnostic performance. The AUC was 0.958 in the training set and 0.947 in the test set, with specificities of 0.930 and 0.851 and sensitivities of 0.881 and 0.892, respectively. The model outperformed traditional methods and deep learning models such as CNN and VGG in identifying abnormal beats. Conclusions: This study presents a robust and effective nomogram model for distinguishing abnormal ECG signals, highlighting its significant clinical application potential. Future research will focus on expanding sample sizes and incorporating additional methods for feature calculation to further enhance model generalizability.
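For readers unfamiliar with the reported metrics, the following is a minimal sketch of how AUC, sensitivity, and specificity can be computed from predicted probabilities. It is not the study's code: the function names, the 0.5 threshold, and the toy labels and scores are all assumptions made for illustration.

```python
def auc_score(labels, scores):
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(labels, scores, threshold=0.5):
    """Sensitivity (true positive rate) and specificity (true negative rate)."""
    pairs = list(zip(labels, scores))
    tp = sum(1 for y, s in pairs if y == 1 and s >= threshold)
    fn = sum(1 for y, s in pairs if y == 1 and s < threshold)
    tn = sum(1 for y, s in pairs if y == 0 and s < threshold)
    fp = sum(1 for y, s in pairs if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


# Toy data, made up for illustration (1 = abnormal beat, 0 = normal beat).
labels = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.7, 0.2]

auc = auc_score(labels, scores)
sens, spec = sensitivity_specificity(labels, scores)
print(f"AUC={auc:.4f} sensitivity={sens:.2f} specificity={spec:.2f}")
```

The pairwise loop is quadratic in the number of samples, which is fine for a small illustration; a rank-based formulation would be preferable on a dataset the size of MIT-BIH.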
KEYWORDS: Data modeling, Performance modeling, Machine learning, Education and training, Glucose, Deep learning, Feature selection, Diseases and disorders, Random forests, Plasma
In medicine, disease prevention is more important than treatment. Diabetes is a harmful disease that affects a large number of patients, so predicting it with learning models is an essential part of future diabetes prevention and treatment. In this study, compound feature selection was used to screen out the eight features with the most predictive ability, and diabetes was predicted by six machine learning models and two deep learning models. The XGBoost classifier had the best prediction performance, with an accuracy of 99.1%, a precision of 99.0%, a recall of 99.2%, an F1 score of 99.1%, and an AUC of 0.991; the CatBoost and LightGBM models performed next best. This is consistent with our previous findings. The study further confirms the value and potential of the XGBoost model for diabetes prediction, identifies feature selection methods superior to those of previous studies, and improves predictive performance while reducing model complexity when dealing with more complex data.
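The accuracy, precision, recall, and F1 figures quoted above all derive from the four cells of a binary confusion matrix. A minimal sketch of those definitions follows; the function name and the example counts are hypothetical and are not taken from the study.

```python
def classification_metrics(tp, fp, fn, tn):
    """Standard binary classification metrics from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical counts chosen only to make the arithmetic easy to follow.
metrics = classification_metrics(tp=40, fp=10, fn=10, tn=40)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Because F1 is a harmonic mean, it always lies between precision and recall, which is consistent with the reported 99.1% sitting between 99.0% and 99.2%.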
Purpose: The present study aimed to construct classification models for pulmonary adenocarcinoma using computed tomography (CT)‐based radiomics features and the random forest method. Methods: A total of 289 patients with 295 lung adenocarcinomas were included in this study. A total of 1066 CT images were extracted. The final data set was randomized into a training set and a validation set at a ratio of 80%:20%. A total of 1082 features were extracted from the lesion in each CT image, which was delineated by a semi‐automatic segmentation method. Nine optimal radiomic features, selected by root mean squared error (RMSE) through cross‐validation, and 14 radiographic characteristic features were used to construct a random forest classification model. The results were compared with those of the support vector machine (SVM), logistic regression, and C5.0 algorithms. Results: The area under the curve (AUC) scores of training feature set, radiographic characteristics feature set, and the optimal radiomic feature set for testing dataset were 0.974, 0.483, and 0.835, respectively, and the corresponding AUC values for validation dataset were 0.964, 0.915, and 0.841, respectively. Conclusion: The developed random forest‐based classification models using radiomics features and radiographic features of CT showed a relatively acceptable performance in lung adenocarcinoma and could assist clinical rapid diagnosis and triage.
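The 80%:20% randomization described in the Methods can be sketched as a seeded shuffle followed by a cut. This is an illustration only: the abstract does not say whether the split was made per image or per patient, and the function name, seed, and integer image IDs below are assumptions.

```python
import random


def train_validation_split(items, train_fraction=0.8, seed=0):
    """Shuffle a copy of items reproducibly, then cut at train_fraction."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


# Hypothetical stand-ins for the 1066 CT images mentioned in the abstract.
image_ids = list(range(1066))
train_ids, val_ids = train_validation_split(image_ids)
print(len(train_ids), len(val_ids))
```

When several images come from the same patient, splitting by patient rather than by image is a common way to keep near-duplicate data out of the validation set; whether that was done here is not stated.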