Data shift, also known as dataset shift, is a prevalent concern in machine learning. It occurs when the distribution of the data used to train a model differs from the distribution of the data the model encounters in a real-world, operational environment (i.e., the test set). The issue is even more significant in medical imaging because of the multitude of factors that can contribute to data shift. It is crucial for medical machine learning systems to identify and address these issues. In this paper, we present an automated pipeline designed to identify and alleviate certain types of data shift in medical imaging datasets. We intentionally introduce data shift into our dataset so that we can assess and address it within our workflow. More specifically, we employ Principal Component Analysis (PCA) and Maximum Mean Discrepancy (MMD) to detect data shift between the training and test datasets. We use image processing techniques, including data augmentation and image registration, both individually and in combination, to mitigate data shift and assess their impact. In our experiments, we use a head CT image dataset of 537 patients with severe traumatic brain injury (sTBI) for patient outcome prediction. Results show that our proposed method is effective in detecting data shift and significantly improves model performance.
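The detection step described above (PCA for dimensionality reduction, followed by an MMD comparison of training and test data) can be illustrated with a minimal sketch. This is not the authors' pipeline. Several choices here are assumptions made for illustration only: the function names (`pca_fit`, `mmd2`, `detect_shift`), the RBF kernel with a median-heuristic bandwidth, the biased MMD estimator, the permutation test, and the synthetic feature vectors that stand in for image data.

```python
import numpy as np

def pca_fit(X, n_components):
    """Fit PCA on X (rows are samples); return the mean and top components."""
    mean = X.mean(axis=0)
    # Rows of Vt are the principal directions of the centered data.
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:n_components]

def sq_dists(A, B):
    """Pairwise squared Euclidean distances between rows of A and rows of B."""
    sq = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.maximum(sq, 0.0)  # guard against tiny negative round-off

def median_gamma(Z):
    """Median heuristic for the RBF bandwidth (an assumed, common default)."""
    sq = sq_dists(Z, Z)
    return 1.0 / np.median(sq[np.triu_indices_from(sq, k=1)])

def mmd2(X, Y, gamma):
    """Biased estimate of squared MMD with kernel exp(-gamma * ||x - y||^2)."""
    k_xx = np.exp(-gamma * sq_dists(X, X)).mean()
    k_yy = np.exp(-gamma * sq_dists(Y, Y)).mean()
    k_xy = np.exp(-gamma * sq_dists(X, Y)).mean()
    return k_xx + k_yy - 2.0 * k_xy

def detect_shift(train, test, n_components=3, n_perm=100, seed=0):
    """Project both sets onto PCA axes fitted on train, then run a
    permutation test on the MMD statistic. Returns (mmd2, p_value)."""
    mean, comps = pca_fit(train, n_components)
    X = (train - mean) @ comps.T
    Y = (test - mean) @ comps.T
    Z = np.vstack([X, Y])
    gamma = median_gamma(Z)
    observed = mmd2(X, Y, gamma)
    rng = np.random.default_rng(seed)
    n, exceed = len(X), 0
    for _ in range(n_perm):
        idx = rng.permutation(len(Z))
        if mmd2(Z[idx[:n]], Z[idx[n:]], gamma) >= observed:
            exceed += 1
    return observed, (exceed + 1) / (n_perm + 1)

# Synthetic stand-in for image features: 3 latent factors mapped to 64 dims.
rng = np.random.default_rng(0)
W = rng.normal(size=(3, 64))

def make_features(n, latent_shift):
    latent = rng.normal(size=(n, 3)) + latent_shift
    return latent @ W + 0.1 * rng.normal(size=(n, 64))

train = make_features(100, 0.0)
test_same = make_features(100, 0.0)     # drawn from the training distribution
test_shifted = make_features(100, 1.5)  # artificially shifted, as in the paper's setup

print("no shift:", detect_shift(train, test_same))
print("shifted: ", detect_shift(train, test_shifted))
```

A small p-value suggests that the two samples come from different distributions. With real images, the input rows would typically be flattened pixels or learned feature vectors rather than synthetic data.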
KEYWORDS: Breast density, Image classification, Education and training, Mammography, Deep learning, Surgery, Medical imaging, Machine learning, Statistical modeling, Breast
Breast Imaging Reporting and Data System (BI-RADS) breast density categories generally reflect the amount of dense/fibroglandular tissue in the breast. Studies have consistently shown that breasts with higher density carry a higher risk of developing breast cancer than breasts with lower density. In this paper, we propose a novel end-to-end method, Medical Knowledge-guided Deep Learning (MKDL), for breast mammogram density classification. The principle behind MKDL is that many breast image density classification tasks are partly or largely based on certain pre-known image features, such as image contrast and brightness. These pre-known features can be computationally represented and then leveraged as prior knowledge to facilitate more effective model learning and thus boost classification performance. We designed specific knowledge-based transformations for breast density classification and showed that our model outperformed several state-of-the-art models.
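The abstract does not specify the knowledge-based transformations, so the following is only a hypothetical sketch of how pre-known features such as brightness and contrast might be represented computationally and passed to a model as extra input channels. The function names (`normalize`, `adjust_brightness`, `adjust_contrast`, `knowledge_channels`) and the parameter values `delta` and `factor` are assumptions for illustration, not the MKDL design.

```python
import numpy as np

def normalize(image):
    """Rescale an image to the [0, 1] range."""
    image = image.astype(np.float64)
    lo, hi = image.min(), image.max()
    if hi == lo:
        return np.zeros_like(image)
    return (image - lo) / (hi - lo)

def adjust_brightness(image, delta):
    """Shift all intensities by delta, clipped to [0, 1]."""
    return np.clip(image + delta, 0.0, 1.0)

def adjust_contrast(image, factor):
    """Scale intensities about the image mean, clipped to [0, 1]."""
    mean = image.mean()
    return np.clip(mean + factor * (image - mean), 0.0, 1.0)

def knowledge_channels(image, delta=0.2, factor=1.5):
    """Stack the normalized image with brightness- and contrast-transformed
    copies, giving a (3, H, W) array a classifier could take as input.
    delta and factor are illustrative values, not taken from the paper."""
    base = normalize(image)
    return np.stack(
        [base, adjust_brightness(base, delta), adjust_contrast(base, factor)],
        axis=0,
    )

# Random 12-bit array standing in for a mammogram.
rng = np.random.default_rng(0)
mammogram = rng.integers(0, 4096, size=(64, 64))
channels = knowledge_channels(mammogram)
print(channels.shape)
```

Stacking transformed copies as channels is one simple way to expose such features to a network. The paper's actual mechanism may differ.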