Many artificial intelligence algorithms are currently deployed in medicine to support decision making and treatment planning. As new data become available, these algorithms can be updated with the goal of increasing performance and improving patient outcomes. One method for updating algorithms is to train incrementally on newly available data in multiple steps, following sequential training protocols. For segmentation algorithms, changing the class labels of background pixels is known as semantic distribution shift, which can decrease algorithm knowledge retention. In this work, we explore the effects of semantic distribution shift on the knowledge retention of sequentially trained tumor segmentation algorithms by systematically altering the reference standards to simulate annotations by different annotators. Algorithms were trained in two sequential steps: the first step used an unmodified reference standard, and the second step used training data with a modified reference standard. The modified reference standards, which simulate annotator over- and under-segmentation, were created by systematically dilating or eroding the reference standard with structural elements of different sizes. A baseline algorithm was trained with unmodified data in both steps. Two types of variability were explored. The first is homogeneous variability, in which the modified reference standards are changed in a consistent manner; for example, all reference standard segmentations were dilated or eroded with the same structural element. The second is heterogeneous variability, in which the modified training data contain a mixture of dilated and eroded reference standards. Algorithm performance was evaluated using the Dice coefficient, the 95% Hausdorff distance, and the Volume Distance Coefficient, an adaptation of the Volume Similarity Coefficient. For the gadolinium-enhancing tumor subregion, algorithms trained with homogeneous variability had Dice values ranging from 0.38±0.01 to 0.76±0.02, while algorithms trained with heterogeneous variability had Dice values ranging from 0.76±0.02 to 0.79±0.01, much closer to the baseline algorithm’s value of 0.81±0.03. The results show that semantic distribution shift due to homogeneous variability decreases algorithm knowledge retention more than shift due to heterogeneous variability. In practice, this suggests that the potential decrease in knowledge retention resulting from the addition of new data can be minimized by ensuring that the new data come from a diverse set of sources.
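As a minimal sketch of the reference-standard modification and the Dice evaluation described above, the snippet below dilates or erodes a binary tumor mask with a cubic structuring element and scores the result against the original. The cubic element, the specific kernel size, and the scipy-based implementation are illustrative assumptions, not the authors' actual pipeline.

```python
import numpy as np
from scipy import ndimage


def modify_reference(mask: np.ndarray, kernel_size: int, mode: str) -> np.ndarray:
    """Dilate or erode a binary segmentation mask to simulate annotator variability.

    A cubic structuring element of side `kernel_size` is an assumption here;
    the paper only states that structural elements of different sizes were used.
    """
    structure = np.ones((kernel_size,) * mask.ndim, dtype=bool)
    if mode == "dilate":   # simulates annotator over-segmentation
        return ndimage.binary_dilation(mask, structure=structure)
    if mode == "erode":    # simulates annotator under-segmentation
        return ndimage.binary_erosion(mask, structure=structure)
    raise ValueError(f"unknown mode: {mode}")


def dice(a: np.ndarray, b: np.ndarray) -> float:
    """Dice coefficient between two binary masks: 2|A∩B| / (|A| + |B|)."""
    a, b = a.astype(bool), b.astype(bool)
    denom = a.sum() + b.sum()
    return 2.0 * np.logical_and(a, b).sum() / denom if denom else 1.0


# Example on a synthetic spherical "tumor" mask (radius 10 voxels in a 64^3 volume).
grid = np.indices((64, 64, 64))
mask = ((grid - 32) ** 2).sum(axis=0) < 10 ** 2

over = modify_reference(mask, kernel_size=3, mode="dilate")
under = modify_reference(mask, kernel_size=3, mode="erode")
print(f"Dice(original, dilated): {dice(mask, over):.3f}")
print(f"Dice(original, eroded):  {dice(mask, under):.3f}")
```

Applying the same operation and kernel size to every case, as above, corresponds to the homogeneous-variability condition; mixing dilated and eroded reference standards across cases would produce the heterogeneous condition.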