The detection of anatomical structures in medical imaging data plays a crucial role as a preprocessing step for various downstream tasks. It poses a significant challenge, however, due to the highly variable appearance and intensity values of such structures. In addition, annotated medical imaging datasets are scarce, owing to the high cost of annotation and the specialized knowledge it requires. These limitations motivate the development of automated and accurate few-shot object detection approaches. While general-purpose deep learning models are available for detecting objects in natural images, their applicability to medical imaging data remains uncertain and needs to be validated. To address this, we carry out an unbiased evaluation of state-of-the-art few-shot object detection methods for detecting head and neck anatomy in CT images. In particular, we choose Query Adaptive Few-Shot Object Detection (QA-FewDet), Meta Faster R-CNN, and Few-Shot Object Detection with Fully Cross-Transformer (FCT), and apply each model to detect various anatomical structures using novel datasets containing only a few images, ranging from 1- to 30-shot, during the fine-tuning stage. Our experimental results, obtained under the same settings for all methods, demonstrate that few-shot object detection methods can accurately detect anatomical structures, showing promising potential for integration into the clinical workflow.
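To make the few-shot fine-tuning protocol concrete, the sketch below shows a minimal K-shot fine-tuning loop for a generic two-stage detector. This is not the paper's QA-FewDet, Meta Faster R-CNN, or FCT implementation; torchvision's standard Faster R-CNN stands in, and the class count, shot count, and layer-freezing choices are illustrative assumptions.

```python
# Minimal sketch: K-shot fine-tuning of a pre-trained two-stage detector.
# Assumptions: torchvision Faster R-CNN as a stand-in for the paper's methods;
# NUM_CLASSES and K_SHOT are hypothetical values, not taken from the paper.
import torch
from torchvision.models.detection import fasterrcnn_resnet50_fpn
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor

NUM_CLASSES = 1 + 3   # background + e.g. three head-and-neck structures (assumed)
K_SHOT = 10           # annotated CT images per novel class (assumed, 1 to 30 in the paper)

# Start from a detector pre-trained on natural images (base classes).
model = fasterrcnn_resnet50_fpn(weights="DEFAULT")

# Replace the box predictor so the head matches the novel anatomy classes.
in_features = model.roi_heads.box_predictor.cls_score.in_features
model.roi_heads.box_predictor = FastRCNNPredictor(in_features, NUM_CLASSES)

# Freeze the backbone; with only K annotated images, fine-tune the heads only.
for p in model.backbone.parameters():
    p.requires_grad = False

optimizer = torch.optim.SGD(
    [p for p in model.parameters() if p.requires_grad],
    lr=1e-3, momentum=0.9, weight_decay=1e-4,
)

def finetune_step(images, targets):
    """One fine-tuning step on a K-shot batch.

    images:  list of [3, H, W] float tensors (CT slices replicated to 3 channels)
    targets: list of dicts with 'boxes' [N, 4] and 'labels' [N] tensors
    """
    model.train()
    loss_dict = model(images, targets)  # detector returns its loss terms in train mode
    loss = sum(loss_dict.values())
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Freezing the backbone is one common way to avoid overfitting when only a handful of annotated images are available; the methods evaluated in the paper use more specialized query-adaptation and meta-learning components on top of this basic recipe.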
Multimodal magnetic resonance (MR) imaging plays a crucial role in disease diagnosis because it provides complementary information by relating multiple modalities acquired from the same subject. Acquiring all MR modalities, however, can be expensive, and certain modalities may be missing from a scanning session depending on the study protocol. A typical solution is to synthesize the missing modalities from the acquired images, for example with generative adversarial networks (GANs). Yet GANs built on convolutional neural networks (CNNs) tend to lack both global context and an effective mechanism for conditioning on the desired modality. To address this, we propose a transformer-based modality infuser for synthesizing multimodal brain MR images. Our method extracts modality-agnostic features with an encoder and then transforms them into modality-specific features using the modality infuser. Furthermore, the modality infuser captures long-range relationships among all brain structures, leading to more realistic generated images. We carried out experiments on the BraTS 2018 dataset, translating among four MR modalities, and our results demonstrate the superiority of the proposed method in terms of synthesis quality. In addition, we conducted experiments on a brain tumor segmentation task and compared different conditioning methods.
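As a rough illustration of the idea, the sketch below implements a transformer module that turns modality-agnostic feature maps into modality-specific ones by prepending a learned modality-embedding token and applying self-attention over all spatial positions. The conditioning-by-token design, layer counts, and dimensions are assumptions for illustration, not the paper's exact architecture.

```python
# Hypothetical sketch of a transformer-based modality infuser. The learned
# modality token and all hyper-parameters are illustrative assumptions.
import torch
import torch.nn as nn

class ModalityInfuser(nn.Module):
    """Transforms modality-agnostic feature maps into modality-specific ones."""

    def __init__(self, dim=256, n_modalities=4, depth=4, heads=8):
        super().__init__()
        # One learned embedding per target modality (e.g. T1, T1ce, T2, FLAIR).
        self.modality_embed = nn.Embedding(n_modalities, dim)
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=heads, batch_first=True)
        # Self-attention over all spatial positions lets every location attend
        # to every other, capturing long-range relationships among structures.
        self.transformer = nn.TransformerEncoder(layer, num_layers=depth)

    def forward(self, feats, target_modality):
        # feats: [B, C, H, W] modality-agnostic encoder features
        # target_modality: [B] long tensor of target-modality indices
        b, c, h, w = feats.shape
        tokens = feats.flatten(2).transpose(1, 2)           # [B, H*W, C]
        cond = self.modality_embed(target_modality)         # [B, C]
        tokens = torch.cat([cond.unsqueeze(1), tokens], 1)  # prepend condition token
        tokens = self.transformer(tokens)
        out = tokens[:, 1:].transpose(1, 2).reshape(b, c, h, w)
        return out  # modality-specific features, ready for a decoder

# Usage example on dummy encoder features, conditioning on modality index 3.
infuser = ModalityInfuser()
feats = torch.randn(2, 256, 8, 8)
out = infuser(feats, torch.tensor([3, 3]))
print(out.shape)  # torch.Size([2, 256, 8, 8])
```

Injecting the condition as a token is only one of several conditioning strategies (channel concatenation and feature modulation are common alternatives), which is the kind of comparison the abstract's final experiment refers to.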