Precise and robust multimodal image registration helps to spatially align medical findings from multiple sources for accurate medical diagnosis. We present MedRegNet, a lightweight descriptor module for the registration of multimodal retinal images. We use generative adversarial networks (GANs) to learn generator networks that synthesize structurally consistent multimodal image pairs during training. From these image pairs, MedRegNet learns to predict stable point descriptors via a relaxed ranking loss. As no dataset-specific incentives are given, MedRegNet learns stable representations on its own. Most importantly, both mono- and multimodal training (including training of the GANs) is entirely unsupervised, so no costly expert annotation is required. Through evaluation on the publicly available Fundus Image Registration Dataset (FIRE), as well as on our own multimodal dataset of 340 retinal fundus, autofluorescence, and fluorescein angiography image pairs from 24 patients, we show that MedRegNet improves the robustness and registration performance of classical detector/descriptor algorithms such as SIFT, ORB, KAZE, and AKAZE. Despite using the same interest points, MedRegNet successfully matches more points than any baseline, even in the monomodal case. In the multimodal case, where the classical baselines fail due to the large visual differences between modalities, MedRegNet's registration performance remains consistent with the monomodal case. Furthermore, MedRegNet adapts to point detectors it was not trained on at little or no cost in performance. MedRegNet can easily be integrated into any feature-based registration pipeline and, owing to the absence of dataset-specific incentives, has the potential to be applied to fields beyond retinal imaging.
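To illustrate the plug-in nature of learned descriptors in a feature-based registration pipeline, the sketch below shows a generic nearest-neighbor matcher with Lowe's ratio test. This is not the MedRegNet method itself: the descriptor arrays are hypothetical placeholders standing in for descriptors computed at detected interest points (e.g., from SIFT or ORB keypoints), and the matching step is the standard technique such pipelines use before homography estimation.

```python
import numpy as np

def match_descriptors(desc_a, desc_b, ratio=0.8):
    """Match two descriptor sets with a nearest-neighbor ratio test.

    desc_a, desc_b: (N, D) and (M, D) float arrays, M >= 2.
    Returns a list of (index_in_a, index_in_b) pairs that pass the test.
    """
    matches = []
    for i, d in enumerate(desc_a):
        # L2 distance from descriptor d to every descriptor in desc_b
        dists = np.linalg.norm(desc_b - d, axis=1)
        order = np.argsort(dists)
        best, second = order[0], order[1]
        # Lowe's ratio test: accept only if the best match is clearly
        # closer than the second-best (filters ambiguous matches)
        if dists[best] < ratio * dists[second]:
            matches.append((i, int(best)))
    return matches
```

The accepted correspondences would then feed a robust geometric fit (e.g., RANSAC-based homography estimation) to produce the final registration; swapping in a learned descriptor only changes how the `desc_a`/`desc_b` arrays are computed, not this downstream machinery.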