Presentation + Paper
17 October 2023
Remote sensing scene classification with masked image modeling
Liya Wang, Alex Tien
Abstract
Remote sensing scene classification has been extensively studied for its critical roles in geological survey, oil exploration, traffic management, earthquake prediction, wildfire monitoring, and intelligence monitoring. In the past, Machine Learning (ML) methods for this task mainly used backbones pretrained with supervised learning (SL). Masked Image Modeling (MIM), a self-supervised learning (SSL) technique, has been shown to be a better way of learning visual feature representations, and so presents a new opportunity to improve ML performance on the scene classification task. This research explores the potential of MIM-pretrained backbones on four well-known classification datasets: Merced, AID, NWPU-RESISC45, and Optimal-31. Compared to the published benchmarks, we show that MIM-pretrained Vision Transformer (ViT) backbones outperform other alternatives (by up to 18% in top-1 accuracy) and that the MIM technique learns better feature representations than its supervised learning counterparts (by up to 5% in top-1 accuracy). Moreover, we show that general-purpose MIM-pretrained ViTs achieve performance competitive with the specially designed yet complicated Transformer for Remote Sensing (TRS) framework. Our experimental results also provide a performance baseline for future studies.
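The comparisons above are reported in top-1 accuracy, that is, the fraction of images whose highest-scoring predicted class matches the true scene label. As a minimal sketch of that metric only (not the authors' evaluation code), the snippet below computes it in plain Python. The function name `top1_accuracy` and the toy scores and labels are hypothetical, chosen purely for illustration.

```python
def top1_accuracy(scores, labels):
    """Return the fraction of samples whose highest-scoring class equals the label.

    scores: list of per-class score lists, one inner list per image.
    labels: list of integer class indices, one per image.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not labels:
        raise ValueError("need at least one sample")

    correct = 0
    for row, label in zip(scores, labels):
        # Index of the largest score is the top-1 prediction.
        predicted = max(range(len(row)), key=row.__getitem__)
        if predicted == label:
            correct += 1
    return correct / len(labels)


# Hypothetical classifier scores for 4 images over 3 scene classes.
scores = [
    [0.1, 0.7, 0.2],  # predicted class 1
    [0.8, 0.1, 0.1],  # predicted class 0
    [0.3, 0.3, 0.4],  # predicted class 2
    [0.2, 0.5, 0.3],  # predicted class 1
]
labels = [1, 0, 1, 1]  # hypothetical ground-truth labels

accuracy = top1_accuracy(scores, labels)
print(f"top-1 accuracy: {accuracy:.2f}")
```

In this toy case three of the four predictions match their labels, so the reported value is 0.75.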
(2023) Published by SPIE. Downloading of the abstract is permitted for personal use only.
Liya Wang and Alex Tien "Remote sensing scene classification with masked image modeling", Proc. SPIE 12732, Microwave Remote Sensing: Data Processing and Applications II, 1273202 (17 October 2023); https://doi.org/10.1117/12.2680898
KEYWORDS
Machine learning
Education and training
RGB color model
Remote sensing
Scene classification
Transformers
Image classification