Classifying small objects remains challenging for current deep learning classification models such as convolutional neural networks (CNNs) and vision transformers (ViTs). We believe that these algorithms are not designed specifically for small targets, so their feature extraction abilities for small targets are insufficient. To improve the small-object classification capabilities of CNN-based and ViT-based models, two multidomain feature fusion (MDFF) frameworks, MDFF-ConvMixer and MDFF-ViT, are proposed to increase the amount of feature information derived from images. Compared with the base models, the added design consists of a frequency domain feature extraction process and an MDFF process. In the frequency domain feature extraction part, the input image is first transformed into the frequency domain through a discrete cosine transform (DCT), and a three-dimensional matrix containing the frequency domain information is then obtained via channel splicing and reshaping. In the MDFF part, MDFF-ConvMixer concatenates the spatial and frequency domain features along the channel dimension, whereas MDFF-ViT uses a cross-attention mechanism to fuse them. On small-target classification tasks, both frameworks improve on the underlying classification algorithm. On the DOTA dataset and on the CIFAR10 dataset with two downsampling operations, MDFF-ConvMixer raises the accuracy of ConvMixer from 87.82% and 62.14% to 90.14% and 66.00%, respectively, and MDFF-ViT raises the accuracy of the ViT from 79.22% and 36.2% to 88.15% and 59.23%, respectively.
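The frequency domain branch and the channel-wise fusion described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the 8x8 block size, the layout in which each block's DCT coefficients become channels, and the function names `dct_matrix`, `blockwise_dct_features`, and `fuse_by_channel` are all assumptions made for illustration. The cross-attention fusion used by MDFF-ViT is not shown.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II basis matrix of size n x n."""
    k = np.arange(n)[:, None]   # frequency index
    i = np.arange(n)[None, :]   # sample index
    m = np.cos(np.pi * (2 * i + 1) * k / (2 * n)) * np.sqrt(2.0 / n)
    m[0, :] = np.sqrt(1.0 / n)  # DC row uses a different scale factor
    return m

def blockwise_dct_features(image, block=8):
    """Map an (H, W, C) image to an (H/block, W/block, C*block*block) array.

    Assumed layout (hypothetical): every block x block tile is transformed
    with a 2-D DCT and its coefficients are stacked along the channel axis.
    """
    h, w, c = image.shape
    assert h % block == 0 and w % block == 0, "H and W must be multiples of block"
    d = dct_matrix(block)
    # Split into tiles: (H/b, b, W/b, b, C)
    x = image.reshape(h // block, block, w // block, block, c)
    # 2-D DCT per tile, i.e. D @ X @ D.T over the two in-tile axes
    x = np.einsum('ui,aibjc,vj->aubvc', d, x, d)
    # Reorder to (H/b, W/b, C, u, v) and flatten the last three axes
    x = x.transpose(0, 2, 4, 1, 3)
    return x.reshape(h // block, w // block, c * block * block)

def fuse_by_channel(spatial, freq):
    """Concatenate spatial and frequency feature maps along the channel axis."""
    return np.concatenate([spatial, freq], axis=-1)
```

In this sketch the spatial feature map passed to `fuse_by_channel` must share the frequency map's height and width, which is why the block size would need to match the patch size of the spatial branch.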
This paper describes a new algorithm, RXDMTD, based on RX anomaly detection, for detecting moving weak and small targets in multispectral image sequences. The proposed algorithm effectively suppresses background clutter while enhancing moving weak and small targets in multispectral and out-of-time image sequences. The complex background intensity changes significantly between the two multispectral images, which makes it difficult both to suppress the background and to extract the target. In this paper, the image sequence is first arranged and combined; the RX algorithm is then used to enhance the target, and the target's movement is used to suppress the background. Experimental results show that the proposed algorithm achieves good detection results.
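For readers unfamiliar with the RX detector that the algorithm builds on, the following is a minimal sketch of the standard global RX score (the Mahalanobis distance of each pixel spectrum from the scene mean). It is not the RXDMTD algorithm itself. The helper `stack_frames`, which combines frames by concatenating their bands, is one hypothetical reading of "arranged and combined", and the regularization term `eps` is an added assumption for numerical stability.

```python
import numpy as np

def rx_scores(cube, eps=1e-6):
    """Global RX anomaly score for an (H, W, B) multispectral cube.

    Returns an (H, W) map of squared Mahalanobis distances between each
    pixel spectrum and the mean spectrum of the whole scene.
    """
    h, w, b = cube.shape
    x = cube.reshape(-1, b).astype(np.float64)
    mu = x.mean(axis=0)
    xc = x - mu
    cov = xc.T @ xc / (x.shape[0] - 1)
    cov += eps * np.eye(b)            # assumed regularization, not from the source
    inv = np.linalg.inv(cov)
    scores = np.einsum('ni,ij,nj->n', xc, inv, xc)
    return scores.reshape(h, w)

def stack_frames(frames):
    """Hypothetical combination step: concatenate frames along the band axis."""
    return np.concatenate(frames, axis=-1)
```

Under this stacking assumption, a target that moves between frames is anomalous in a different subset of bands at each of its positions, while a static background contributes correlated values across the stacked bands.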