Open Access Paper
12 November 2024
Self-attention model based on multiscale convolution
Zhen Lin, Pengfei Xiao, Rong Guan, Zhining You, Yunming Pu
Proceedings Volume 13395, International Conference on Optics, Electronics, and Communication Engineering (OECE 2024); 133951K (2024); https://doi.org/10.1117/12.3048367
Event: International Conference on Optics, Electronics, and Communication Engineering, 2024, Wuhan, China
Abstract
In the field of deep learning, convolutional neural networks and the Transformer architecture have both achieved considerable success. This paper combines the advantages of the two architectures to encode multi-scale spatial features in an image. At each stage, convolutional operations with kernels of different sizes are applied, allowing the model to obtain tokens with rich and diverse features. A self-attention mechanism is then applied to further improve the feature representation, and residual connections are introduced. Experimental results demonstrate that the proposed model performs robustly on the CIFAR-100 and CIFAR-10 datasets, achieving performance comparable to traditional CNN models with fewer parameters.
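To make the described pipeline concrete, the following is a minimal PyTorch sketch of a multi-scale convolutional tokenizer followed by a residual self-attention block, in the spirit of the abstract. The module names, kernel sizes (3/5/7), embedding dimension, and head count are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch (assumptions: PyTorch; kernel sizes 3/5/7, dim=96, 4 heads
# are illustrative choices, not taken from the paper).
import torch
import torch.nn as nn


class MultiScaleConvTokenizer(nn.Module):
    """Applies parallel convolutions with different kernel sizes and
    concatenates the results into multi-scale tokens."""

    def __init__(self, in_ch=3, dim=96, kernel_sizes=(3, 5, 7), stride=4):
        super().__init__()
        branch_dim = dim // len(kernel_sizes)
        self.branches = nn.ModuleList([
            nn.Conv2d(in_ch, branch_dim, kernel_size=k,
                      stride=stride, padding=k // 2)
            for k in kernel_sizes
        ])

    def forward(self, x):                                   # x: (B, C, H, W)
        feats = torch.cat([b(x) for b in self.branches], 1)  # (B, dim, H', W')
        return feats.flatten(2).transpose(1, 2)              # (B, N, dim) tokens


class ResidualAttentionBlock(nn.Module):
    """Self-attention over the multi-scale tokens with a residual connection."""

    def __init__(self, dim=96, heads=4):
        super().__init__()
        self.norm = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, tokens):
        h = self.norm(tokens)
        out, _ = self.attn(h, h, h)
        return tokens + out                                  # residual connection


if __name__ == "__main__":
    x = torch.randn(2, 3, 32, 32)                            # CIFAR-sized input
    tokens = MultiScaleConvTokenizer()(x)
    y = ResidualAttentionBlock()(tokens)
    print(y.shape)                                           # torch.Size([2, 64, 96])
```

In this sketch, each convolutional branch sees the image at a different receptive field, so concatenating the branch outputs yields tokens that mix coarse and fine spatial context before the attention layer refines them.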
(2024) Published by SPIE. Downloading of the abstract is permitted for personal use only.
Zhen Lin, Pengfei Xiao, Rong Guan, Zhining You, and Yunming Pu, "Self-attention model based on multiscale convolution", Proc. SPIE 13395, International Conference on Optics, Electronics, and Communication Engineering (OECE 2024), 133951K (12 November 2024); https://doi.org/10.1117/12.3048367
KEYWORDS
Transformers, Convolution, RGB color model, Data modeling, Feature extraction, Performance modeling, Convolutional neural networks