A multimodal resource balancing approach based on cross-attention and bidrectional associative memory networks

Xingang Wang; Honglu Cheng; Guangzheng Liu; Xiaoyu Liu

doi:10.1117/12.3010753

20 October 2023

Proceedings Volume 12814, Third International Conference on Green Communication, Network, and Internet of Things (CNIoT 2023); 128142E (2023) https://doi.org/10.1117/12.3010753
Event: Third International Conference on Green Communication, Network, and Internet of Things (CNIoT 2023), 2023, Chongqing, China

Abstract

When a disaster occurs, mining social media tweets that contain disaster information and analyzing how the disaster evolves can help the relevant authorities make rapid emergency decisions and conduct public opinion analysis. The modalities in a social media post jointly describe the same event and are therefore semantically related, but heterogeneous modalities are often imbalanced in both structure and semantics. Most current research is devoted to either formal complementation or semantic balancing, whereas multi-granularity resource balancing helps to obtain consistent information across heterogeneous modalities and to remove redundancy. To address these problems, this paper proposes an end-to-end Multimodal Resource Balancing (MMRB) model, which combines a cross-attention module with a bidirectional associative memory network module while avoiding the loss of important unimodal information. The cross-attention module captures the deep semantic correlations between modalities and weights the modal features to achieve semantic-level resource balancing. The associative memory module uses the features of each modality to generate a complete feature representation of the other modality on a pre-trained transfer learning model, complementing multimodal information at the feature level and mitigating the modal imbalance and missing-modality problems of unstructured social media data. Finally, a joint feature codec weights the fused features so that the model attends appropriately to the contribution of each modality's features to the joint features and to their intrinsic associations. Experiments on CrisisMMD, a public image-text social media dataset for disaster detection, show that the model achieves higher accuracy than current multimodal resource balancing models, which verifies its effectiveness.
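
The abstract does not give the equations of the cross-attention module, so the following is only a minimal sketch of generic scaled dot-product cross-attention between two modalities, not the MMRB implementation. It assumes that text and image features already share one dimension and omits learned query, key, and value projections. The function names and the toy feature values are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def cross_attention(queries, keys, values):
    """Scaled dot-product attention with queries from one modality and
    keys/values from the other. Returns (attended_features, weights)."""
    scale = math.sqrt(len(keys[0]))
    attended, weights = [], []
    for q in queries:
        # One weight per key: how strongly this query attends to it.
        w = softmax([dot(q, k) / scale for k in keys])
        # Weighted sum of the value vectors, computed per dimension.
        out = [sum(wi * v[d] for wi, v in zip(w, values))
               for d in range(len(values[0]))]
        attended.append(out)
        weights.append(w)
    return attended, weights


# Hypothetical toy features: 2 text tokens and 3 image regions, dimension 4.
text_feats = [[1.0, 0.0, 1.0, 0.0],
              [0.0, 1.0, 0.0, 1.0]]
image_feats = [[1.0, 0.0, 1.0, 0.0],
               [0.0, 1.0, 0.0, 1.0],
               [0.5, 0.5, 0.5, 0.5]]

# Text attends to image regions, and image regions attend to text tokens.
text_attended, text_to_image = cross_attention(text_feats, image_feats, image_feats)
image_attended, image_to_text = cross_attention(image_feats, text_feats, text_feats)
```

In this sketch each row of `text_to_image` sums to 1 and indicates how much each image region contributes to the corresponding text token. A full model would typically learn the projections and combine the attended features with the original unimodal features.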

(2023) Published by SPIE. Downloading of the abstract is permitted for personal use only.

Citation

Xingang Wang, Honglu Cheng, Guangzheng Liu, and Xiaoyu Liu "A multimodal resource balancing approach based on cross-attention and bidrectional associative memory networks", Proc. SPIE 12814, Third International Conference on Green Communication, Network, and Internet of Things (CNIoT 2023), 128142E (20 October 2023); https://doi.org/10.1117/12.3010753

PROCEEDINGS
9 PAGES

KEYWORDS

Data modeling

Content addressable memory

Education and training

Feature extraction

Matrices

Web 2.0 technologies

Performance modeling
