Paper
10 October 2023 A text-driven image style transfer model based on CLIP and SCBAM
Haodong Wu, Guohua Geng, Yanting Zhao, Xiaolei Wang, Qihang Li
Proceedings Volume 12799, Third International Conference on Advanced Algorithms and Signal Image Processing (AASIP 2023); 127994H (2023) https://doi.org/10.1117/12.3006664
Event: 3rd International Conference on Advanced Algorithms and Signal Image Processing (AASIP 2023), 2023, Kuala Lumpur, Malaysia
Abstract
To address the low image quality and inadequate detail features of current zero-shot style transfer algorithms, we propose a new text-driven image style transfer model. The model first uses the CLIP (Contrastive Language-Image Pre-Training) model to convey the semantic information of the text conditions. In addition, we design a lightweight network that quickly expresses texture information according to the text conditions. The cosine distance between the transferred image and the text conditions, measured with the CLIP model, is minimized to obtain the final style-transferred image. Furthermore, we introduce a dual attention mechanism, an identity consistency loss, and content and style feature losses to make the transferred image more vivid and realistic. Extensive experimental results demonstrate that our approach enables the transfer of multiple styles based on text conditions, achieving broader, more realistic, and faster style transfer than existing methods.
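The cosine-distance objective mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the exact loss formulation, so the function name `cosine_distance` and the three-dimensional vectors below are hypothetical stand-ins for the embeddings that CLIP's image and text encoders would produce.

```python
import math
from typing import Sequence


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Return 1 - cos(a, b) for two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("embeddings must be non-zero")
    return 1.0 - dot / (norm_a * norm_b)


# Hypothetical stand-ins for CLIP embeddings (assumption for illustration);
# a real pipeline would obtain these from the CLIP image and text encoders.
text_embedding = [1.0, 0.0, 0.0]
aligned_image_embedding = [2.0, 0.0, 0.0]    # same direction as the text
unrelated_image_embedding = [0.0, 3.0, 0.0]  # orthogonal to the text

# A lower distance means the image embedding points closer to the text's.
loss_aligned = cosine_distance(aligned_image_embedding, text_embedding)
loss_unrelated = cosine_distance(unrelated_image_embedding, text_embedding)
```

Because the distance depends only on direction and not on magnitude, minimizing it pushes the transferred image's embedding toward the direction of the text embedding regardless of either vector's norm.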
(2023) Published by SPIE. Downloading of the abstract is permitted for personal use only.
Haodong Wu, Guohua Geng, Yanting Zhao, Xiaolei Wang, and Qihang Li "A text-driven image style transfer model based on CLIP and SCBAM", Proc. SPIE 12799, Third International Conference on Advanced Algorithms and Signal Image Processing (AASIP 2023), 127994H (10 October 2023); https://doi.org/10.1117/12.3006664
KEYWORDS
Image quality
Semantics
Feature extraction
Data modeling
Education and training
Network architectures
Visualization
