A text-driven image style transfer model based on CLIP and SCBAM

Haodong Wu; Guohua Geng; Yanting Zhao; Xiaolei Wang; Qihang Li

doi:10.1117/12.3006664

10 October 2023

Proceedings Volume 12799, Third International Conference on Advanced Algorithms and Signal Image Processing (AASIP 2023); 127994H (2023) https://doi.org/10.1117/12.3006664
Event: 3rd International Conference on Advanced Algorithms and Signal Image Processing (AASIP 2023), 2023, Kuala Lumpur, Malaysia

Abstract

To address the low image quality and inadequate detail features of current zero-shot style transfer algorithms, we propose a new text-driven image style transfer model. The model first uses the CLIP (Contrastive Language-Image Pre-Training) model to convey the semantic information of the text conditions. In addition, we design a lightweight network that quickly expresses texture information according to the text conditions and minimizes the cosine distance between the transferred image and the text conditions, as measured by the CLIP model, to obtain the style-transferred image. Furthermore, we introduce a dual attention mechanism, an identity consistency loss, and content and style feature losses to make the translated image more vivid and realistic. Extensive experimental results demonstrate that our approach enables the transfer of multiple styles based on text conditions, achieving broader, more realistic, and faster style transfer than existing methods.
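The cosine-distance objective mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the three-dimensional vectors below are hypothetical stand-ins for CLIP image and text embeddings (real CLIP embeddings have hundreds of dimensions and come from the pretrained encoders), and the function name `cosine_distance` is chosen here for illustration only.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cos(u, v) for two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


# Hypothetical stand-ins for CLIP embeddings (assumption, not real encoder output).
text_embedding = [1.0, 0.0, 0.0]          # embedding of the style text condition
image_embedding_before = [0.0, 1.0, 0.0]  # stylized image early in training
image_embedding_after = [2.0, 0.0, 0.0]   # stylized image aligned with the text

# The distance is the quantity a training loop would drive toward zero.
loss_before = cosine_distance(image_embedding_before, text_embedding)
loss_after = cosine_distance(image_embedding_after, text_embedding)

print(loss_before, loss_after)
```

Cosine distance ignores vector magnitude, so an image embedding that points in the same direction as the text embedding gives zero distance regardless of its length. In an actual training setup this term would presumably be combined with the identity consistency, content, and style losses named in the abstract, whose exact forms are not given here.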

(2023) Published by SPIE. Downloading of the abstract is permitted for personal use only.

Citation

Haodong Wu, Guohua Geng, Yanting Zhao, Xiaolei Wang, and Qihang Li "A text-driven image style transfer model based on CLIP and SCBAM", Proc. SPIE 12799, Third International Conference on Advanced Algorithms and Signal Image Processing (AASIP 2023), 127994H (10 October 2023); https://doi.org/10.1117/12.3006664

PROCEEDINGS
7 PAGES

KEYWORDS

Image quality

Semantics

Feature extraction

Data modeling

Education and training

Network architectures

Visualization
