Episode-based training strategy for zero-shot semantic segmentation

Bo Xiong; Jianming Liu; Zhuoxun Jing

doi:10.1117/12.2680262

27 June 2023

Proceedings Volume 12705, Fourteenth International Conference on Graphics and Image Processing (ICGIP 2022); 127051O (2023) https://doi.org/10.1117/12.2680262
Event: Fourteenth International Conference on Graphics and Image Processing (ICGIP 2022), 2022, Nanjing, China

Abstract

We introduce episode-based training into zero-shot semantic segmentation (ZS3) for the first time. Specifically, the model is trained on a set of simulated ZS3 tasks: over multiple episodes it learns to predict simulated unseen classes, and this ability generalizes well to true unseen classes. Building on this training framework, we propose a visual-semantic alignment network, VSAN, as our base model; it consists mainly of a feature extractor and a semantic projection network. The base model constrains the visual-semantic distribution of each class through a distance measure between visual features and semantic prototypes. In the inference phase, the trained projection network generates semantic prototypes for all classes and predicts the segmentation of the entire image by measuring the distance between visual features and semantic prototypes. We refer to VSAN trained with the episode-based strategy as EB-VSAN. Our model is discriminative: unlike generative methods, it does not require retraining the classifier when a new class emerges, and it avoids multi-stage training. Extensive ZS3 experiments on benchmark datasets show that EB-VSAN outperforms current state-of-the-art methods; specifically, its hIoU exceeds that of state-of-the-art methods by an average of 1.7% on PASCAL VOC and 2.5% on PASCAL Context.
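The two ideas the abstract describes, simulating ZS3 tasks per episode and labelling pixels by their nearest semantic prototype, can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not specify how simulated unseen classes are chosen, what form the projection network takes, or which distance is used, so the random class split, the linear projection, and the Euclidean distance below are all assumptions, and every function name is hypothetical.

```python
import math
import random


def sample_episode(seen_classes, num_simulated_unseen, rng):
    """Split the seen classes into simulated-seen and simulated-unseen sets.

    Assumption: simulated unseen classes are drawn uniformly at random.
    """
    simulated_unseen = set(rng.sample(sorted(seen_classes), num_simulated_unseen))
    simulated_seen = set(seen_classes) - simulated_unseen
    return simulated_seen, simulated_unseen


def project(word_embedding, weights):
    """Map a class word embedding to a semantic prototype in visual feature space.

    Assumption: a single linear layer stands in for the projection network.
    """
    return [sum(w * x for w, x in zip(row, word_embedding)) for row in weights]


def predict_pixel(feature, prototypes):
    """Return the class whose prototype is closest to the pixel feature.

    Assumption: Euclidean distance is used as the distance measure.
    """
    return min(prototypes, key=lambda name: math.dist(feature, prototypes[name]))


# Toy usage with made-up 2-D embeddings and an identity projection.
rng = random.Random(0)
simulated_seen, simulated_unseen = sample_episode(["cat", "dog", "car", "bus"], 1, rng)

weights = [[1.0, 0.0], [0.0, 1.0]]
embeddings = {"cat": [1.0, 0.0], "dog": [0.0, 1.0]}
prototypes = {name: project(vec, weights) for name, vec in embeddings.items()}

print(predict_pixel([0.9, 0.2], prototypes))  # prints "cat"
```

Because prediction depends only on distances to prototypes, adding a new class in this sketch means projecting one more word embedding; no classifier is retrained, which mirrors the property the abstract claims for the discriminative design.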

Citation

Bo Xiong, Jianming Liu, and Zhuoxun Jing "Episode-based training strategy for zero-shot semantic segmentation", Proc. SPIE 12705, Fourteenth International Conference on Graphics and Image Processing (ICGIP 2022), 127051O (27 June 2023); https://doi.org/10.1117/12.2680262

PROCEEDINGS
12 PAGES

KEYWORDS

Semantics

Education and training

Visualization

Prototyping

Visual process modeling

Image segmentation

Feature extraction
