Paper
18 November 2024
Explainable text representation method with a learnable and explicit semantic space
Bianfang Chai, Jiaxin Liu, Xiaopeng Zhao
Proceedings Volume 13403, International Conference on Algorithms, High Performance Computing, and Artificial Intelligence (AHPCAI 2024); 134031G (2024) https://doi.org/10.1117/12.3051680
Event: International Conference on Algorithms, High Performance Computing, and Artificial Intelligence, 2024, Zhengzhou, China
Abstract
Word embedding is a widely used method for representing words in contemporary natural language processing, in which words are encoded as low-dimensional dense vectors. However, the feature captured by each dimension of such a vector is difficult to interpret. Most existing interpretable word embedding methods improve interpretability through orthogonal or sparse transformations, yet after the transformation the semantics of each dimension must still be determined from a knowledge base or assigned manually. The embedded topic model (ETM) can automatically acquire an interpretable semantic space for words, but this topic space tends to be ambiguous and redundant. To address this issue, this paper proposes an Explainable Text Representation method with a Learnable and Explicit Semantic Space (ETRLESS), which autonomously learns an orthogonal explicit semantic space in which both words and documents are represented as interpretable vectors. ETRLESS obtains document embeddings through a BiLSTM model, initializes topic embeddings from a pre-trained ETM, and imposes an orthogonality constraint on the topic embeddings to obtain a more interpretable topic semantic space. Taking the document reconstruction loss, the document-topic distribution loss, and the orthogonality loss on topic embeddings as optimization objectives, the model uses backpropagation to learn an interpretable, orthogonal, explicit topic semantic space and word representations within that space. The results demonstrate that the embeddings generated by the ETRLESS model have clear semantic information in each dimension while maintaining performance on downstream tasks.
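The abstract does not include the authors' code; the following is a minimal sketch, assuming a PyTorch implementation, of how the stated components (BiLSTM document encoder, topic embeddings, document reconstruction, and an orthogonality penalty on topic embeddings) might be combined. All class names, dimensions, and the loss weighting are illustrative assumptions rather than the published ETRLESS implementation, and the document-topic distribution loss against the pre-trained ETM is omitted for brevity.

```python
# Hypothetical sketch of the loss structure described in the abstract (not the authors' code).
import torch
import torch.nn as nn
import torch.nn.functional as F

class ETRLESSSketch(nn.Module):
    def __init__(self, vocab_size, embed_dim, hidden_dim, num_topics):
        super().__init__()
        self.word_emb = nn.Embedding(vocab_size, embed_dim)
        # BiLSTM encoder producing the document embedding, as stated in the abstract.
        self.encoder = nn.LSTM(embed_dim, hidden_dim, batch_first=True, bidirectional=True)
        # Topic embeddings; the paper initializes these from a pre-trained ETM
        # (random here purely for illustration).
        self.topic_emb = nn.Parameter(torch.randn(num_topics, 2 * hidden_dim))
        # Decoder reconstructing the document (bag-of-words) from its topic distribution.
        self.decoder = nn.Linear(num_topics, vocab_size)

    def forward(self, token_ids):
        h, _ = self.encoder(self.word_emb(token_ids))          # (B, L, 2H)
        doc_emb = h.mean(dim=1)                                 # document embedding
        # Document-topic distribution: similarity of the document to each topic.
        theta = F.softmax(doc_emb @ self.topic_emb.t(), dim=-1)
        recon_logits = self.decoder(theta)                      # reconstruction over the vocabulary
        return theta, recon_logits

    def orthogonality_loss(self):
        # Penalize deviation of T T^T from the identity, pushing the topic
        # embeddings toward an orthogonal (less redundant) semantic space.
        t = F.normalize(self.topic_emb, dim=-1)
        gram = t @ t.t()
        eye = torch.eye(gram.size(0), device=gram.device)
        return ((gram - eye) ** 2).sum()

# Illustrative usage with random data; the 0.1 weight on the orthogonality term is an assumption.
model = ETRLESSSketch(vocab_size=5000, embed_dim=128, hidden_dim=64, num_topics=20)
tokens = torch.randint(0, 5000, (8, 40))                        # batch of token-id sequences
bow = torch.zeros(8, 5000).scatter_(1, tokens, 1.0)             # bag-of-words reconstruction targets
theta, logits = model(tokens)
loss = F.binary_cross_entropy_with_logits(logits, bow) + 0.1 * model.orthogonality_loss()
loss.backward()
```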
© 2024 Published by SPIE. Downloading of the abstract is permitted for personal use only.
Bianfang Chai, Jiaxin Liu, and Xiaopeng Zhao "Explainable text representation method with a learnable and explicit semantic space", Proc. SPIE 13403, International Conference on Algorithms, High Performance Computing, and Artificial Intelligence (AHPCAI 2024), 134031G (18 November 2024); https://doi.org/10.1117/12.3051680
KEYWORDS
Semantics
