Paper
9 October 2023
Transformer with modified self-attention for flat-lattice Chinese NER
Xiaojun Bi, Congcong Zhao, Weizheng Qiao
Proceedings Volume 12791, Third International Conference on Advanced Algorithms and Neural Networks (AANN 2023); 127911V (2023) https://doi.org/10.1117/12.3005002
Event: Third International Conference on Advanced Algorithms and Neural Networks (AANN 2023), 2023, Qingdao, SD, China
Abstract
As a fundamental task in NLP, Chinese named entity recognition (NER) has attracted many researchers. Recently, the lattice structure built from an auxiliary lexicon has been replaced by a flat-lattice structure that matches the input format of the Transformer encoder. The positional information in the flat-lattice is treated as an additional information flow fed into the self-attention computation, and a variant of self-attention is used to incorporate it. Although the distance feature is derived from the positions of two tokens (characters or words) in the flat-lattice, the current method only models the interaction of the distance feature with the former token, rather than taking both tokens into account. In this paper, we propose a new variant of self-attention that models the influence of the distance feature not only on the former token but also on the latter token. Experiments on two open datasets show that our proposed model outperforms other state-of-the-art models.
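To make the idea concrete, the following is a minimal sketch (not the authors' released code) of a relative-position-aware attention score in the flat-lattice style, extended so that the distance embedding R_ij interacts with the latter (key-side) token as well as the former (query-side) token. The tensor shapes, the function name, and the extra bias vector `w` are assumptions made for illustration only.

```python
# Hedged sketch, assuming FLAT-style relative distance embeddings R_ij;
# not the paper's official implementation.
import torch

def modified_attention_scores(q, k, r, u, v, w):
    """
    q, k: (seq_len, d_head)          query/key projections of flat-lattice tokens
    r:    (seq_len, seq_len, d_head) relative distance embeddings R_ij
    u, v, w: (d_head,)               learnable global bias vectors (w is assumed here)
    Returns: (seq_len, seq_len) unnormalized attention scores.
    """
    # content-content term: q_i . k_j
    content = q @ k.t()
    # distance acting on the former (query-side) token: (q_i + u) . R_ij
    former = torch.einsum('id,ijd->ij', q + u, r)
    # distance acting on the latter (key-side) token: (k_j + w) . R_ij
    latter = torch.einsum('jd,ijd->ij', k + w, r)
    # global content bias: v . k_j
    bias = k @ v
    return content + former + latter + bias.unsqueeze(0)

if __name__ == "__main__":
    # toy usage with random tensors
    L, d = 5, 8
    q, k = torch.randn(L, d), torch.randn(L, d)
    r = torch.randn(L, L, d)
    u, v, w = (torch.randn(d) for _ in range(3))
    scores = modified_attention_scores(q, k, r, u, v, w)
    attn = torch.softmax(scores, dim=-1)  # (L, L) attention weights
    print(attn.shape)
```

The `latter` term is the part that existing flat-lattice attention omits; the rest follows the familiar relative-position decomposition of the attention score.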
(2023) Published by SPIE. Downloading of the abstract is permitted for personal use only.
Xiaojun Bi, Congcong Zhao, and Weizheng Qiao "Transformer with modified self-attention for flat-lattice Chinese NER", Proc. SPIE 12791, Third International Conference on Advanced Algorithms and Neural Networks (AANN 2023), 127911V (9 October 2023); https://doi.org/10.1117/12.3005002
KEYWORDS
Transformers
Feature extraction
Education and training