Entity Linking based on RoFormer-Sim for Chinese Short Texts
DOI: https://doi.org/10.54097/fcis.v4i1.9422

Keywords: Entity Linking, Candidate Entity, Text Similarity

Abstract
Entity linking, which identifies named entities in text and maps them to a knowledge base, is a key technology for constructing knowledge graphs and plays an important role in fields such as intelligent question answering and information retrieval. However, existing entity linking methods for short texts suffer from low accuracy because short texts lack rich contextual information, use informal expressions, and have incomplete grammatical structures. This paper therefore proposes a short-text entity linking model based on the RoFormer-Sim pre-trained model. First, entity context features are extracted with RoFormer-Sim; then text similarity is computed between these features and the description texts of candidate entities, and the candidates are ranked by similarity to obtain the knowledge-base entity corresponding to the disambiguated mention. Experimental results show that RoFormer-Sim provides useful prior knowledge for entity linking: the proposed model achieves an F1 score of 0.8851, outperforming entity linking models built on other pre-trained models.
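The candidate-ranking step described above can be sketched as follows. This is a minimal illustration only: the `embed` function below is a hypothetical stand-in (a character-bigram bag of features) for the dense sentence representations that the paper obtains from the pre-trained RoFormer-Sim model; the cosine-similarity ranking over candidate entity descriptions follows the general scheme the abstract describes, not the paper's exact implementation.

```python
from collections import Counter
import math

def embed(text):
    # Hypothetical stand-in for the RoFormer-Sim encoder: a sparse
    # character-bigram bag-of-features vector. The paper instead uses
    # dense context features from the pre-trained model.
    return Counter(text[i:i + 2] for i in range(len(text) - 1))

def cosine(u, v):
    # Cosine similarity between two sparse feature vectors.
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm = math.sqrt(sum(x * x for x in u.values())) \
         * math.sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0

def rank_candidates(mention_context, candidate_descriptions):
    # Score each candidate entity's description text against the mention
    # context and return candidates sorted by descending similarity;
    # the top-ranked candidate is taken as the linked entity.
    q = embed(mention_context)
    scored = [(name, cosine(q, embed(desc)))
              for name, desc in candidate_descriptions.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)
```

For example, linking the mention "苹果" in the short text "苹果发布了新手机" ("Apple released a new phone") against two candidate descriptions should rank the technology company above the fruit, because its description shares more context features with the mention's surroundings.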


