Named Entity Recognition of Ancient Wine Texts Based on Deep Learning Models


  • Wei Zhang
  • Yadong Wu
  • Weihan Zhang
  • Yuling Zhang
  • Xiang Ji



Text mining, Natural language processing, Named entity recognition, Neural network, BERT.


Named entity recognition on classical Chinese wine texts helps uncover the history of wine culture, preserve ancient Chinese wine culture, and support the modern wine industry. In this paper, ancient wine texts were collected from the web and from books. After analysis and discussion, five entity types were selected for manual annotation: wine name, person name, place name, event name, and time. The annotated corpus was then used to train and compare the machine learning method CRF and the deep learning method LSTM, together with the BERT pre-trained model. The BERT pre-trained model performed best among all models, and its harmonic mean (F1) for entity recognition was higher than that of the other models, reaching up to 88.33%. In future work, the annotation process and the BERT model could be optimized, or the sample corpus expanded, to improve training results. The entity annotation scheme for ancient Chinese text may also serve as a reference for subsequent research.
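The harmonic mean reported above is the F1 score, F1 = 2PR / (P + R), where P is precision and R is recall. As a minimal sketch of how such a score is commonly computed for entity recognition, the Python below scores predicted entity spans against gold spans under a BIO tagging scheme. The BIO scheme, the tag names (`WINE`, `PER`, `LOC`, `EVT`, `TIME`), and the toy tag sequences are assumptions made for illustration; the abstract does not state the tag set or the exact evaluation procedure used.

```python
from typing import List, Set, Tuple

# Hypothetical tag names for the five entity types named in the abstract:
# wine name, person name, place name, event name, time.
ENTITY_TYPES = ("WINE", "PER", "LOC", "EVT", "TIME")
LABELS = ["O"] + [f"{prefix}-{t}" for t in ENTITY_TYPES for prefix in ("B", "I")]

Span = Tuple[str, int, int]  # (entity type, start index, end index exclusive)


def extract_spans(tags: List[str]) -> Set[Span]:
    """Collect entity spans from a BIO tag sequence.

    A span starts at a B- tag and continues over I- tags of the same type.
    An I- tag that does not continue an open span is ignored.
    """
    spans: Set[Span] = set()
    start, current = None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel closes a trailing span
        inside = tag.startswith("I-") and current == tag[2:]
        if not inside:
            if current is not None:
                spans.add((current, start, i))
            if tag.startswith("B-"):
                start, current = i, tag[2:]
            else:
                start, current = None, None
    return spans


def precision_recall_f1(gold: List[str], pred: List[str]) -> Tuple[float, float, float]:
    """Entity-level precision, recall, and F1 (their harmonic mean).

    A predicted entity counts as correct only if its type and both
    boundaries match a gold entity exactly.
    """
    gold_spans = extract_spans(gold)
    pred_spans = extract_spans(pred)
    correct = len(gold_spans & pred_spans)
    precision = correct / len(pred_spans) if pred_spans else 0.0
    recall = correct / len(gold_spans) if gold_spans else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Toy example (invented tags, not data from the paper): the predicted
# WINE entity is one token too short, so it does not count as correct.
gold_tags = ["B-PER", "I-PER", "O", "B-WINE", "I-WINE", "O", "B-LOC"]
pred_tags = ["B-PER", "I-PER", "O", "B-WINE", "O", "O", "B-LOC"]
p, r, f1 = precision_recall_f1(gold_tags, pred_tags)
print(f"precision={p:.4f} recall={r:.4f} f1={f1:.4f}")
```

Exact-match span scoring is a strict convention: a prediction with a wrong boundary is penalized in both precision and recall, which is why partially correct entities lower F1 more than token-level accuracy would suggest.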








How to Cite

Zhang, W., Wu, Y., Zhang, W., Zhang, Y., & Ji, X. (2023). Named Entity Recognition of Ancient Wine Texts Based on Deep Learning Models. Academic Journal of Science and Technology, 4(2), 97–103.