Researches Advanced in the Development and Application of Information Extraction

Authors

  • Yijiao Liu

DOI:

https://doi.org/10.54097/hset.v16i.2501

Keywords:

Natural language processing; Information extraction; Relation extraction; Lexical analysis.

Abstract

Natural language processing (NLP) is an interdisciplinary field spanning linguistics and computer science that converts the language people use to communicate into a form machines can process. The task of information extraction is to obtain target information accurately and quickly from large amounts of data, improving how effectively that information can be used. With the rapid growth of Internet applications, quickly and accurately identifying the useful information in the resulting text data has become critical and urgent, and information extraction has become an important branch of NLP. Thanks to the rapid development of deep learning, the performance of information extraction has made breakthroughs in recent years. Based on a detailed literature analysis, this paper first reviews the development of information extraction. Second, it summarizes research progress on the key technologies of information extraction from four aspects: named entity recognition, anaphora resolution, relation extraction, and event extraction. Finally, it analyzes the main open problems of information extraction and predicts research trends in the field.

Published

10-11-2022

How to Cite

Liu, Y. (2022). Researches Advanced in the Development and Application of Information Extraction. Highlights in Science, Engineering and Technology, 16, 198-206. https://doi.org/10.54097/hset.v16i.2501