Modern approaches to the mechanisms of causal relationships extraction from unstructured natural language texts

Irina Bazhenova, Tokhir Umarov

Abstract


This article studies methods for extracting causal relationships from unstructured natural language text. Automatic extraction of causal relationships from such text remains a complex open problem in artificial intelligence. The article describes the ways of expressing explicit causal relationships that information extraction technologies rely on. It gives a brief overview of modern data mining tools that support data classification, statistical analysis, clustering and segmentation, and visualization, as well as text analysis and information retrieval packages. The authors present the results of developing a method for extracting cause-and-effect relationships based on the existing IBM Watson and Stanford CoreNLP cognitive services, using the Natural Language Understanding, Stanford Parser, Natural Language Classifier, and Stanford Classifier services. The article also describes a software package designed for investigating various methods of detecting causal relationships in natural language texts. The toolkit in this package is implemented on the IBM Bluemix cloud infrastructure and provides a set of services for extracting and classifying relationships in unstructured text, testing the analyzed methods, and administering the services and data.
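To make the notion of an explicit causal relationship concrete, the sketch below shows one simple way such relations can be picked out of a sentence by matching lexical cue phrases. It is an illustration only and is not the authors' method, which relies on the IBM Watson and Stanford services named above. The cue list, the `CUE_PATTERNS` table, and the `extract_causal_pair` function are all hypothetical names chosen for this example.

```python
import re

# Hypothetical cue patterns (an assumption for illustration; the abstract
# does not list the causal markers the authors actually use).
CUE_PATTERNS = [
    # "<effect> because (of) <cause>"
    (re.compile(r"^(?P<effect>.+?)\s+because(?:\s+of)?\s+(?P<cause>.+)$", re.I),
     "because"),
    # "<cause> causes / leads to / results in <effect>"
    (re.compile(r"^(?P<cause>.+?)\s+(?:causes|caused|leads to|led to|"
                r"results in|resulted in)\s+(?P<effect>.+)$", re.I),
     "causal verb"),
    # "<cause>, therefore <effect>"
    (re.compile(r"^(?P<cause>.+?),?\s+(?:therefore|consequently)\s+"
                r"(?P<effect>.+)$", re.I),
     "connective"),
]


def extract_causal_pair(sentence):
    """Return a dict with 'cause', 'effect', and 'cue' for the first
    matching cue pattern, or None if no explicit cue is found."""
    text = sentence.strip().rstrip(".!?")
    for pattern, cue in CUE_PATTERNS:
        match = pattern.match(text)
        if match:
            return {
                "cause": match.group("cause").strip(),
                "effect": match.group("effect").strip(),
                "cue": cue,
            }
    return None


if __name__ == "__main__":
    for s in [
        "The flight was delayed because of heavy fog.",
        "Smoking causes lung cancer.",
        "The sky is blue.",
    ]:
        print(s, "->", extract_causal_pair(s))
```

A purely lexical matcher like this only covers causality that is signalled by an explicit marker, and it will misfire on ambiguous cues. That limitation is one reason to combine syntactic parsing with a trained classifier, as the services listed in the abstract suggest.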


Full Text:

PDF (Russian)

ISSN: 2307-8162