A New Open Information Extraction System Using Sentence Difficulty Estimation

Authors

  • Vahideh Reshadat, Miyaneh Technical and Engineering Faculty, University of Tabriz, Tabriz, Iran
  • Heshaam Faili, School of Electrical and Computer Engineering, College of Engineering, University of Tehran, Tehran, Iran

DOI:

https://doi.org/10.31577/cai_2019_4_986

Keywords:

Information extraction, open information extraction, relation extraction, knowledge discovery, fact extraction

Abstract

The World Wide Web contains a considerable amount of information expressed in natural language. While unstructured text is often difficult for machines to understand, Open Information Extraction (OIE) is a relation-independent extraction paradigm designed to extract assertions directly from massive and heterogeneous corpora. Low computational cost is a key requirement for Open Relation Extraction (ORE) systems. A large number of ORE methods have been proposed recently, covering a wide range of NLP tools, from "shallow" (e.g., part-of-speech tagging) to "deep" (e.g., semantic role labeling). There is a trade-off between the depth of the NLP tools used and the efficiency (computational cost) of ORE systems. This paper describes a novel approach called Sentence Difficulty Estimator for Open Information Extraction (SDE-OIE), which automatically estimates the difficulty of relation extraction for a sentence by means of difficulty classifiers. These classifiers route each input sentence to an appropriate OIE extractor in order to decrease the overall computational cost. Our evaluations show that intelligently selecting the proper depth of ORE system significantly improves the effectiveness and scalability of SDE-OIE. It avoids wasting resources and achieves almost the same performance as its constituent deep extractor in a more reasonable time.
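
As a rough illustration of the routing idea in the abstract, the sketch below sends each sentence to either a cheap or an expensive extractor depending on an estimated difficulty label. It is not the authors' implementation: the heuristic in `estimate_difficulty` (a token-count threshold and a small list of subordinating words) is a hypothetical stand-in for the trained difficulty classifiers, and `shallow_stub` and `deep_stub` are placeholder extractors invented for the example.

```python
from typing import Callable, List, Tuple

Triple = Tuple[str, str, str]             # (argument 1, relation, argument 2)
Extractor = Callable[[str], List[Triple]]

# Hypothetical cue words that often signal clause-level complexity.
SUBORDINATORS = {"which", "that", "who", "because", "although", "while", "whereas"}
MAX_EASY_TOKENS = 15                      # assumed threshold, not from the paper


def estimate_difficulty(sentence: str) -> str:
    """Toy stand-in for a trained difficulty classifier: 'easy' or 'hard'."""
    tokens = sentence.lower().replace(",", " , ").split()
    too_long = len(tokens) > MAX_EASY_TOKENS
    has_subordination = any(tok in SUBORDINATORS for tok in tokens)
    return "hard" if too_long or has_subordination else "easy"


def make_router(shallow: Extractor, deep: Extractor) -> Extractor:
    """Build an extractor that picks the cheapest adequate backend per sentence."""
    def route(sentence: str) -> List[Triple]:
        backend = deep if estimate_difficulty(sentence) == "hard" else shallow
        return backend(sentence)
    return route


# Placeholder extractors; real ones would run POS-based or SRL-based extraction.
def shallow_stub(sentence: str) -> List[Triple]:
    return [("<shallow>", "handled", sentence)]


def deep_stub(sentence: str) -> List[Triple]:
    return [("<deep>", "handled", sentence)]


extract = make_router(shallow_stub, deep_stub)
easy_result = extract("Tehran is the capital of Iran.")
hard_result = extract("The system, which relies on semantic role labeling, is slow.")
```

With these stubs, the short declarative sentence goes to the shallow backend, while the sentence containing a relative clause goes to the deep one. The saving in such a design comes from invoking the costly extractor only when the difficulty estimate calls for it.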

Published

2019-12-30

How to Cite

Reshadat, V., & Faili, H. (2019). A New Open Information Extraction System Using Sentence Difficulty Estimation. Computing and Informatics, 38(4), 986–1008. https://doi.org/10.31577/cai_2019_4_986