A Comparison of Knowledge Extraction Tools for the Semantic Web


Gangemi, A., 2013. A Comparison of Knowledge Extraction Tools for the Semantic Web. In P. Cimiano et al., eds. The Semantic Web: Semantics and Big Data. Lecture Notes in Computer Science. Springer Berlin Heidelberg, pp. 351–366.

Summary

This article compares a selection of natural language processing (NLP) tools on the following basic NLP tasks (as opposed to advanced tasks such as question answering, retrieval, etc.):

  1. topic extraction
  2. named entity recognition - i.e. determining entity types such as person, organization, etc. (see the sketch after this list)
  3. named entity resolution (or linking) - i.e. providing a reference to the individual mentioned
  4. named entity coreference
  5. terminology extraction (typically for classes and/or properties)
  6. sense tagging
  7. sense disambiguation
  8. taxonomy induction
  9. relation extraction
  10. semantic role labeling (property induction for events and n-ary relations)
  11. event detection
  12. frame detection
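
To make the named entity tasks concrete, here is a minimal sketch of named entity recognition using spaCy. Note that spaCy is not one of the tools compared in the paper; the model name and example sentence are assumptions chosen purely for illustration.

```python
# Minimal NER illustration (spaCy is used only as an example here;
# it is not one of the tools evaluated in the paper).
import spacy

# Small English pipeline, assumed to be installed via
# `python -m spacy download en_core_web_sm`.
nlp = spacy.load("en_core_web_sm")
doc = nlp("Tim Berners-Lee founded the World Wide Web Consortium in 1994.")

for ent in doc.ents:
    # Named entity recognition: surface form plus a coarse type
    # (e.g. PERSON, ORG, DATE). Entity *resolution/linking* would go one
    # step further and attach an identifier such as a DBpedia URI.
    print(ent.text, ent.label_)
```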

Evaluation

The author provides a comparison of the tools' performance (precision, recall, accuracy, F1) on the NLP tasks mentioned above. These measures are computed on a single gold-standard evaluation document. The gold-standard annotations were constructed by merging and cleaning the results of multiple tools, an approach inspired by retrieval evaluation with incomplete information [1].
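
The measures themselves are the standard ones. The sketch below assumes set-valued outputs (e.g. entity links) and hypothetical DBpedia URIs; it is not the paper's evaluation code, just an illustration of how one tool's output would be scored against the merged gold standard.

```python
# Score one tool's output against a gold-standard set of annotations.
# Per the paper, the gold standard for the evaluation document was built by
# merging and cleaning the outputs of multiple tools, similar in spirit to
# pooling in IR evaluation [1].

def precision_recall_f1(predicted: set, gold: set):
    tp = len(predicted & gold)  # true positives: predictions present in the gold standard
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(gold) if gold else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Hypothetical example: entity links produced by one tool vs. the merged gold set.
gold = {"dbpedia:Barack_Obama", "dbpedia:United_States", "dbpedia:White_House"}
predicted = {"dbpedia:Barack_Obama", "dbpedia:United_States", "dbpedia:Chicago"}

print(precision_recall_f1(predicted, gold))  # (0.667, 0.667, 0.667)
```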

Further Literature

1. Buckley, C. & Voorhees, E.M., 2004. Retrieval evaluation with incomplete information. In Proceedings of the 27th annual international ACM SIGIR conference on Research and development in information retrieval. SIGIR '04. New York, NY, USA: ACM, pp. 25–32.