Relation Extraction and the Influence of Automatic Named-Entity Recognition


Giuliano, C., Lavelli, A. & Romano, L., 2007. Relation extraction and the influence of automatic named-entity recognition. ACM Transactions on Speech and Language Processing, 5(1), pp. 2:1–2:26.

Introduction

Relation extraction aims at identifying directed binary relations $$R_{ij} = (E_i, E_j) \neq R_{ji} = (E_j, E_i)$$ in text documents. This article introduces an approach that uses kernel functions to integrate information from (i) the sentence in which the relation appears, and (ii) the local context around the interacting entities.

Method

The authors treat relation extraction as a classification task that distinguishes the following classes:

  1. correct: locatedIn(Chur, Switzerland) -> 2
  2. correct, but incorrect direction: locatedIn(Switzerland, Chur) -> 1
  3. incorrect: locatedIn(Chur, St. Gallen) -> 0
  4. wrong entity types: locatedIn(Christian Toth, Chur) -> -1
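Taken together, the four classes let a single classifier decide both whether a relation holds and in which direction it points. A minimal, illustrative decoding step (the function and label encoding below are a sketch, not the paper's code):

```python
# Illustrative: map the 4-way classifier output back to a directed
# relation instance; the numeric labels mirror the list above.
def decode(label, e1, e2, rel="locatedIn"):
    if label == 2:    # correct relation, correct direction
        return (rel, e1, e2)
    if label == 1:    # correct relation, reversed direction
        return (rel, e2, e1)
    return None       # 0 (incorrect) or -1 (wrong entity types)
```

For example, `decode(1, "Switzerland", "Chur")` recovers the directed relation `("locatedIn", "Chur", "Switzerland")`.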

The method uses two different kinds of kernel functions (see below) to assign candidate relations to one of these four classes. The kernels are combined into a shallow linguistic kernel ($$K_{SL}$$) using the linear combination

\[ K_{SL}(R_1,R_2) = K_{GC}(R_1,R_2) + K_{LC}(R_1,R_2). \]

The method was implemented using the LibSVM package.
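Since a sum of kernels is again a kernel, the combination boils down to adding the two kernel values and handing the resulting Gram matrix to an SVM in precomputed-kernel mode (LibSVM's `-t 4`). A sketch with toy placeholder kernels (the feature keys and the simplified kernels are illustrative; the actual $$K_{GC}$$ and $$K_{LC}$$ are described below):

```python
def k_gc(r1, r2):
    # Placeholder global-context kernel: word overlap of the
    # "between" contexts (the real kernel uses bags of n-grams).
    return len(set(r1["between"]) & set(r2["between"]))

def k_lc(r1, r2):
    # Placeholder local-context kernel: exact match of the windows
    # left and right of the entity pair.
    return int(r1["left"] == r2["left"]) + int(r1["right"] == r2["right"])

def k_sl(r1, r2):
    """Shallow linguistic kernel: the (unweighted) sum K_GC + K_LC."""
    return k_gc(r1, r2) + k_lc(r1, r2)

def gram_matrix(examples):
    """Precomputed kernel matrix, as expected by an SVM run in
    precomputed-kernel mode (e.g. LibSVM with -t 4)."""
    return [[k_sl(a, b) for b in examples] for a in examples]
```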

Global Context Kernel

Bunescu and Mooney (2005) observe that relations between entities are usually expressed in one of the following contexts:

  1. Fore-Between (FB): e.g. "the head of [org], Dr [per]"
  2. Between (B): e.g. "[org] spokesman [per]"
  3. Between-After (BA): e.g. "[per], a [org] law professor"

The context $$C$$ is represented as an unordered bag of words that counts how often a particular token $$t_i$$ (and the corresponding n-grams) occurs in $$C$$, yielding the global context kernel ($$K_{GC}$$):

\[ K_{GC}(R_1, R_2) = K_{FB}(R_1, R_2) + K_{B}(R_1, R_2) + K_{BA}(R_1, R_2)\]
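A sketch of the global context kernel as a plain bag-of-n-grams dot product; the paper additionally normalizes each kernel, which is omitted here, and the context keys are illustrative:

```python
from collections import Counter

def ngrams(tokens, n_max=3):
    """Bag (multiset) of all n-grams up to length n_max."""
    bag = Counter()
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            bag[tuple(tokens[i:i + n])] += 1
    return bag

def bag_kernel(b1, b2):
    """Dot product of two sparse bag-of-n-grams vectors."""
    return sum(count * b2[gram] for gram, count in b1.items())

def k_gc(r1, r2):
    """K_GC: sum of bag kernels over the fore-between (FB),
    between (B), and between-after (BA) contexts."""
    return sum(bag_kernel(ngrams(r1[c]), ngrams(r2[c]))
               for c in ("FB", "B", "BA"))
```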

Local Context Kernel

The local context often provides clues for (i) the presence of a relation and (ii) its direction. The authors represent each local context by the following basic features, taking the ordering of tokens into account:

  1. Token
  2. Lemma of the token
  3. POS-Tag of the token
  4. Stem of the token
  5. Orthographic: a function that maps tokens into equivalence classes reflecting properties such as capitalization, punctuation, and numerals.

The local context kernel combines the kernels computed over the left and right contexts of the candidate entities:

\[ K_{LC}(R_1, R_2) = K_{left}(R_1, R_2) + K_{right}(R_1, R_2)\]
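The per-position feature matching behind $$K_{left}$$ and $$K_{right}$$ can be sketched as follows; each token is reduced to a dictionary standing in for the token/lemma/POS/stem/orthographic features above, and window extraction is omitted:

```python
def window_kernel(w1, w2):
    """Count matching features at corresponding positions in two
    ordered token windows (each token = dict of feature -> value)."""
    return sum(1
               for t1, t2 in zip(w1, w2)
               for feat, val in t1.items()
               if t2.get(feat) == val)

def k_lc(r1, r2):
    """K_LC: feature matches in the windows to the left and right
    of the candidate entity pair."""
    return (window_kernel(r1["left"], r2["left"])
            + window_kernel(r1["right"], r2["right"]))
```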

Evaluation

The authors performed a 5-fold cross-validation on the dataset used by Roth and Yih (2004), which is based on the TREC corpus, considering the following relation types: locatedIn, workFor, orgBasedIn, liveIn, kill, and noRel, yielding

  1. F1 values between 71 and 82% with gold-standard named entities (i.e., all named entities are known), and
  2. F1 values between 69 and 81% without gold-standard named entities.

The evaluation also discusses the impact of spurious named entities (introduced by incorrect NER) and of missing named entities.

Bibliography

  1. Bunescu, R.C. & Mooney, R.J., 2005. Subsequence Kernels for Relation Extraction. In 19th Conference on Neural Information Processing Systems (NIPS 2005). Vancouver, British Columbia, Canada.
  2. Roth, D. & Yih, W., 2004. A Linear Programming Formulation for Global Inference in Natural Language Tasks. In H. T. Ng & E. Riloff, eds. 8th Conference on Computational Natural Language Learning (CoNLL 2004). Association for Computational Linguistics, pp. 1–8.