Automatic Creation of a Reference Corpus for Political Opinion Mining in User-Generated Content

by Sarmento et al. (TSA 2009)

This article presents an approach that applies manually crafted lexico-syntactic patterns to collect highly opinionated comments posted by users of a Portuguese online newspaper. The opinions found by the patterns are then propagated to the other sentences in the same comments.
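The two steps (pattern matching, then propagation) can be illustrated with a minimal sketch. It is not the authors' implementation: the English patterns, the function names, and the naive sentence splitter are all assumptions made for illustration, whereas the paper's actual rules target Portuguese and are not reproduced here.

```python
import re

# Hypothetical seed patterns, for illustration only.
# Each pattern captures an opinion target and is tied to one polarity.
PATTERNS = [
    (re.compile(r"\b(?P<target>[A-Z]\w+) is (?:a|an) (?:liar|disgrace|fraud)\b"),
     "negative"),
    (re.compile(r"\b(?P<target>[A-Z]\w+) is (?:a|an) (?:great|excellent) (?:leader|politician)\b"),
     "positive"),
]


def split_sentences(comment):
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", comment) if s.strip()]


def find_seed(sentences):
    """Return (target, polarity) for the first sentence matching a pattern, else None."""
    for sentence in sentences:
        for pattern, polarity in PATTERNS:
            match = pattern.search(sentence)
            if match:
                return match.group("target"), polarity
    return None


def label_comment(comment):
    """Propagate the seed opinion to every sentence of the comment.

    Returns a list of (sentence, target, polarity) tuples, or an empty
    list if no pattern matches anywhere in the comment.
    """
    sentences = split_sentences(comment)
    seed = find_seed(sentences)
    if seed is None:
        return []
    target, polarity = seed
    return [(s, target, polarity) for s in sentences]
```

For example, `label_comment("Silva is a liar. He promised lower taxes. Nothing happened!")` labels all three sentences as negative opinions about the target `Silva`, even though only the first sentence matches a pattern. This blanket propagation is what makes the extended corpus larger but noisier than the seed corpus.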

The authors evaluate the initial corpus (the sentences matching the lexico-syntactic patterns) by classifying them according to six error categories. They then evaluate the accuracy of the extended corpus (which also includes the other sentences in the comments) according to a five-category scheme.

The evaluations show that propagating negative opinions is less problematic than propagating positive opinions, since for positive opinions there is the danger of

  • polarity inversion (the comment starts slightly positive but is followed by criticism)
  • irony
This is also reflected in the method's performance: precision is 90% for negative opinions, compared with 60% for positive opinions.