Sentiment Analysis of Blogs by Combining Lexical Knowledge with Text Classification

by Melville et al. (IBM T.J. Watson Research Center) *****

Analyzing blog posts raises a number of interesting questions:

identifying the subset of blogs that discuss not the products themselves but high-level concepts (properties?) relevant to these products.
identifying the most authoritative/influential sources.
detecting the sentiment expressed about an entity (product, properties, etc.).

This article describes the use of background lexical information for sentiment detection. The authors refine a given sentiment dictionary with training examples.

The described approach pools multinomial classifiers to build a composite Naive Bayes classifier that incorporates both background knowledge and training examples. This is achieved by combining the probability distributions of two experts: (i) an expert trained on labelled training data and (ii) an expert representing a generative model that explains the sentiment lexicon.

There is a substantial body of literature on combining such distributions. The authors experimented with the following approaches:

linear opinion pool, which performed best in the evaluation: $$P=\sum_{i=1}^K \alpha_i P_i$$; the pooled probability is the sum of the experts' probabilities, each weighted by a factor $$\alpha_i$$, with $$\sum_{i=1}^K \alpha_i = 1$$

logarithmic opinion pool: $$P=\frac{1}{Z}\prod_{i=1}^K P_i^{\alpha_i}$$ with $$\sum_{i=1}^K \alpha_i = 1$$, where $$Z$$ is a normalizing constant; if $$\alpha_i = 1/K$$, this approach is equivalent to the (normalized) geometric mean of all expert opinions
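The two pooling rules can be illustrated with a minimal Python sketch. The function names `linear_pool` and `log_pool`, the two-class setup, and all probabilities and weights are hypothetical values chosen for illustration; they are not taken from the paper.

```python
import math

def linear_pool(expert_probs, weights):
    """Weighted sum of the experts' class distributions (weights sum to 1)."""
    n_classes = len(expert_probs[0])
    return [
        sum(w * p[c] for w, p in zip(weights, expert_probs))
        for c in range(n_classes)
    ]

def log_pool(expert_probs, weights):
    """Weighted product of the experts' class distributions, divided by Z."""
    n_classes = len(expert_probs[0])
    unnormalized = [
        math.prod(p[c] ** w for w, p in zip(weights, expert_probs))
        for c in range(n_classes)
    ]
    z = sum(unnormalized)  # normalizing constant Z
    return [u / z for u in unnormalized]

# Hypothetical class distributions over (positive, negative) for one document.
p_data = [0.9, 0.1]      # expert trained on labelled data (assumed values)
p_lexicon = [0.6, 0.4]   # expert derived from the lexicon (assumed values)
weights = [0.7, 0.3]     # assumed weights alpha_i, summing to 1

linear = linear_pool([p_data, p_lexicon], weights)
logarithmic = log_pool([p_data, p_lexicon], weights)
print(linear, logarithmic)
```

Both functions return a proper distribution over the classes. The linear pool is normalized by construction, while the logarithmic pool needs the explicit division by `z`.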

The authors compute the weights ($$\alpha_i$$) based on the experts' errors in explaining the training data.
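This summary does not state the exact weighting formula. As a rough sketch, one plausible scheme maps each expert's training error to a log-odds score, as boosting does, and normalizes the scores to sum to 1. The function `error_based_weights` and the error values are assumptions for illustration, not necessarily the authors' rule.

```python
import math

def error_based_weights(errors):
    """Map each expert's training error to a normalized weight.

    Assumed scheme: score_i = log((1 - err_i) / err_i), then normalize.
    Requires 0 < err_i < 0.5 so that every score is positive.
    """
    scores = []
    for err in errors:
        if not 0.0 < err < 0.5:
            raise ValueError("each error must lie strictly between 0 and 0.5")
        scores.append(math.log((1.0 - err) / err))
    total = sum(scores)
    return [s / total for s in scores]

# Hypothetical training errors for the data expert and the lexicon expert.
alphas = error_based_weights([0.2, 0.4])
print(alphas)
```

Under this assumption, the expert with the lower training error receives the larger weight, and the weights satisfy the sum-to-one constraint used by both pooling rules.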


Albert Weichselbraun
