Natural Language Processing Resources
Sentiment Analysis
Lexicons
- Opinion lexicons and datasets
- MPQA Subjectivity Lexicon
- Emotion lexicon
- Loughran and McDonald Financial Sentiment Dictionaries
- FrameNet - a lexical database describing the frame structure of selected words.
Baker, C. F., Fillmore, C. J., & Lowe, J. B. (1998). The Berkeley FrameNet Project. In Proceedings of the 36th Annual Meeting of the Association for Computational Linguistics and 17th International Conference on Computational Linguistics - Volume 1 (pp. 86–90). Stroudsburg, PA, USA: Association for Computational Linguistics.
- VerbNet - maps verbs to their corresponding Levin verb classes
Schuler, K. K. (2006). VerbNet: A Broad-Coverage, Comprehensive Verb Lexicon. University of Pennsylvania.
Datasets
- Stanford Twitter Sentiment Dataset - Hu et al. 2013
- Obama-McCain Debate (OMD) Twitter Dataset - Hu et al. 2013
- SemEval-2013 Twitter Dataset
- Sander Analytics Twitter Sentiment Dataset
- Blitzer Sentiment Dataset
- SFU Review Corpus - 17,263 sentences with negation/speculation cues and their scopes; domains: reviews of books, cars, computers, cookware, hotels, movies, music and phones
- Multi-Domain Sentiment Dataset (version 2.0) - Amazon product reviews for books, DVDs, electronics and kitchen appliances.
Evaluations & Challenges
- SemEval 2014 - Aspect Based Sentiment Analysis
- ESWC-14 Challenge on Concept-Level Sentiment Analysis
- GermEval 2014 Named Entity Recognition Shared Task
Text Corpora
- RateMDs - 50,000 doctor reviews
- Drugs-Forum.com - discussions of illicit drugs; text summarization has been used to reveal information on drug use.
Controlled Vocabulary
- Consumer Health Vocabulary | Query Interface - translates between consumer and expert jargon