Opinion Mining/Sentiment Mining Datasets



Opinion mining (also called sentiment mining or opinion/sentiment extraction) is the area of research that aims to build automatic systems that determine human opinion from text written in natural language.

For example, a sample review of a new movie may look like this:

I found the movie to be quite boring. It's not that I am solely interested in action movies, but any script that makes "My Dinner with Andre" seem fast-paced needs to be re-worked.
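
As a rough illustration of the task (not the method used in the papers listed on this page), here is a minimal sketch of a word-counting polarity scorer applied to the review above. The word lists and the `polarity` function are hypothetical and hand-picked for this example only.

```python
import re

# Hypothetical, hand-picked word lists; a real system would learn these
# signals from labeled reviews rather than hard-code them.
POSITIVE = {"good", "great", "excellent", "enjoyable", "exciting"}
NEGATIVE = {"boring", "bad", "dull", "awful", "tedious"}


def polarity(text):
    """Label text by counting positive and negative lexicon words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


review = (
    "I found the movie to be quite boring. It's not that I am solely "
    "interested in action movies, but any script that makes "
    '"My Dinner with Andre" seem fast-paced needs to be re-worked.'
)
print(polarity(review))  # negative
```

The scorer gets this review right only because of the word "boring"; the sarcasm in the second sentence contributes nothing to the count, which is one reason to train models on labeled review data instead.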

While doing research in this area, I have collected and prepared a number of training review datasets from various sources. I thought it would be helpful to share my datasets for download, so others can play around with them and compare results. These datasets were used in a couple of papers I wrote with Larry Yaeger at Indiana University. If you find these datasets useful, then we ask that you please cite the first paper.

Whitehead, M. and Yaeger, L. Building a General Purpose Cross-Domain Sentiment Mining Model, CSIE, 2009
Whitehead, M. and Yaeger, L. Sentiment Mining Using Ensemble Classification Models, SCSS, Dec. 2008



Opinion Mining/Sentiment Mining Dataset Downloads

Digital camera reviews from Amazon.com

Summer camp reviews from CampRatingz.com

Physician reviews from RateMDs.com

Pharmaceutical drug reviews from DrugRatingz.com

Laptop reviews from Amazon.com

Lawyer reviews from LawyerRatingz.com

Music (CD) reviews from Amazon.com

Radio show reviews from RadioRatingz.com

TV show reviews from TVRatingz.com