Package “SentimentAnalysis” released on CRAN

Authors: Stefan Feuerriegel, Nicolas Pröllochs

This report introduces sentiment analysis in R and shows how to use our package “SentimentAnalysis”.

What is sentiment analysis?

Sentiment analysis is a research branch located at the heart of natural language processing (NLP), computational linguistics and text mining. It refers to any measures by which subjective information is extracted from textual documents. In other words, it extracts the polarity of the expressed opinion in a range spanning from positive to negative. As a result, one may also refer to sentiment analysis as opinion mining.

What is the the package “SentimentAnalysis” about?

Our package “SentimentAnalysis” performs a sentiment analysis of textual contents in R. This implementation utilizes various existing dictionaries, such as QDAP or Loughran-McDonald. Furthermore, it can also create customized dictionaries. The latter uses LASSO regularization as a statistical approach to select relevant terms based on an exogenous response variable.

How to use the package “SentimentAnalysis”?

This simple example shows how to perform a sentiment analysis of a single string. The result is a two-level factor with levels “positive” and “negative.”

# Analyze a single string to obtain a binary response (positive / negative) 
sentiment <- analyzeSentiment("Yeah, this was a great soccer game of the German team!")
#> [1] positive
#> Levels: negative positive

The following shows a larger example:

# Create a vector of strings
documents <- c("Wow, I really like the new light sabers!",
               "That book was excellent.",
               "R is a fantastic language.",
               "The service in this restaurant was miserable.",
               "This is neither positive or negative.",
               "The waiter forget about my a dessert -- what a poor service!")

# Analyze sentiment
sentiment <- analyzeSentiment(documents)

# Extract dictionary-based sentiment according to the Harvard-IV dictionary
#> [1] 0.3333333 0.5000000 0.5000000 -0.6666667 0.0000000 -0.6000000

# View sentiment direction (i.e. positive, neutral and negative)
#> [1] positive positive positive negative neutral negative
#> Levels: negative neutral positive

response <- c(+1, +1, +1, -1, 0, -1)

compareToResponse(sentiment, response)

For more information, check out the following resources.