# Sentiment Analysis and Text Mining

September 28, 2017

Sentiment analysis over Twitter offers organizations a fast and effective way to monitor the public’s feelings towards their brand, business, directors, etc. A wide range of features and methods for training sentiment classifiers on Twitter datasets has been researched in recent years, with varying results.

# Sentiment analysis overview

The emergence of social media has given web users a venue for expressing and sharing their thoughts and opinions on all kinds of topics and events. Twitter, with nearly 600 million users and over 250 million messages per day, has quickly become a gold mine for organizations to monitor their reputation and brands by extracting and analyzing the sentiment of the Tweets posted by the public about them, their markets, and competitors.

Sentiment analysis was first introduced by Liu, B. [1]. Also known as opinion mining and subjectivity analysis, it is the process of determining the attitude or polarity of opinions or reviews that people write to rate products or services. Sentiment analysis can be applied to any textual form of opinion, such as blogs, reviews, and microblogs. Microblogs are short text messages such as tweets, which cannot exceed 140 characters. Microblogs are easier to analyze for sentiment than other forms of opinion. Sentiment analysis can be done at the document level or at the sentence level. In the first case, the whole document is evaluated to determine the opinion polarity, and the features describing the product or service should be extracted first. In the second case, the document is divided into sentences, and each sentence is evaluated separately to determine its opinion polarity.
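
The difference between the two levels can be illustrated with a minimal sketch. It assumes a toy word lexicon and a simple sum-of-scores rule; the lexicon entries, the function names, and the sample review are all invented for illustration and are not taken from the cited surveys.

```python
import re

# Toy polarity lexicon; the words and scores are illustrative assumptions.
LEXICON = {"good": 1, "great": 1, "love": 1, "bad": -1, "awful": -1, "slow": -1}


def score(text: str) -> int:
    """Sum the lexicon scores of the words in `text` (unknown words count 0)."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(LEXICON.get(word, 0) for word in words)


def label(total: int) -> str:
    """Map a numeric score to a polarity label."""
    if total > 0:
        return "positive"
    if total < 0:
        return "negative"
    return "neutral"


def document_level(text: str) -> str:
    """Assign one polarity to the whole document."""
    return label(score(text))


def sentence_level(text: str) -> list[tuple[str, str]]:
    """Split on sentence-ending punctuation and label each sentence separately."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    return [(s, label(score(s))) for s in sentences]


# Hypothetical review text, invented for this example.
review = "The camera is great. The battery is awful. Shipping was slow."
print(document_level(review))
for sentence, polarity in sentence_level(review):
    print(polarity, "|", sentence)
```

The document-level call returns a single label for the whole review, while the sentence-level call exposes that the first sentence is positive and the other two are negative. A real system would replace the toy lexicon with a trained classifier or a curated sentiment lexicon.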

## Sentiment Analysis Significance

Sentiment analysis can be defined as a process that automates the mining of attitudes, opinions, views, and emotions from text, speech, tweets, and database sources through Natural Language Processing (NLP). It involves classifying opinions in text into categories such as “positive”, “negative”, or “neutral”. It is also referred to as subjectivity analysis, opinion mining, and appraisal extraction. The words opinion, sentiment, view, and belief are often used interchangeably, but there are differences between them.

• Opinion: A conclusion open to dispute (because different experts have different opinions).
• View: A subjective opinion.
• Belief: Deliberate acceptance and intellectual assent.
• Sentiment: An opinion representing one’s feelings.

Sentiment analysis is a term that covers many tasks, such as sentiment extraction, sentiment classification, subjectivity classification, summarization of opinions, and opinion spam detection, among others. It aims to analyze people’s sentiments, attitudes, opinions, emotions, etc. towards elements such as products, individuals, topics, organizations, and services.

Mathematically we can represent an opinion as a quintuple (o, f, so, h, t), where

o = object;

f = feature of the object o;

so = orientation or polarity of the opinion on feature f of object o;

h = opinion holder;

t = time when the opinion is expressed.

Object: An entity, which can be a person, event, product, organization, or topic.

Feature: An attribute (or a part) of the object with respect to which evaluation is made.

Opinion orientation or polarity: The orientation of an opinion on a feature f represents whether the opinion is positive, negative, or neutral.

Opinion holder: The holder of an opinion is the person, organization, or other entity that expresses the opinion.
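
As a rough sketch, the quintuple maps naturally onto a small record type. The class name `Opinion`, its field names, the helper `orientations_for`, and the sample data below are all hypothetical, chosen only to mirror the symbols (o, f, so, h, t).

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Opinion:
    """One opinion quintuple (o, f, so, h, t)."""
    obj: str          # o: the object being evaluated
    feature: str      # f: the feature of the object
    orientation: str  # so: "positive", "negative" or "neutral"
    holder: str       # h: who expressed the opinion
    time: date        # t: when the opinion was expressed


# Hypothetical example data, invented for illustration only.
opinions = [
    Opinion("phone", "battery", "negative", "user_a", date(2017, 9, 1)),
    Opinion("phone", "camera", "positive", "user_a", date(2017, 9, 1)),
    Opinion("phone", "battery", "positive", "user_b", date(2017, 9, 20)),
]


def orientations_for(items, obj, feature):
    """Return the orientations recorded for one feature, oldest first."""
    matching = [o for o in items if o.obj == obj and o.feature == feature]
    return [o.orientation for o in sorted(matching, key=lambda o: o.time)]


print(orientations_for(opinions, "phone", "battery"))
```

Keeping the holder and the time on every record makes it possible to group opinions by who expressed them or to follow how the polarity of one feature shifts over time.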

### Challenges for Sentiment Analysis

Sentiment analysis classifies text as positive, negative, or objective, so it can be thought of as a text classification task. Topic-based text classification can have many classes, because there are many topics, whereas sentiment analysis has only three. However, several factors make sentiment analysis more difficult than traditional text classification. The following are some of them.

• Coreference Resolution: Coreference resolution is the problem of identifying what a pronoun or a noun phrase refers to. For example, in “We watched the movie and went to dinner; it was awful”, what does “it” refer to? Coreference resolution may be useful for topic- or aspect-based sentiment analysis and may improve the accuracy of opinion mining.
• Temporal Relations: The time of a review may be important for sentiment analysis. A reviewer may think that Windows Vista is good in 2008 but hold a negative opinion of it in 2009 because of the new Windows 7. Assessing opinions that change over time may improve the performance of a sentiment analysis system. It also helps us observe whether a product improves with time or whether people change their opinion about it.
• Sarcastic Sentences: Text may contain sarcastic and ironic sentences. For example, “What a great car, it stopped working on the second day.” In such cases, positive words carry a negative meaning. Sarcastic or ironic sentences can be hard to identify, which can lead to erroneous opinion mining.
• Domain Considerations: The accuracy of sentiment classification can be influenced by the domain of the items to which it is applied, because many words change meaning from domain to domain. For example, the sentence “Go read the book” has positive sentiment in the book domain but indicates negative sentiment in the movie domain.
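
The domain effect in the last item can be sketched with per-domain phrase lexicons. This is only an illustration under strong assumptions: the lexicon contents, the `DOMAIN_LEXICONS` name, and the `domain_polarity` function are hypothetical, and real systems usually learn domain-specific models rather than hand-writing phrases.

```python
# Hypothetical per-domain lexicons: the same phrase carries a different score
# depending on the domain. Entries are illustrative assumptions.
DOMAIN_LEXICONS = {
    "book": {"read the book": 1},
    "movie": {"read the book": -1},
}


def domain_polarity(text: str, domain: str) -> str:
    """Label `text` using only the phrase lexicon of the given domain."""
    lowered = text.lower()
    total = sum(score for phrase, score in DOMAIN_LEXICONS[domain].items()
                if phrase in lowered)
    if total > 0:
        return "positive"
    if total < 0:
        return "negative"
    return "neutral"


sentence = "Go read the book."
print(domain_polarity(sentence, "book"))
print(domain_polarity(sentence, "movie"))
```

The same sentence is labeled positive with the book lexicon and negative with the movie lexicon, which is why a classifier trained on one domain often loses accuracy when applied to another.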

References

[1] B. Liu, “Sentiment Analysis and Subjectivity”, Handbook of Natural Language Processing, Volume 2, pp. 627-666, 2010.

[2] V. A. Kharde and S. S. Sonawane, “Sentiment Analysis of Twitter Data: A Survey of Techniques”, International Journal of Computer Applications (IJCA), Volume 139, No. 11, April 2016.

[3] S. Vohra and J. Teraiya, “Applications and Challenges for Sentiment Analysis: A Survey”, International Journal of Engineering Research & Technology (IJERT), Volume 2, Issue 2, February 2013.
