Sentiment analysis (sometimes called opinion extraction, opinion mining, sentiment mining, or subjectivity analysis) is broadly defined as the task of identifying the attitudes expressed in a text. The simplest form is determining whether the attitude of a text is positive or negative, but the technique is finding increasingly complex applications.
Sentiment analysis is a form of natural language processing (NLP), a subfield of artificial intelligence. NLP is concerned with how computers process and analyze natural language, i.e., any language that has developed naturally rather than being designed artificially, as programming languages are. We’re running a series of blog posts covering a variety of topics in NLP. Our last post in this series explored named entity recognition (NER).
Any company can apply sentiment analysis to social media, user reviews, or customer service enquiries to learn more about brand, feature and product perception, or successes and shortcomings within the company and product. This versatility has led to sentiment analysis receiving significant attention in recent years.
Internally, sentiment analysis makes processes more scalable by automating otherwise laborious work. The technique also opens up real-time insights, leading to a faster, more reactive business, and provides consistent criteria by which to measure sentiment, which can otherwise be messy and ambiguous.
Sentiment analysis has come a long way, so let’s explore how it works, some high-level use cases, and how you can apply it in your business.
How does sentiment analysis work?
At its simplest, sentiment analysis is a three-step process:
- Text pre-processing
- An essential first step in any NLP task, this is the process of breaking text up into small chunks, usually words or phrases, called tokens. You will likely also apply part-of-speech tagging, stemming, lemmatization, and stop word removal. Pre-processing reduces the noise of the dataset and makes the next step easier.
- Feature extraction
- Here, we convert our tokenized text into a form that is digestible by a machine learning algorithm. How we do this depends heavily on what meaning we’re hoping to extract from the text. For example, you might need to decide whether a word’s mere presence matters more than how often it appears. The most rudimentary approach is the bag-of-words model, which represents a text by its words and their counts, ignoring word order.
- Classification
- The processed input is categorized as, e.g., “positive”, “negative”, or “neutral”.
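The three steps above can be sketched in a few dozen lines of plain Python. This is a minimal illustration rather than a production approach: the stop-word list and training sentences are invented for the example, and the classifier is a small multinomial Naive Bayes, which is one common choice among many and not necessarily what any given product uses.

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list; real lists are much longer.
STOP_WORDS = {"the", "a", "an", "is", "it", "this", "and", "to", "of", "was", "i"}


def preprocess(text):
    """Step 1: lowercase, split into word tokens, drop stop words."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def bag_of_words(tokens, binary=False):
    """Step 2: map tokens to counts (frequency) or 0/1 flags (occurrence)."""
    counts = Counter(tokens)
    if binary:
        return Counter({word: 1 for word in counts})
    return counts


class NaiveBayes:
    """Step 3: multinomial Naive Bayes with add-one smoothing."""

    def fit(self, documents, labels):
        self.label_counts = Counter(labels)
        self.word_counts = {label: Counter() for label in self.label_counts}
        for doc, label in zip(documents, labels):
            self.word_counts[label].update(bag_of_words(preprocess(doc)))
        self.vocabulary = set()
        for counts in self.word_counts.values():
            self.vocabulary.update(counts)
        return self

    def predict(self, text):
        features = bag_of_words(preprocess(text))
        total_docs = sum(self.label_counts.values())
        vocab_size = len(self.vocabulary)
        best_label, best_score = None, -math.inf
        for label, doc_count in self.label_counts.items():
            # Start from the log prior, then add each word's log likelihood.
            score = math.log(doc_count / total_docs)
            total_words = sum(self.word_counts[label].values())
            for word, count in features.items():
                if word not in self.vocabulary:
                    continue  # words never seen in training carry no evidence
                seen = self.word_counts[label][word]
                score += count * math.log((seen + 1) / (total_words + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy training data, invented for illustration only.
TRAIN = [
    ("I love this product, it works great", "positive"),
    ("Great support and fast delivery", "positive"),
    ("Terrible experience, the app keeps crashing", "negative"),
    ("I hate the new update, it is awful", "negative"),
]

model = NaiveBayes().fit([text for text, _ in TRAIN], [label for _, label in TRAIN])
print(model.predict("great product"))  # positive
print(model.predict("awful app"))      # negative
```

Passing `binary=True` to `bag_of_words` switches from word frequency to word occurrence, which is the trade-off described in the feature extraction step.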
Can I make it more complex?
The example above, which uses “positive”, “negative”, and “neutral” as outputs, is a question of polarity: where a text falls between two extremes.
We can make the task more complicated by adding granularity:
- “Rate the attitude of this text from 1–5, with 1 being extremely negative and 5 being extremely positive.”
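One simple way to get a 1–5 rating is to start from a continuous polarity score and split its range into five equal bins. The sketch below assumes an upstream model that returns a score between -1.0 (extremely negative) and 1.0 (extremely positive); that score range and the function name `polarity_to_rating` are assumptions made for illustration.

```python
def polarity_to_rating(score):
    """Map a polarity score in [-1.0, 1.0] to an integer rating from 1 to 5."""
    if not -1.0 <= score <= 1.0:
        raise ValueError("score must be between -1.0 and 1.0")
    # Rescale to [0, 5), cut into five equal-width bins, and clamp the
    # top edge (score == 1.0) into the last bin.
    bin_index = min(4, int((score + 1.0) / 2.0 * 5))
    return bin_index + 1


for score in (-1.0, -0.5, 0.0, 0.5, 1.0):
    print(score, polarity_to_rating(score))
```

An alternative is to train a classifier directly on text labeled with 1–5 ratings, which avoids assuming that the five ratings are evenly spaced along the polarity scale.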
We can further complicate things by requesting different types of analysis:
- Identify the source of the attitude
- E.g., is it a customer saying this, a journalist, a competitor, etc.?
- Identify the target of the attitude
- E.g., is the source talking about our product’s latest feature, our customer service, our website, or a recent advert?
- Identify more complex attitudes
- E.g., is the source expressing frustration, desire, regret, hatred, etc.?
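To make these richer outputs concrete, here is a rough sketch of what a single result might look like as a data structure, together with a crude keyword-based way of spotting targets. The `Opinion` class, the `find_targets` function, and the keyword lists are all hypothetical and invented for this example; a trained model would normally replace the keyword matching.

```python
from dataclasses import dataclass

# Hypothetical keyword lists, invented for illustration.
TARGET_KEYWORDS = {
    "customer service": ["support", "customer service", "reply", "agent"],
    "website": ["website", "site", "checkout", "page"],
    "feature": ["feature", "update", "release"],
}


@dataclass
class Opinion:
    source: str    # who holds the attitude, e.g. "customer" or "journalist"
    target: str    # what the attitude is about
    attitude: str  # e.g. "frustration", "desire", "regret"
    text: str      # the original text the opinion was extracted from


def find_targets(text):
    """Return every target whose keywords appear in the text.

    Plain substring matching is crude (it ignores word boundaries and
    inflections) and only serves to illustrate the idea.
    """
    lowered = text.lower()
    return [
        target
        for target, keywords in TARGET_KEYWORDS.items()
        if any(keyword in lowered for keyword in keywords)
    ]


text = "The checkout page froze and support never replied"
opinions = [
    Opinion(source="customer", target=target, attitude="frustration", text=text)
    for target in find_targets(text)
]
for opinion in opinions:
    print(opinion.source, "->", opinion.target, ":", opinion.attitude)
```

Here the source and attitude are filled in by hand; in practice each of those fields would come from its own classifier or from a labeled dataset.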
Where does sentiment analysis struggle?
As with many NLP tasks, the most difficult aspect of sentiment analysis is parsing nuance.
Human language contains plenty of this:
- Sarcasm
- “I love how your customer service takes 3 days to reply!”
- Idiosyncratic phrasing
- “You’re killing me with all these new features”
- Thwarted expectations
- “I loved every minute of it, the acting was top class, and the cinematography stunning, but the whole thing was ruined by the ridiculous twist.”
- These require additional pre-processing steps to ensure they’re included in your analysis
Contextual understanding is often key to understanding subtlety, but correctly processing these types of nuances remains one of the biggest challenges within sentiment analysis, as well as NLP more generally.
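To see why these cases are hard, consider a naive scorer that simply adds up the polarity of individual words. The mini-lexicon below is invented for illustration, and the approach is intentionally simplistic; it is shown to expose the failure mode, not as a recommended method.

```python
import re

# Hypothetical mini-lexicon, invented for illustration.
LEXICON = {
    "love": 1, "loved": 1, "stunning": 1, "top": 1,
    "ruined": -1, "ridiculous": -1, "killing": -1,
}


def lexicon_score(text):
    """Sum per-word polarities; a total above zero is read as 'positive'."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return sum(LEXICON.get(token, 0) for token in tokens)


examples = {
    "sarcasm": "I love how your customer service takes 3 days to reply!",
    "idiom": "You're killing me with all these new features",
    "thwarted": (
        "I loved every minute of it, the acting was top class, and the "
        "cinematography stunning, but the whole thing was ruined by the "
        "ridiculous twist."
    ),
}

for name, text in examples.items():
    print(name, lexicon_score(text))
# sarcasm 1
# idiom -1
# thwarted 1
```

The sarcastic complaint and the disappointed review both score as positive, while the phrase a human would likely read as playful praise scores as negative. Word-level counting has no access to the context that gives each sentence its real meaning.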
The clearest way forward, as is often the case in machine learning, is to generate more relevant labeled training data. The more labeled data you train your model on, the more accurate it will be and the better it will handle nuance.
Sentiment analysis use cases
In line with its versatility, sentiment analysis has a wide range of use cases. Anywhere that users are sharing their thoughts in text, either publicly or directly with your company, can prove to be a treasure trove for sentiment analysis.
Some high-level use case examples:
Determine the overall positivity or negativity of media, product, or service reviews in order to gain an overview of public or critical reception. Companies can use this to assess how new products or features have been received. Review aggregators can also harness this technique, as can ecommerce and travel sites looking to make the most of user-generated content.
As well as gaining an overview of their own offerings, companies can also acquire powerful competitor insights using this method.
Measure consumer confidence, public unrest, and other metrics through social media to understand how people might react to upcoming products or news.
Analyze incoming customer queries to automatically prioritize tickets. You can also identify the targets of customer queries alongside their sentiment to understand where your product is falling short and where it is excelling.
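As a rough sketch of ticket prioritization, the snippet below sorts tickets so that the most negative ones come first. It assumes each ticket already carries a polarity score between -1.0 and 1.0 from some upstream sentiment model; the `Ticket` class, its fields, and the sample scores are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Ticket:
    ticket_id: int
    text: str
    sentiment: float  # assumed polarity score in [-1.0, 1.0] from an upstream model


def prioritize(tickets):
    """Return tickets ordered most negative first; ties keep arrival order."""
    return sorted(tickets, key=lambda ticket: ticket.sentiment)


# Sample tickets with made-up scores, for illustration only.
queue = [
    Ticket(1, "How do I change my email address?", 0.1),
    Ticket(2, "Your app deleted all my data and nobody is answering!", -0.9),
    Ticket(3, "Checkout is a bit slow today", -0.3),
]

for ticket in prioritize(queue):
    print(ticket.ticket_id, ticket.sentiment)
```

In practice sentiment would probably be one signal among several, alongside factors such as ticket age or customer tier.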
Understand how people feel about a political candidate or social issue based on analysis of social media posts and survey responses.
Use public sentiment analysis to inform election predictions, forecast market trends, or anticipate the reaction to breaking political or industry news.
What can I use sentiment analysis for?
If your company generates social media buzz, product reviews, or customer queries, or has insight to gain from competitor analysis in these areas or from wider public sentiment, then sentiment analysis could be a powerful tool for your business.
To begin, you’ll need a highly accurate labeled dataset on which to train your sentiment analysis algorithm. To get that, all you have to do is provide Canotic with your raw text, tell us what sentiments you want extracted, and we’ll take care of the rest.
If you’re interested in learning more or have a specialized use case, reach out to us. You can also stay tuned to our blog, where we’re continuing to run a series of posts covering different aspects of NLP.