How is sentiment analysis used in the real world?

Sentiment analysis is a subfield of natural language processing (NLP) that uses NLP algorithms to identify and classify the emotions or attitudes expressed in text. It allows companies to gauge customer sentiment towards products, brands, or services from the vast corpus of unstructured text available on social media and in user reviews, chatbot conversations, forums, and news.

In a previous post, we explained how the technique works and took a high-level look at its applications. In this post, we’re going to explore some real-world examples of sentiment analysis in more detail, looking at how companies and organizations have successfully deployed the technique to help them achieve their goals. If you want to integrate sentiment analysis in your business, talk to our team to get started.

What makes a good sentiment analysis model?

Before we analyze the real-world use cases in more detail, let’s clarify some of the key evaluation metrics necessary to assess the effectiveness of a sentiment analysis model:

  • Accuracy: the ratio of correctly predicted observations to total observations. It is the most intuitive performance metric, but it is only informative when the classes in your sentiment analysis model are balanced, i.e., the number of observations in each class (e.g., positive and negative sentiment) is roughly the same. When the classes are highly imbalanced, i.e., one class is heavily outnumbered, the metrics below are more useful.
  • Precision: the proportion of observations the model identified as relevant that actually were relevant. It answers the question: how many of the sentiments we classified as positive are actually positive?
  • Recall: the proportion of all relevant observations that the model correctly identified. It expresses the ability of the model to find all relevant instances in a dataset. It answers the question: of all the positive sentiments, how many did we correctly predict as positive?
  • F1 score: in practice, there is usually a trade-off between precision and recall. If both matter for the use case, a single combined metric called the F1 score can be used. It is the harmonic mean of the precision and recall values.
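
To make the four metrics above concrete, here is a minimal Python sketch that computes them from a list of true labels and a list of predicted labels, treating "positive" as the relevant class. The function name `binary_metrics` and the sample labels are made up for illustration; they do not come from any of the projects discussed in this post.

```python
def binary_metrics(y_true, y_pred, positive="positive"):
    """Compute accuracy, precision, recall and F1 for one class of interest."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)  # true positives
    fp = sum(t != positive and p == positive for t, p in pairs)  # false positives
    fn = sum(t == positive and p != positive for t, p in pairs)  # false negatives
    correct = sum(t == p for t, p in pairs)

    accuracy = correct / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical labels: three positive and three negative reviews.
y_true = ["positive", "positive", "positive", "negative", "negative", "negative"]
y_pred = ["positive", "positive", "negative", "positive", "negative", "negative"]

metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

In this toy case the model makes one false positive and one false negative, so accuracy, precision, recall, and F1 all come out to two thirds.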

Rules of thumb:

  • Accuracy works best if false positives (e.g., classifying a negative sentiment as positive) and false negatives (e.g., classifying a positive sentiment as negative) have a similar cost and occur in similar numbers
  • F1 score is best if you have an uneven class distribution
  • Maximize precision if you want to be more confident that the sentiments you predict as positive really are positive
  • Maximize recall if false negatives are more costly than false positives
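
To see why accuracy can mislead on imbalanced data, consider the following small sketch. The data is hypothetical: 95 negative reviews, 5 positive ones, and a "model" that always predicts the majority class.

```python
# Hypothetical, highly imbalanced dataset: 95 negative and 5 positive reviews.
y_true = ["negative"] * 95 + ["positive"] * 5
# A trivial "model" that always predicts the majority class.
y_pred = ["negative"] * 100

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Recall for the positive class: how many real positives did we find?
true_positives = sum(
    t == "positive" and p == "positive" for t, p in zip(y_true, y_pred)
)
recall = true_positives / y_true.count("positive")

print(f"accuracy={accuracy:.2f}, recall={recall:.2f}")
# prints: accuracy=0.95, recall=0.00
```

The accuracy looks excellent, yet the model never finds a single positive review, which is exactly the situation where recall and F1 are more informative.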

AYLIEN: the news intelligence platform

Aylien provides an AI-powered text analytics API which allows businesses to generate insights from the vast amounts of world news. A few applications of their API:

  • Predicting the winner of the Best Picture category at the Oscars: using their API, they pulled news articles from 01/01/19 to 06/02/20 that featured a nominated movie title in their text. They divided the stories by positive and negative sentiment, excluding neutral sentiment stories. They observed spikes in media coverage around the movie release dates. Interestingly, far more attention was given to Once Upon a Time in Hollywood and Joker in terms of the number of news stories. However, the amount of press attention did not point to the eventual winner, Parasite, which gathered comparatively little buzz.
  • Sentiment around the Brexit saga: Aylien used their news API to analyse the sentiment around Brexit between June 24, 2016 (the day the result of the UK’s referendum on whether to leave or remain was announced) and January 31, 2020 (the day the UK officially left the EU). Using a similar approach to the Oscars analysis mentioned above, they observed that the media reaction was overwhelmingly negative. They also analysed the average sentiment regarding Brexit in stories from each EU country.
  • Entity-level sentiment analysis (ELSA): this predicts the sentiment expressed around each entity in a body of text, even when the sentiment about each is different. It allows us to extract more granular insights, and see how every person, organization, and thing is being talked about in the media.
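
The counting step behind an analysis like the Oscars one can be sketched in a few lines of Python. This is only an illustration under our own assumptions: the story records, field names, and the function `count_polar_stories` are hypothetical and are not Aylien's API or data.

```python
from collections import Counter

# Hypothetical stories; a real pipeline would pull these from a news API
# together with a sentiment label for each story.
stories = [
    {"title": "Movie A", "sentiment": "positive"},
    {"title": "Movie A", "sentiment": "negative"},
    {"title": "Movie A", "sentiment": "neutral"},
    {"title": "Movie B", "sentiment": "positive"},
    {"title": "Movie B", "sentiment": "positive"},
]


def count_polar_stories(stories):
    """Count positive and negative stories per title, skipping neutral ones."""
    counts = Counter()
    for story in stories:
        if story["sentiment"] in ("positive", "negative"):
            counts[(story["title"], story["sentiment"])] += 1
    return counts


counts = count_polar_stories(stories)
for (title, sentiment), n in sorted(counts.items()):
    print(title, sentiment, n)
```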

The following example from the Aylien research blog highlights the use of ELSA:

Consider a sentence with three different sentiments expressed about three different entities — “Jeb Bush is ok, but lyin’ Ted Cruz is the worst. He’ll never be as great as your president, Donald J Trump”.

Document-level sentiment analysis would only return a single, negative sentiment. ELSA managed to identify the entities and return a sentiment for each, showing a negative sentiment for Ted Cruz, neutral for Jeb Bush, and positive for Donald J Trump.
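
The difference in output shape can be illustrated with a small Python sketch. The values below are written by hand from the example above and are not the output of any real API; the `EntitySentiment` class and `sentiment_for` helper are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass
class EntitySentiment:
    entity: str
    sentiment: str  # "positive", "neutral" or "negative"


# Document-level analysis collapses the whole sentence into one label.
document_level = "negative"

# Entity-level analysis keeps one label per entity mentioned in the text.
entity_level = [
    EntitySentiment("Jeb Bush", "neutral"),
    EntitySentiment("Ted Cruz", "negative"),
    EntitySentiment("Donald J Trump", "positive"),
]


def sentiment_for(entity, results):
    """Return the sentiment recorded for an entity, or None if it is absent."""
    for result in results:
        if result.entity == entity:
            return result.sentiment
    return None


print(sentiment_for("Ted Cruz", entity_level))
```

With only the document-level label, the neutral and positive opinions in the sentence would be lost.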

The team went on to perform ELSA on coverage of the Google I/O conference and tracked the sentiment around 20 different entities, illustrating a broader and more detailed application of the technique.

Learn more about the process and discoveries on the Aylien research blog.

KPMG: sentiment analysis on sustainability reporting

A sustainability report is published by a company or organization to cover the economic, environmental, and social impacts of its everyday activities. It is vital for communicating sustainability performance and impacts, whether positive or negative. One of the departments at KPMG, a top auditing company, reads clients’ sustainability reports to provide an opinion on whether they can be published. Under Global Reporting Initiative standards, a report is required to be balanced, i.e., to reflect both positive and negative aspects of a company’s performance, so that stakeholders can make a sound assessment of that performance.

The issue is that the report is verified only by a single person in the sustainability department, and it is a matter of their opinion whether the report is balanced or not. Thus, the task was to make this balance measurable (quantitative) for the clients by leveraging sentiment analysis algorithms.
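
One simple way to turn "balance" into a number, once each sentence in a report has a sentiment label, is to look at the share of negative sentences among the non-neutral ones. This is purely our own illustrative assumption; the source does not say which measure KPMG used, and the function name `negative_share` and the labels below are hypothetical.

```python
def negative_share(sentence_labels):
    """Share of negative sentences among non-neutral ones (hypothetical metric).

    Returns None when the report contains no positive or negative sentences.
    """
    positives = sentence_labels.count("positive")
    negatives = sentence_labels.count("negative")
    if positives + negatives == 0:
        return None
    return negatives / (positives + negatives)


# Hypothetical sentence-level labels for one report.
labels = ["positive"] * 6 + ["negative"] * 2 + ["neutral"] * 4

share = negative_share(labels)
print(f"negative share: {share:.2f}")
# prints: negative share: 0.25
```

A reviewer could then compare this figure across reports instead of relying on a single reader's impression.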

A frequent challenge we see at Canotic is handling nuance in data labeling, and this was a problem faced by the team at KPMG, as even negative statements were worded positively:

Companies (shockingly) do not tend to use language like “useless product, waste of money” in their annual or sustainability report, but rather discuss ‘challenges’ and ‘vow to do better’.

Due to this, they couldn’t use existing sentiment analysis solutions or models, as they were trained on the wrong kind of data. Thus, they obtained 8,000 newly labeled “sustainability sentiment” sentences.

They tried the following methods for sentiment analysis with little success:

  • Commercial: Haven OnDemand, Rosetta, Text-processing.com
  • Open source: Stanford Sentiment Treebank, TextBlob
  • Self-trained: TensorFlow neural network pre-trained on 50,000 movie reviews

However, these models disagreed with one another, and most positively worded negative statements were still being predicted as positive.

To overcome these problems, they needed a model that could understand the context better. Hence, they used BERT (Bidirectional Encoder Representations from Transformers), a pre-trained language representation model introduced by Google. More specifically, they used the BERT base model (12 layers in the network) instead of BERT large (24 layers).

They found huge improvements in the per-class (negative, positive, and neutral) classification accuracy, precision, and F score compared to the previous methods.
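
For readers who want to run this kind of per-class comparison themselves, here is a minimal Python sketch that computes precision, recall, and F1 for each of the three classes. The function name `per_class_metrics` and the labels are hypothetical; they are not KPMG's data or code.

```python
def per_class_metrics(y_true, y_pred, classes=("negative", "neutral", "positive")):
    """Compute precision, recall and F1 separately for each class."""
    report = {}
    pairs = list(zip(y_true, y_pred))
    for cls in classes:
        tp = sum(t == cls and p == cls for t, p in pairs)
        fp = sum(t != cls and p == cls for t, p in pairs)
        fn = sum(t == cls and p != cls for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        report[cls] = {"precision": precision, "recall": recall, "f1": f1}
    return report


# Hypothetical labels: one negative sentence is wrongly predicted as positive,
# mimicking a positively worded negative statement.
y_true = ["negative", "negative", "neutral", "neutral", "positive", "positive"]
y_pred = ["negative", "positive", "neutral", "neutral", "positive", "positive"]

report = per_class_metrics(y_true, y_pred)
for cls, scores in report.items():
    print(cls, {name: round(value, 2) for name, value in scores.items()})
```

Here the single misclassified sentence lowers recall for the negative class and precision for the positive class, which is the pattern the team was trying to eliminate.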

Learn more here.

McKinsey City Voices project

The urban-planning department of Brazil recruited McKinsey, a leading consulting firm, to develop a tool called City Voices that captures and analyses citizens’ sentiment across key aspects of city life to help leaders understand what matters most to their constituents.

“McKinsey conducted a thorough study of different citizens and journeys, identified a list of more than 150 different metrics, and then whittled them down to a key 30, which were then subjected to sentiment analysis algorithms to arrive at the insights that could underpin public policy.”

Learn more in McKinsey’s Voices on Infrastructure issue.

Do you know of any creative or powerful examples of sentiment analysis in the real world, or do you need help using sentiment analysis in your business? Reach out to us and let us know.

We’re going to continue exploring various techniques and use cases in the world of NLP over the coming weeks, so stay tuned to our blog to learn more.