Tutorial: How to inspect conflicting polarity scores between TextBlob and VADER in Excel/Google Sheets

To determine which of the two sentiment analysis algorithms (TextBlob or VADER) is more accurate at detecting polarity of posts in your dataset, we suggest examining all or a sample of the polarity scores produced by both libraries to cross-validate results. 

This tutorial provides step-by-step instructions on how to use Excel or Google Sheets to locate and manually review cases where TextBlob and VADER disagree.

Step 1: Download the dataset with the polarity scores 

After Communalytic has finished running a sentiment analysis on your dataset, navigate to the “Download Dataset” section and click on the “Download CSV File” button.

Step 2: Review the polarity scores

Once the CSV file has been downloaded, open it with either Excel or Google Sheets. The CSV file will include different scores generated by the two sentiment analysis libraries: TextBlob and VADER.

For each post, the CSV file will contain the two polarity scores with values between -1 and +1: ‘textblob_polarity’ and ‘vader_sentiment_compound’ (see the screenshot below):

  • Polarity scores close to 0 (usually above -0.05 and below 0.05) represent neutral sentiments.
  • Negative polarity scores (-0.05 or lower) represent negative sentiments. 
  • Positive polarity scores (0.05 or above) represent positive sentiments.
  • Posts written in a non-supported language will have “N/A” under the sentiment analysis-related columns.
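The thresholds in the list above can also be written as a small helper. This is only a sketch: the function name `label_polarity` and the label strings are made up for illustration, and it assumes each score arrives as text, the way a CSV reader would deliver it.

```python
def label_polarity(raw_score, threshold=0.05):
    """Map a polarity score read from the CSV to a sentiment label.

    Assumes the thresholds listed above (+/-0.05); `raw_score` is the
    cell value as a string.
    """
    # Posts in a non-supported language carry "N/A" instead of a number.
    if raw_score is None or raw_score.strip() in ("", "N/A"):
        return "n/a"
    score = float(raw_score)
    if score >= threshold:
        return "positive"
    if score <= -threshold:
        return "negative"
    return "neutral"
```

The same function can be applied to either the ‘textblob_polarity’ or the ‘vader_sentiment_compound’ column, since both use the same -1 to +1 range.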

In addition to the two polarity scores, the file will contain three columns representing additional values produced by VADER: ‘vader_sentiment_negative’, ‘vader_sentiment_neutral’, and ‘vader_sentiment_positive’. They represent the proportions of lexical terms in each English-language post that the VADER algorithm classifies as having a negative, neutral, or positive sentiment. The three ratios will add up to 1 (100%).

For instance: 

Good to Know: According to the VADER documentation, while the normalized and weighted composite score ‘vader_sentiment_compound’ would be sufficient in most use cases, the three sentiment ratios can be useful for the analysis of “the context and presentation of how sentiment is conveyed or embedded in rhetoric”. For example, when analyzing news coverage on a given topic, a researcher may check if the sentiment ratios for each news story “are balanced with similar amounts of positively and negatively framed text versus being “biased” towards one polarity or the other.” 

However, as also noted in the documentation, the main limitation with the sentiment ratios is that they “do not account for the VADER rule-based enhancements such as word-order sensitivity for sentiment-laden multi-word phrases, degree modifiers, word-shape amplifiers, punctuation amplifiers, negation polarity switches, or contrastive conjunction sensitivity”.
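As a quick sanity check on the three ratio columns, the sketch below tests whether they add up to roughly 1 for a given row. The function name `check_vader_ratios` and the tolerance value are assumptions made for illustration; the tolerance is there only to absorb rounding in the exported values.

```python
import math

RATIO_COLUMNS = (
    "vader_sentiment_negative",
    "vader_sentiment_neutral",
    "vader_sentiment_positive",
)

def check_vader_ratios(row, tolerance=0.01):
    """Return True if the three VADER ratio columns sum to roughly 1.

    `row` is a dict mapping column names to cell values (strings or numbers).
    """
    total = sum(float(row[name]) for name in RATIO_COLUMNS)
    return math.isclose(total, 1.0, abs_tol=tolerance)

# Made-up values, for illustration only.
example_row = {
    "vader_sentiment_negative": "0.0",
    "vader_sentiment_neutral": "0.58",
    "vader_sentiment_positive": "0.42",
}
ratios_ok = check_vader_ratios(example_row)
```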

Step 3: Select posts in English

Since VADER can only analyze posts in English, you can exclude non-English posts by using the Filter option in Excel as shown below. (Google Sheets has a similar filtering feature.) 
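For readers who prefer a script to the spreadsheet filter, the sketch below keeps only the rows that received a numeric VADER score. It relies on the “N/A” marker described in Step 2 rather than on a dedicated language column, which is an assumption about how your file flags unsupported posts. The function name `supported_rows` and the sample values are invented.

```python
import csv
import io

def supported_rows(rows):
    """Keep rows whose VADER compound score is numeric, i.e. not "N/A"."""
    return [
        row for row in rows
        if row["vader_sentiment_compound"].strip() not in ("", "N/A")
    ]

# Tiny in-memory stand-in for the downloaded CSV (made-up values).
sample_csv = io.StringIO(
    "text,textblob_polarity,vader_sentiment_compound\n"
    "Great news today,0.8,0.6249\n"
    "Quelle belle journée,N/A,N/A\n"
)
rows = list(csv.DictReader(sample_csv))
english_rows = supported_rows(rows)
print(len(english_rows))  # prints 1
```

To run this against the real file, replace `sample_csv` with an opened file handle for the CSV downloaded in Step 1.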

Step 4: Remove or hide duplicate posts

Prior to comparing the polarity scores, we recommend removing or hiding duplicate posts such as retweets. This may help to reduce the number of posts that need to be reviewed manually. 

  1. To remove/hide retweets in a Twitter dataset, use the Filter feature in Excel/Google Sheets and uncheck the “retweeted” value under the “referenced_tweet_type” column. 
  2. To exclude duplicate posts in any dataset that is not from Twitter, use the Remove Duplicates feature in Excel/Google Sheets as shown in the screenshot below. 
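The two options above can be approximated in one pass with a script. This is a sketch under stated assumptions: duplicates are detected by exact match on the ‘text’ column (the spreadsheet's Remove Duplicates feature may compare other columns, depending on what you select), and the function name `drop_duplicates` is hypothetical.

```python
def drop_duplicates(rows):
    """Drop retweets, then keep only the first row for each distinct text.

    `rows` is a list of dicts, e.g. as produced by csv.DictReader.
    """
    seen_texts = set()
    unique_rows = []
    for row in rows:
        # Non-Twitter datasets may lack this column; .get() then returns "".
        if row.get("referenced_tweet_type", "") == "retweeted":
            continue
        key = row["text"].strip()
        if key in seen_texts:
            continue
        seen_texts.add(key)
        unique_rows.append(row)
    return unique_rows
```

Unlike hiding rows with a filter, this returns a new list and leaves the original rows untouched, so the full dataset stays available for other analyses.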

Step 5: Identify posts where VADER and TextBlob assigned opposite polarity scores

Use the Filter feature in Excel/Google Sheets to select and review posts where VADER assigned positive polarity scores and TextBlob assigned negative ones. Specifically, select rows with the ‘vader_sentiment_compound’ score greater than or equal to 0.05, and the ‘textblob_polarity’ score less than or equal to -0.05.

After applying the two filters as discussed above, review the posts (under the ‘text’ column) that remain visible in the table (see Steps 6 and 7 below for more details about this process). 

Next, to select and review posts where VADER assigned negative polarity scores and TextBlob assigned positive ones, change the Filter settings as follows: ‘vader_sentiment_compound’ less than or equal to -0.05, and ‘textblob_polarity’ greater than or equal to 0.05.
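Both filter combinations from this step can be expressed in a short script. The sketch below is illustrative rather than part of Communalytic: the helper names `to_score` and `find_disagreements` are made up, and rows with “N/A” or blank scores are simply skipped.

```python
def to_score(raw):
    """Convert a CSV cell to float, or None for blank and "N/A" cells."""
    raw = (raw or "").strip()
    return None if raw in ("", "N/A") else float(raw)

def find_disagreements(rows, threshold=0.05):
    """Split rows into the two opposite-polarity groups described in Step 5."""
    vader_pos_textblob_neg = []
    vader_neg_textblob_pos = []
    for row in rows:
        vader = to_score(row["vader_sentiment_compound"])
        textblob = to_score(row["textblob_polarity"])
        if vader is None or textblob is None:
            continue  # no usable score from one of the libraries
        if vader >= threshold and textblob <= -threshold:
            vader_pos_textblob_neg.append(row)
        elif vader <= -threshold and textblob >= threshold:
            vader_neg_textblob_pos.append(row)
    return vader_pos_textblob_neg, vader_neg_textblob_pos

# Made-up rows, for illustration only.
sample_rows = [
    {"text": "post A", "textblob_polarity": "-0.4", "vader_sentiment_compound": "0.6"},
    {"text": "post B", "textblob_polarity": "0.5", "vader_sentiment_compound": "-0.3"},
    {"text": "post C", "textblob_polarity": "0.2", "vader_sentiment_compound": "0.4"},
    {"text": "post D", "textblob_polarity": "N/A", "vader_sentiment_compound": "N/A"},
]
group_one, group_two = find_disagreements(sample_rows)
```

Here `group_one` holds the posts VADER scored as positive and TextBlob as negative, and `group_two` holds the reverse case; posts where both libraries agree or either one is neutral fall into neither group.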

Step 6: (Optional) Apply Conditional Formatting in Excel to highlight negative and positive scores for a visual examination of the data.

Step 7: Read and adjudicate posts where the two algorithms disagree

The following table shows three sample tweets where TextBlob assigned negative polarity scores and VADER assigned positive polarity scores. To adjudicate these cases where TextBlob and VADER disagree, ask human reviewers to read each post and decide which of the two algorithms produced a more accurate result. For example,

  1. After reading the first post, a human reviewer may agree with TextBlob’s evaluation since the post contains an insult (“morons”) and sarcasm. 
  2. The second tweet is a special case since it contains both slightly positive (“Look at other countries. They’re achieving what you think is impossible in Canada.”) and negative undertones (“We will be dinosaurs using fossil fuels while the world passes us by.”), but the overall sentiment seems to be more negative than positive. 
  3. The third tweet appears to be in positive territory, so a reviewer may agree with VADER’s assessment here.