Tutorial: Toxicity Analysis

This tutorial will demonstrate how to run the toxicity analysis on the data that you have collected.

Step 1

Once a data collection has been completed, you’ll be able to access the dataset by clicking on your dataset name underneath the “Dataset Name” tab on your homepage.

Step 2

On the top left side, you should see an overview of the dataset you’ve selected. If you scroll further down the page, basic visualizations of the dataset are also provided.

Now, click the “Toxicity Analysis” button on the left to begin your toxicity analysis.

Step 3

To start the toxicity analysis, click on the ‘Start Analysis’ button. 

The Perspective API can currently analyze posts in the following languages: Arabic, Chinese, Czech, Dutch, English, French, German, Hindi, Hinglish, Indonesian, Italian, Japanese, Korean, Polish, Portuguese, Russian, and Spanish. Posts that are not written in one of the supported languages will be skipped.

Please note that a Google Perspective API key is required to start the toxicity analysis. If you do not have one, see our Google Perspective API Key Tutorial to learn how to obtain it.

If you have a new API key, enter it in the “My Profile” section as demonstrated in our previous tutorial. Then, return to this page. If you would like to be notified when your toxicity analysis is complete, select “Email me once job is complete”. Then click “Start Analysis” to start the toxicity analysis.

Step 4

From here, you can track the progress of your toxicity analysis. Google’s Perspective API has a rate limit of 100 queries per 100 seconds. Because of this, larger datasets take longer to process. For example, a dataset with 1,000 records needs at least 1,000 seconds (about 17 minutes) under this limit, so expect it to take approximately 20 minutes to process.
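To estimate the wait for your own dataset, the rate limit above can be turned into a quick calculation. The sketch below is only an illustration: the function name `estimate_minutes` is hypothetical, and it returns a lower bound based on the rate limit alone, so the actual processing time will be somewhat longer.

```python
def estimate_minutes(num_posts: int,
                     queries_per_window: int = 100,
                     window_seconds: int = 100) -> float:
    """Lower bound on processing time in minutes, given the rate limit.

    Defaults reflect the limit quoted in this tutorial
    (100 queries per 100 seconds). Real runs take longer because of
    network and processing overhead, which this sketch ignores.
    """
    if num_posts < 0:
        raise ValueError("num_posts must be non-negative")
    total_seconds = num_posts / queries_per_window * window_seconds
    return total_seconds / 60


# A dataset with 1,000 records needs at least about 16.7 minutes.
print(round(estimate_minutes(1000), 1))
```

By the same calculation, a dataset ten times larger would need at least ten times as long.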

Step 5

Once the toxicity analysis is complete, look for the blue button under the Toxicity Analysis column on your home page. Click on it to see your results. 

Step 6

The first table provides a summary of your toxicity analysis results. Each row represents a different type of toxicity, such as “toxicity”, identity attack, insult, profanity, or sexually explicit content. “Toxicity” is a general model that considers all instances of toxicity. The toxicity values are calculated by Google’s Perspective API.
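For readers curious about where these values come from, here is a minimal sketch of what a Perspective API request and response look like for the toxicity types named above. It makes no network call. The helper names `build_request` and `summary_scores` and the sample response are hypothetical, and the exact set of attributes this tool requests is an assumption.

```python
import json

# Perspective API attribute names matching the toxicity types listed above.
# Assumption: the tool may request a different or larger set.
ATTRIBUTES = ["TOXICITY", "IDENTITY_ATTACK", "INSULT",
              "PROFANITY", "SEXUALLY_EXPLICIT"]


def build_request(text, attributes=ATTRIBUTES):
    """Build the JSON body for a Perspective 'comments:analyze' request."""
    return {
        "comment": {"text": text},
        "requestedAttributes": {name: {} for name in attributes},
    }


def summary_scores(response):
    """Extract one summary score (0 to 1) per attribute from a response."""
    scores = response.get("attributeScores", {})
    return {name: data["summaryScore"]["value"]
            for name, data in scores.items()}


# Made-up response, for illustration only.
sample_response = {
    "attributeScores": {
        "TOXICITY": {"summaryScore": {"value": 0.12, "type": "PROBABILITY"}}
    }
}

print(json.dumps(build_request("example post"), indent=2))
print(summary_scores(sample_response))
```

Each score is a value between 0 and 1, where a higher value means the post is more likely to be perceived as having that attribute.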

Step 7

For each of the toxicity types, you can click to see the 10 posts with the highest values.

For instance, clicking on the highest value of “Toxicity” will show you the top 10 posts with the highest toxicity values in your dataset. The same applies to the 10 lowest values.
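If you export your results and want to reproduce these lists yourself, the selection can be sketched as follows. The record layout (a list of dictionaries with a `toxicity` field) and the function name `top_and_bottom` are assumptions for illustration, not the tool's actual export format.

```python
import heapq


def top_and_bottom(posts, attribute="toxicity", n=10):
    """Return the n highest-scoring and n lowest-scoring posts."""
    highest = heapq.nlargest(n, posts, key=lambda p: p[attribute])
    lowest = heapq.nsmallest(n, posts, key=lambda p: p[attribute])
    return highest, lowest


# Made-up scores for 20 posts, for illustration only.
posts = [{"id": i, "toxicity": i / 20} for i in range(20)]

highest, lowest = top_and_bottom(posts)
print([p["id"] for p in highest])
print([p["id"] for p in lowest])
```

Passing a different `attribute` (for example, a hypothetical `insult` field) would rank the posts by that toxicity type instead.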

Step 8

If you return to your toxicity analysis, you’ll notice that the lower portion of the page is a list of interactive charts visualizing your results. For example, the first tab shows the distribution of toxicity values found within your dataset. This chart is also interactive, allowing you to customize it to your liking.
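As a rough idea of what a distribution chart summarizes, the sketch below counts how many scores fall into each of ten equal-width ranges between 0 and 1. The function name `score_histogram` and the choice of ten bins are assumptions; the tool's own chart may group values differently.

```python
from collections import Counter


def score_histogram(scores, bins=10):
    """Count scores in equal-width bins over the range 0 to 1.

    A score of exactly 1.0 is placed in the last bin.
    """
    counts = Counter(min(int(s * bins), bins - 1) for s in scores)
    return [counts[i] for i in range(bins)]


# Made-up toxicity values, for illustration only.
hist = score_histogram([0.05, 0.15, 0.95, 1.0])
print(hist)
```

A dataset with mostly low toxicity would show most of its posts in the first few bins.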

Now it’s time for you to explore the Toxicity Analysis tool for yourself! You can find out more about the Perspective API here.