What common words are excluded from the Word Cloud?

The Word Cloud visualization shows the 100 most frequently used words in your dataset, excluding common function words such as ‘a’, ‘to’, and ‘the’ (also known as ‘stop words’).

To exclude common words, Communalytic relies on a combined dictionary of 6,395 stop words (plus the word ‘RT’) drawn from the 22 languages listed below:

  • Arabic
  • Bulgarian
  • Catalan
  • Czech
  • Danish
  • Dutch
  • English
  • Finnish
  • French
  • German
  • Hungarian
  • Indonesian
  • Italian
  • Norwegian
  • Polish
  • Portuguese
  • Romanian
  • Russian
  • Spanish
  • Swedish
  • Turkish
  • Ukrainian

The stop words were compiled and provided by the Python Stop Words library.

As of Nov. 25, 2022, we’ve added a list of stop words in Persian.
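
To illustrate the general idea of stop-word filtering described above, here is a minimal Python sketch. It is not Communalytic’s actual implementation: the function name `top_words`, the tokenization rule, and the tiny `STOP_WORDS` sample are all assumptions made for illustration. In practice the set would be replaced by the full combined dictionary (for example, lists loaded from the Python Stop Words library).

```python
import re
from collections import Counter

# Hypothetical, tiny sample for illustration only; the real combined
# dictionary contains thousands of entries across many languages.
STOP_WORDS = {"a", "to", "the", "rt", "le", "la", "de"}


def top_words(posts, stop_words=STOP_WORDS, limit=100):
    """Count words across posts, skipping stop words, and return the
    `limit` most frequent (word, count) pairs."""
    counts = Counter()
    for post in posts:
        # Lowercase so that 'The' and 'the' are treated as the same word.
        # \w+ matches runs of letters, digits, or underscores in any script.
        for token in re.findall(r"\w+", post.lower()):
            if token not in stop_words:
                counts[token] += 1
    return counts.most_common(limit)


posts = [
    "RT The cat sat next to the mat",
    "A cat chased the dog to the park",
]
print(top_words(posts, limit=3))
```

In this sample, ‘rt’, ‘the’, ‘to’, and ‘a’ are dropped before counting, so ‘cat’ (which appears in both posts) ranks first.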