Frequently Asked Questions

What is Communalytic?

Communalytic is a research tool for studying online communities and online discourse. Communalytic can collect and analyze public data from various social media platforms including Reddit, Twitter, and Facebook/Instagram (via CrowdTangle). It uses advanced text and social network analysis techniques to automatically pinpoint toxic and anti-social interactions, identify influencers, map shared interests and the spread of misinformation, and detect signs of possible coordination among seemingly disparate actors.

There are two versions of Communalytic:

  • Communalytic Edu is designed for educators and students to teach and learn about social media data analytics and social network analysis.
  • Communalytic Pro is designed for the academic research community and is ideal for large scale academic research projects. It provides researchers with the resources and infrastructure necessary for conducting independent research in the public interest.

Examples of research inquiries that Communalytic can enable:

    • What are the prevalence and types of toxic and anti-social interactions observed in online discourse? How does the presence of toxic interactions change the network structure of an online community over time? And what is their potential impact on the conversation, the community, and its members?
    • How effective is a social media platform at combatting online toxicity and harassment? How closely are platforms’ content moderators (human and AI) adhering to their employer’s publicly stated content moderation policy?
    • What types of accounts tend to share misinformation (e.g., links to questionable sources)? Are these actions likely to be coordinated in some way? And what narratives are these accounts trying to inject into an online discussion?
    • Can social media platforms facilitate informal learning and knowledge exchange among community members, and if so, how?
    • Who are the opinion leaders or influential voices in an online discussion?

Edu Version - For Teaching & Learning

FREE with a university email address

Communalytic Edu is designed for students and is ideal for teaching and learning about social media analytics. All Communalytic Edu accounts can store up to 30K records across 3 datasets and have the following platform-specific data usage caps.

Subreddit (Live Data Collector) – Communalytic Edu can collect up to the 100 most recent submissions (i.e., thread-starting posts) and any new submissions (including the corresponding comments and replies to comments) from a given public subreddit for up to 7 consecutive days going forward, starting from the date when you initiated the data collection. To use this collector, you do not need to apply for a separate Reddit API key, as Communalytic Edu is using a site-wide API key at this time.

Note 1: Comments on Reddit submissions and replies to comments are only collected at the end of the specified data collection period. If a comment or a reply has been deleted by the moderator(s) or the poster prior to the end date of your data collection, it will not be included in the final dataset.

Note 2: Communalytic Edu will try to collect any new submissions within the specified data collection period; however, some posts in “high volume” groups (such as r/all) may be dropped due to Reddit API limitations.

Note 3: Communalytic Edu does not collect posts from subreddits with 10 million or more subscribers, such as r/AskReddit. If you need to collect data from subreddits with 10 million or more subscribers, please check out Communalytic Pro.

Twitter Thread – Communalytic Edu will collect the most recent public replies (up to 10K) to any public tweet posted within the previous 7 days. This data collection feature is ideal for studying recent tweets that have attracted a high level of engagement (for example, tweets from politicians, celebrities, or news outlets). To use this collector, you will need to apply for a Twitter Developer account.

Twitter Academic Research Track – N/A. This data collection feature is only available in Communalytic Pro.

CrowdTangle (Facebook/Instagram) – Communalytic Edu can collect public Facebook/Instagram posts (up to 10K) that shared the same URL (e.g., a URL to a single NYT story or the URL of any domain name). To use this collector, you will need to apply for academic access to Facebook’s CrowdTangle platform.

Note 1: CrowdTangle data is not exhaustive; it only tracks public posts made by “influential” accounts. Here’s more info about the types of Facebook/Instagram accounts that CrowdTangle indexes.

Yes, with Communalytic Edu, you can run 1 Reddit, 1 Twitter and 1 CrowdTangle data collection simultaneously.

You can store ≤ 30K records shared across 3 datasets at any time in your Communalytic Edu account (i.e. per account, you can have 1 dataset with ≤ 30K records or up to 3 datasets with a variable number of records not exceeding 30K records in total).

If you’re at your account limit, you can download your previously collected datasets to free up space.

Alternatively, if you need more capacity, consider upgrading to Communalytic Pro, where you can collect and store ≤ 10M records shared across ≤ 50 datasets.
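The storage rule above combines two conditions: a cap on the number of datasets and a cap on the total number of records. The following minimal sketch shows how you might check a planned set of datasets against both caps before starting a collection. The limits are the ones quoted in this FAQ; the names `Quota`, `EDU`, `PRO`, and `fits_quota` are hypothetical and are not part of Communalytic.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Quota:
    """Per-account storage caps (hypothetical helper, not a Communalytic API)."""
    max_records: int
    max_datasets: int


# Limits as stated in this FAQ.
EDU = Quota(max_records=30_000, max_datasets=3)
PRO = Quota(max_records=10_000_000, max_datasets=50)


def fits_quota(dataset_sizes: Sequence[int], quota: Quota) -> bool:
    """Return True if the datasets respect both the dataset cap and the record cap."""
    within_dataset_cap = len(dataset_sizes) <= quota.max_datasets
    within_record_cap = sum(dataset_sizes) <= quota.max_records
    return within_dataset_cap and within_record_cap


if __name__ == "__main__":
    planned = [12_000, 9_500, 8_000]  # illustrative record counts per dataset
    print("Fits Edu:", fits_quota(planned, EDU))
    print("Fits Pro:", fits_quota(planned, PRO))
```

A plan can fail on either condition independently: four small datasets exceed the Edu dataset cap even when their combined record count is far below 30K.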

Data/API access is granted solely at the discretion of the platforms. We recommend that you apply in advance to the platform(s) of your choice for API access.

Subreddit – No. Historical data collection is only available in Communalytic Pro.

Twitter Thread – Yes, you can collect historical tweets from a Twitter thread, as long as the tweets are public and posted within the past 7 days from when you started your data collection.

Twitter Academic Research Track – N/A. This data collection feature is only available in Communalytic Pro.

CrowdTangle (Facebook/Instagram) – Yes, you can collect public historical Facebook/Instagram posts.

————————————————————

Also see Communalytic Edu FAQ: “What are the parameters for data collection?”

No. You cannot use Communalytic Edu to collect private data, such as DMs or posts from accounts that are set to private.

The developers of Communalytic Edu are proponents of ethical computational social science research in the public interest. All data/API access in Communalytic Edu is granted solely at the discretion of the platforms via public APIs provided by the platforms. If you are working with social media data, we encourage you to review and follow ethical guidelines and best practices established by your institution. 

As a primer, please review this excellent resource by the Association of Internet Researchers (AOIR): Ethical Decision-Making and Internet Research Recommendations

We’ll keep your datasets on our server for 100 days from the end of your collection date. 

You will receive a notification 3 weeks before the expiration date and 3 days before your dataset is automatically deleted from our system.
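The retention schedule above comes down to simple date arithmetic. As a rough sketch, the snippet below derives the deletion date and the two reminder dates from the end of a collection. It assumes that the "expiration date" and the automatic deletion date are the same day, and that days are counted as calendar days. The function name `edu_retention_dates` is hypothetical.

```python
from datetime import date, timedelta

RETENTION_DAYS = 100                  # Edu retention period stated in this FAQ
FIRST_NOTICE = timedelta(weeks=3)     # first reminder before deletion
FINAL_NOTICE = timedelta(days=3)      # final reminder before deletion


def edu_retention_dates(collection_end: date) -> dict:
    """Estimate key dates for an Edu dataset (assumes expiration == deletion date)."""
    deletion = collection_end + timedelta(days=RETENTION_DAYS)
    return {
        "deletion": deletion,
        "first_notice": deletion - FIRST_NOTICE,
        "final_notice": deletion - FINAL_NOTICE,
    }


if __name__ == "__main__":
    for label, day in edu_retention_dates(date(2021, 1, 1)).items():
        print(f"{label}: {day.isoformat()}")
```

Treat the result as an estimate for planning when to download a dataset. The notification emails from Communalytic remain the authoritative source for the actual dates.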

Yes, just download your dataset as a CSV file from Communalytic Edu and then upload the file to your Communalytic Pro account.

Yes, you can download your datasets as CSV files. In addition, you can download the resulting communication or semantic network files as GraphML files.

Yes, you can upload an existing dataset (in CSV format) for analysis in Communalytic Edu.

Gruzd, A., & Mai, P. (2021). Communalytic: A Research Tool For Studying Online Communities and Online Discourse. Available at https://Communalytic.com

Note: For information on how to properly describe Communalytic Edu data collection processes, see the FAQ item on “What are the parameters for data collection?”

Pro Version - For Research 

$349.00 for a 6-month subscription to support site infrastructure (server-side data collection, storage, processing, analysis and visualization)

Communalytic Pro is designed for academic researchers and is ideal for large scale academic research projects. All Communalytic Pro accounts can store up to 10M records across 50 datasets and have the following platform-specific data usage caps.

Subreddit (Historical and Live Data Collectors) – Communalytic Pro can collect available posts (including submissions, comments and replies to comments) from a given public subreddit for up to 31 consecutive days. The 31-day period can be any 31 days in the past (aka Historical) or 31 days going forward, starting from the date when you initiated the data collection (aka Live). You can repeat this process until you have the data you need for the entire period you wish to study. You can also download and combine the resulting CSV files and upload the new file for analysis. To use this collector, you do not need to apply for a separate Reddit API key, as Communalytic Pro is using a site-wide API key at this time.

Note 1: Comments on Reddit submissions and replies to comments are only collected at the end of the specified data collection period. If a comment or a reply has been deleted by the moderator(s) or the poster prior to the end date of your data collection, it will not be included in the final dataset.

Note 2: Communalytic Pro will try to collect any new submissions within the specified data collection period; however, some posts in “high volume” groups (such as r/all) may be dropped due to Reddit API limitations.
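Because each Subreddit collection in Communalytic Pro covers at most 31 consecutive days, a longer study period has to be split into several collections whose CSV exports are then merged. The sketch below illustrates one way to plan the windows and combine the downloaded files. It assumes that the 31-day limit counts both the start and end day, and that every exported CSV shares an identical header row. The function names and file names are hypothetical, not part of Communalytic.

```python
import csv
from datetime import date, timedelta

MAX_WINDOW_DAYS = 31  # per-collection limit stated above (assumed inclusive)


def split_into_windows(start: date, end: date, max_days: int = MAX_WINDOW_DAYS):
    """Split the inclusive range [start, end] into consecutive windows of at most max_days."""
    if end < start:
        raise ValueError("end must not be earlier than start")
    windows = []
    cursor = start
    while cursor <= end:
        window_end = min(cursor + timedelta(days=max_days - 1), end)
        windows.append((cursor, window_end))
        cursor = window_end + timedelta(days=1)
    return windows


def combine_csv_files(paths, out_path):
    """Concatenate CSV files that share one header; return the number of data rows written."""
    header = None
    rows_written = 0
    with open(out_path, "w", newline="", encoding="utf-8") as out:
        writer = csv.writer(out)
        for path in paths:
            with open(path, newline="", encoding="utf-8") as src:
                reader = csv.reader(src)
                file_header = next(reader, None)
                if file_header is None:
                    continue  # skip empty files
                if header is None:
                    header = file_header
                    writer.writerow(header)  # write the header only once
                elif file_header != header:
                    raise ValueError(f"{path} has a different header")
                for row in reader:
                    writer.writerow(row)
                    rows_written += 1
    return rows_written


if __name__ == "__main__":
    # Plan the collections needed to cover an illustrative three-month period.
    for first_day, last_day in split_into_windows(date(2021, 1, 1), date(2021, 3, 31)):
        print(first_day.isoformat(), "to", last_day.isoformat())
```

The merge step does not remove duplicate rows. If your collection windows overlap, deduplicate the combined file before uploading it.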

Twitter Thread – Communalytic Pro will collect the most recent public replies (up to 500K) to any public tweet posted within the previous 7 days. This data collection feature is ideal for studying recent tweets that have attracted a high level of engagement (for example, tweets from politicians, celebrities, or news outlets). To use this collector, you will need to apply for a Twitter Developer account.

Note 1: All standard Twitter Developer accounts come with a monthly tweet cap of 500K posts, as indicated in your Twitter Developer Dashboard.

Note 2: If you are a qualified academic researcher with access to a Twitter Academic Research Track account, you will be able to collect historical tweets and replies to tweets that are still publicly available. You will not be subject to the 500K monthly tweet cap, nor limited to tweets posted within the previous 7 days.

Twitter Academic Research Track – Communalytic Pro can collect up to 10M tweets per month via Twitter’s full-archive (historical) search. To use this collector, you will need to apply for a Twitter Academic Research Track account (available only to qualified academic researchers).

CrowdTangle (Facebook/Instagram) – Communalytic Pro can collect public Facebook/Instagram posts that shared the same URL (e.g., a URL to a single NYT story or the URL of any domain name). To use this collector, you will need to apply for academic access to Facebook’s CrowdTangle platform.

Note 1: CrowdTangle data is not exhaustive; it only tracks public posts made by “influential” accounts. Here’s more info about the types of Facebook/Instagram accounts that CrowdTangle indexes.

Yes, with Communalytic Pro, you can run 2 Reddit, 1 Twitter and 1 CrowdTangle data collections simultaneously.

You can store ≤ 10M records shared across 50 datasets at any time in your Communalytic Pro account (i.e. per account, you can have one dataset with 10M records or up to 50 datasets with a variable number of records not exceeding 10M records in total).

If you’re at your account limit, you can download your previously collected datasets to free up space.

Alternatively, if you know that you are likely to exceed either the 50-dataset cap or the 10M-record cap per account, you have the option to create a second Pro account using a different email address.

Data/API access is granted solely at the discretion of the platforms. We recommend that you apply in advance to the platform(s) of your choice for API access.

Subreddit – Yes. You can collect historical posts from any subreddit for any period in the past.

Twitter Thread – Yes, you can collect historical tweets from a Twitter thread, as long as the tweets are public and posted within the past 7 days from when you started your data collection. (If you have been granted access to the Twitter Academic Research Track, you will not be subject to the 7-day limit.)

Twitter Academic Research Track – Yes, you can collect historical tweets via the Twitter Academic Research Track archive endpoint.

Facebook/Instagram (via CrowdTangle API) – Yes, you can collect public historical Facebook/Instagram posts.

————————————————————

Also see Communalytic Pro FAQ: “What are the parameters for data collection?”

No. You cannot use Communalytic Pro to collect private data, such as DMs or posts from accounts that are set to private.

The developers of Communalytic Pro are proponents of ethical computational social science research in the public interest. All data/API access in Communalytic Pro is granted solely at the discretion of the platforms via public APIs provided by the platforms. If you are working with social media data, we encourage you to review and follow ethical guidelines and best practices established by your institution. 

As a primer, please review this excellent resource by the Association of Internet Researchers (AOIR): Ethical Decision-Making and Internet Research Recommendations

We’ll keep your datasets as long as your paid tier has not expired. You can extend your tier at any time for another 6 months via the My Profile menu.

You will receive a notification 7 days before your account’s expiration date. After your account has expired, you will have 14 days to upgrade it before your account and datasets are automatically removed from our system.

No, the file upload feature is only available in Communalytic Pro.

Yes, you can download your datasets as CSV files.

You can also download the resulting communication or semantic network files as GraphML files.

Yes, you can upload/import an existing dataset (in CSV format) for analysis in Communalytic Pro.

(NEW!) You can now also upload/import an existing Twitter dataset from multiple JSON files.

Gruzd, A., & Mai, P. (2021). Communalytic: A Research Tool For Studying Online Communities and Online Discourse. Available at https://Communalytic.com

Note: For information on how to properly describe Communalytic Pro data collection processes, see the FAQ item on “What are the parameters for data collection?”