Tutorial: Data Collection from Reddit

In this tutorial, we will show you how to collect data from Reddit using Communalytic. The procedure is similar for the EDU and PRO versions.

Step 1

If you know what subreddit you would like to examine, proceed to Step 4 below.

Start by finding a subreddit from your “My Datasets” homepage. What is a subreddit, you may ask? Subreddits are online groups or forums on Reddit, each dedicated to a specific topic.

To begin finding a subreddit, click “Search for relevant subreddits” under “Step 1: Finding a subreddit”. This takes us to the subreddit search page.

Step 2

Here we can locate subreddits that discuss a given topic by using the “Keyword” search bar. Note that when searching for a subreddit, a space between words is treated as AND. If you would like to search for two keywords separately (one or the other rather than both together), separate them with “|”.

After typing in your keywords, click the “Search” button.
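To make the keyword rule from this step concrete, here is a minimal Python sketch of how such a query could be read. It is an illustration only, not Communalytic's actual search code: the function names parse_query and matches are hypothetical, and the sketch assumes that “|” separates alternatives, that the words within each alternative are combined with AND, and that matching is a simple case-insensitive substring check.

```python
def parse_query(query: str) -> list[list[str]]:
    """Split a query into alternatives ("|"), each a list of AND-ed words.

    Assumption: "|" separates alternatives and spaces mean AND,
    as described in the tutorial text; this is not Communalytic's code.
    """
    groups = []
    for alternative in query.split("|"):
        words = alternative.lower().split()
        if words:  # skip empty alternatives such as a trailing "|"
            groups.append(words)
    return groups


def matches(query: str, text: str) -> bool:
    """Return True if any alternative has all of its words present in text."""
    haystack = text.lower()  # simple case-insensitive substring matching
    return any(
        all(word in haystack for word in group)
        for group in parse_query(query)
    )


if __name__ == "__main__":
    print(parse_query("climate change|politics"))
    print(matches("climate change", "Climate news and discussion"))
    print(matches("climate|change", "Climate news and discussion"))
```

Under these assumptions, “climate change” requires both words to appear, while “climate|change” is satisfied by either one.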

Step 3

This page lists the active subreddits that match our keyword, in this example “politics”, along with the number of submissions in each over the past 7 days. The subreddit you searched for should be visible here; in this example, it is “politics”. Click “Start Collection on..” for the corresponding subreddit to continue to the collection setup described in Step 4.

Step 4

Before starting your data collection, you must name your dataset and select the time range for data collection. Communalytic will only collect data within the time range you specify. On Communalytic EDU, you can collect data for up to 7 consecutive days. The default start date is today, and the start cannot be set to a past date or time.
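If you plan your collection window before filling in the form, the two EDU constraints above can be checked with a short Python sketch. This is only a planning aid under stated assumptions: check_edu_window is a hypothetical helper, not part of Communalytic, and it reads “up to 7 consecutive days” as a window of at most 7 × 24 hours between start and end.

```python
from datetime import datetime, timedelta

# From the tutorial: EDU collections can run for up to 7 consecutive days.
# Assumption: this is interpreted here as at most 7 * 24 hours.
EDU_MAX_WINDOW = timedelta(days=7)


def check_edu_window(start: datetime, end: datetime, now: datetime) -> list[str]:
    """Return a list of problems with a planned window (empty if none found)."""
    problems = []
    if start < now:
        problems.append("start is in the past")
    if end <= start:
        problems.append("end must be after start")
    if end - start > EDU_MAX_WINDOW:
        problems.append("window is longer than 7 days")
    return problems


if __name__ == "__main__":
    now = datetime.now()
    # A window starting now and ending in 3 days should report no problems.
    print(check_edu_window(now, now + timedelta(days=3), now))
    # A window of 8 days exceeds the EDU limit.
    print(check_edu_window(now, now + timedelta(days=8), now))
```

The current time is passed in as an argument so the check gives the same answer each time it is run with the same inputs.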

Data collection time varies by subreddit. Check the “Email me once job completes” box to receive an email notification when the collection finishes. Please note that posts in “high volume” groups such as r/all may be dropped due to Reddit API limitations.

As a final step on this page, click the “Start Collection” button.

Step 5

To confirm that data collection is underway, you should be able to see your new dataset listed on the “My Datasets” page.

When your data collection is complete, it will say “Complete” under Status, as pictured above.