
Accessing Digital Archives: Tweet Collections

Need Help?

If you are experiencing issues accessing digital files from the Special Collections and would like assistance, please contact our Research and Instructional Services team.

X (formerly Twitter) API changes

Access to the API for X (formerly Twitter) has changed. The information and guidance on this page were most accurate from 2017 to 2020. You may still be able to use some of the following guidance to access data from Tweet IDs if you use the free developer account option or purchase access to the API: https://developer.twitter.com/en/products/twitter-api

About Archived Tweet Collections

University of North Carolina at Chapel Hill Confederate Monument Protests Collected Tweets (2017-2019)

We have used a variety of tools and strategies to document the ongoing actions and discussions around the UNC-Chapel Hill Confederate Monument, known as "Silent Sam." In recent years, social media has facilitated new approaches for sharing information and sparking action on campus. In an effort to document this aspect of the protests, the University Archives collected a sampling of tweets containing relevant hashtags such as #silencesam and #silentsam. For more information on the collection, visit the finding aid.

Image of wordcloud created from tweet data about UNC-CH Confederate Monument Protests.

Archiving Twitter Data

University Archives uses a tool called twarc to harvest tweet data for specific hashtag searches. Twarc is a Python package that makes use of the Twitter API to collect tweets. Twarc was developed as part of DocNow, a collaborative effort between Shift Design, Inc., the University of Maryland, and the University of Virginia, with funding from the Andrew W. Mellon Foundation.

Accessing Collected Tweets

Tweet collections have specific access stipulations due to the Twitter API terms of service. UNC Libraries cannot make the full data we collected available for use. In particular, we are unable to make deleted tweets available for use. Instead, we provide a list of the tweet identifiers (tweet ids) for all the tweets we’ve collected in our repository. Guidance on accessing collected tweets can be found in the box below.
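As a rough illustration of what a tweet identifier list looks like in practice, the sketch below reads IDs from lines of text, skips anything that is not a plain number, and drops duplicates. It assumes the list is a plain text file with one ID per line; the actual layout of the files in the repository may differ, and the function name and sample IDs here are invented for the example.

```python
def load_tweet_ids(lines):
    """Return unique, numeric tweet IDs from an iterable of text lines, in order."""
    seen = set()
    ids = []
    for line in lines:
        candidate = line.strip()
        # Skip blank lines and anything that is not purely digits (e.g. a header row).
        if not candidate.isdigit():
            continue
        if candidate not in seen:
            seen.add(candidate)
            ids.append(candidate)
    return ids


# Invented sample data standing in for the contents of an ID file.
sample_lines = [
    "900000000000000001\n",
    "\n",
    "tweet_id\n",
    "900000000000000002\n",
    "900000000000000001\n",
]
ids = load_tweet_ids(sample_lines)
print(ids)
```

The IDs are kept as strings rather than converted to numbers, since spreadsheet programs and some other tools can round very large integers. To read from a real file, pass an open file object instead of the sample list, for example `load_tweet_ids(open("ids.txt"))`, where `ids.txt` is a placeholder name.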

How to Access Archived Tweets

Accessing and Analyzing Tweets

Tweets for the hashtags in this collection were acquired as datasets, not as webpages. This means that the data in this collection will not have the look and feel of Twitter.com. The tweet text and metadata about tweets will be available for data analysis. Some manipulation of the data will be required to read or interpret tweet text easily.
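To give a sense of the kind of manipulation involved, here is a minimal sketch that pulls a few readable fields out of one tweet in JSON form. It assumes the hydrated data follows the classic Twitter API v1.1 tweet layout (`id_str`, `created_at`, `full_text` or `text`, `user.screen_name`, `entities.hashtags`); data returned by newer API versions may be shaped differently. The function name and the sample tweet are invented for illustration.

```python
import json


def summarize_tweet(tweet):
    """Pick out a few human-readable fields from one tweet dictionary."""
    # Extended-mode tweets use "full_text"; older responses use "text".
    text = tweet.get("full_text") or tweet.get("text", "")
    hashtags = [tag["text"] for tag in tweet.get("entities", {}).get("hashtags", [])]
    return {
        "id": tweet.get("id_str"),
        "created_at": tweet.get("created_at"),
        "user": tweet.get("user", {}).get("screen_name"),
        # Collapse line breaks and repeated spaces so the text fits on one line.
        "text": " ".join(text.split()),
        "hashtags": hashtags,
    }


# An invented tweet, serialized the way one line of hydrated data might look.
line = json.dumps({
    "id_str": "900000000000000001",
    "created_at": "Tue Aug 22 23:00:00 +0000 2017",
    "full_text": "Crowd gathering on campus\ntonight #silentsam",
    "user": {"screen_name": "example_user"},
    "entities": {"hashtags": [{"text": "silentsam"}]},
})
summary = summarize_tweet(json.loads(line))
print(summary)
```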

There are a variety of methods that can help with data analysis. We have highlighted a few options in this guide, but we encourage you to explore digital methods in your field of study and other opportunities to learn data analysis skills to get the most from this collection. 

UNC-Chapel Hill Library Data Services provides a variety of digital scholarship services and trainings. They may be able to provide additional guidance. If you are not affiliated with UNC, check with your local library to see if they offer any digital scholarship services. 

In 2023, access to the X (formerly Twitter) API changed, so this guidance may no longer be fully accurate.

To use this data you will need to:

  • Have a Twitter account
  • Download Tweet identifier lists
  • Hydrate the identifier lists 
  • Work with the data in either JSON format or spreadsheet format 
  • Explore data analysis tools that can help you interpret the data
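As one small example of the last step, the sketch below counts how often each hashtag appears across a set of hydrated tweets, which is the kind of tally a word cloud is built from. It is only an illustration using Python's standard library, not the method used to create this collection, and it assumes each tweet stores its hashtags under `entities.hashtags` as in the Twitter API v1.1 format. The sample tweets are invented.

```python
from collections import Counter


def count_hashtags(tweets):
    """Count hashtag occurrences across tweets, ignoring letter case."""
    counts = Counter()
    for tweet in tweets:
        for tag in tweet.get("entities", {}).get("hashtags", []):
            counts[tag["text"].lower()] += 1
    return counts


# Invented sample tweets containing only the field this sketch needs.
sample_tweets = [
    {"entities": {"hashtags": [{"text": "SilentSam"}]}},
    {"entities": {"hashtags": [{"text": "silentsam"}, {"text": "silencesam"}]}},
    {"entities": {"hashtags": []}},
]
hashtag_counts = count_hashtags(sample_tweets)
print(hashtag_counts.most_common())
```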

Quick Start

  1. Access the collection by visiting the finding aid for the UNC-Chapel Hill Confederate Monument Protests Collected Tweets, 2017-2018 to learn more about how the collection was created and is organized.
  2. Download the dataset from the Digital Collections Repository (DCR)
  3. "Hydrate" the Twitter ID dataset with Hydrator
    • You will need a Twitter account to use this tool.
    • Hydrator is a downloadable desktop app for hydrating Twitter ID datasets.
      • For step-by-step guidance on installing, configuring, and using Hydrator, check out this tutorial created by University of Virginia Libraries.
    • After following the tutorial steps to install and use Hydrator, save a copy of the tweet data as a spreadsheet (.csv format). Each row in the spreadsheet represents one tweet, along with metadata about it such as its number of retweets or its timestamp.
  4. Want to do more analysis? Wondering how to use the JSON?
    • In the box to the right you will find a more detailed guide to accessing and using this collection. 
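If you have the hydrated data as JSON rather than as a spreadsheet, the conversion described in step 3 can be approximated in a few lines of Python. The sketch below assumes the file is line-delimited JSON (one tweet per line) in the Twitter API v1.1 format; the column selection is an arbitrary choice for illustration and is not the same set of columns that Hydrator exports.

```python
import csv
import io
import json

# Columns chosen for this example only.
FIELDS = ["id_str", "created_at", "screen_name", "retweet_count", "text"]


def tweets_to_csv(jsonl_lines, out):
    """Write one CSV row per tweet to the file-like object `out`; return the row count."""
    writer = csv.DictWriter(out, fieldnames=FIELDS)
    writer.writeheader()
    rows = 0
    for line in jsonl_lines:
        if not line.strip():
            continue  # ignore blank lines
        tweet = json.loads(line)
        writer.writerow({
            "id_str": tweet.get("id_str", ""),
            "created_at": tweet.get("created_at", ""),
            "screen_name": tweet.get("user", {}).get("screen_name", ""),
            "retweet_count": tweet.get("retweet_count", 0),
            "text": tweet.get("full_text") or tweet.get("text", ""),
        })
        rows += 1
    return rows


# Invented sample input: two tweets and a blank line.
sample_jsonl = [
    json.dumps({"id_str": "1", "created_at": "Tue Aug 22 23:00:00 +0000 2017",
                "user": {"screen_name": "example_user"}, "retweet_count": 3,
                "full_text": "First example, with a comma"}),
    "\n",
    json.dumps({"id_str": "2", "text": "Second example"}),
]
buffer = io.StringIO()
row_count = tweets_to_csv(sample_jsonl, buffer)
csv_rows = list(csv.DictReader(io.StringIO(buffer.getvalue())))
print(buffer.getvalue())
```

To write to disk instead of an in-memory buffer, open the output with `open("tweets.csv", "w", newline="", encoding="utf-8")`, where `tweets.csv` is a placeholder name. When opening the result in a spreadsheet program, import the `id_str` column as text so the long identifiers are not rounded.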

Using Archived Tweets

For more information on ethics and good practices for using Twitter data in research, see the following resources:

Other twarc and social media archive resources:

Resources 

Download our Detailed Guide