Library Data Services caters to researchers interested in working with data, mapping, texts, visualization, and technology. Many of these services are available online. Davis Library Data Services, located on the second floor of Davis Library, offers:
When you collect your own data, citing their location makes it possible for others to find them and extend your research, which raises your profile as a researcher. ICPSR provides a good overview of the importance of data citation:
"Citing data files in publications based on those data is important for several reasons:
If you're using data you didn't gather yourself, citing your source is just as important as citing your other research sources. For other scholars to be able to examine and extend your work, they must be able to find the original data.
Consequently, although most style guides do not include examples for citing data, consider the key components and other elements listed below and work them into the style you're using.
Element | Description |
Author | The original researcher(s) who collected the data |
Study name/Title | What did the original researcher call it? |
Producer | The organization that sponsored the research, usually the author's institution. This takes the place of a publisher in an ordinary citation, so be prepared to list the place of publication as well. It may be useful to add a designation like [producer] if it is not actually a publisher. |
Year Data Produced | When did the Producer first release the data? Treat this like the publication date. |
Element | Description |
Unique Identifier, like a Digital Object Identifier (DOI) | If you got the data from a repository like ICPSR, note their unique identifier as part of the title. If the data file has a DOI, include it as you would a URL for a web site. Check here for information on how to obtain a DOI. |
Distributor | The organization that makes the data available. From what organization did you get it? If directly from the author, listing the author's institution/organization once (as the publisher) is sufficient. However, if the distributor is different from the producer, it's important to list it separately; it may be useful to add a designation like "[distributor]" to clarify its role. |
Year Data Collected | When did the original researcher collect the data? You may choose how specific to be: it may only be important to list the years, or you may want to provide more specific date ranges if it would be important for subsequent users to know the periodicity (months, weeks, days, etc.). |
Note that the elements provided here all refer to datasets that have been either published in some way, or deposited in a repository. It is more difficult to cite data that have not been preserved or fixed in some way.
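As a rough illustration of how these elements might be worked into a citation, the Python sketch below assembles them into a single string. The class name, field names, ordering, and punctuation are assumptions made for illustration, not a prescribed citation style, and every value in the example (author, study, institutions, DOI) is hypothetical. Adapt the order and punctuation to the style guide you are following.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DataCitation:
    """Holds the citation elements described in the tables above."""
    author: str
    title: str
    producer: str
    year_produced: int
    place: Optional[str] = None            # place of publication, if known
    distributor: Optional[str] = None      # only needed if different from producer
    identifier: Optional[str] = None       # e.g. a DOI written as a URL
    years_collected: Optional[str] = None  # e.g. "2018-2019"

    def render(self) -> str:
        """Join the elements in one possible order (an assumed layout)."""
        title = self.title
        if self.years_collected:
            title = f"{title}, {self.years_collected}"
        parts = [f"{self.author}.", f"{self.year_produced}.", f"{title}."]

        # The producer stands in for the publisher and is labeled as such.
        producer = f"{self.producer} [producer]"
        if self.place:
            producer = f"{self.place}: {producer}"
        parts.append(f"{producer}.")

        # List the distributor separately only when it differs from the producer.
        if self.distributor and self.distributor != self.producer:
            parts.append(f"{self.distributor} [distributor].")

        if self.identifier:
            parts.append(self.identifier)
        return " ".join(parts)


# Hypothetical example values, for illustration only.
example = DataCitation(
    author="Doe, Jane",
    title="Example Household Survey",
    producer="Example University",
    year_produced=2020,
    distributor="Example Data Repository",
    identifier="https://doi.org/10.0000/example",
    years_collected="2018-2019",
)
print(example.render())
```

Keeping the elements in a structured record rather than a hand-typed string makes it easier to reformat the same citation for a different style later.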
If you plan to scrape data, FIRST CONTACT DIGITAL RESEARCH SERVICES to be sure you are not violating the legal license terms under which we operate. You will also need to determine whether copyright and licensing terms allow you to preserve and/or share the data you obtain in this manner.
Once you are sure you have permission to scrape, preserve and/or share, make a plan for how to share this information with other researchers.
You may want to
If you are scraping web pages (as opposed to database content), you should cite a list of all the URLs you scraped. You may also wish to make sure all scraped pages are archived by the Wayback Machine so that they continue to be accessible in the format you encountered despite later changes.
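One way to keep a citable list of scraped pages is to record each URL with its access date as you go. The Python sketch below builds such a manifest and, for each page, a link that could be used to ask the Wayback Machine to capture it. The function names and column names are hypothetical, and the `https://web.archive.org/save/` prefix is an assumption about the Wayback Machine's Save Page Now address; confirm it against the Internet Archive's current documentation before relying on it. The sketch makes no network requests itself.

```python
import csv
import io
from datetime import date

# Assumed address of the Wayback Machine's "Save Page Now" feature.
WAYBACK_SAVE_PREFIX = "https://web.archive.org/save/"


def build_manifest(urls, accessed):
    """Return one row per unique URL, with access date and an archive-request link."""
    seen = set()
    rows = []
    for url in urls:
        url = url.strip()
        if not url or url in seen:
            continue  # skip blank lines and duplicates
        seen.add(url)
        rows.append({
            "url": url,
            "accessed": accessed.isoformat(),
            "archive_request": WAYBACK_SAVE_PREFIX + url,
        })
    return rows


def manifest_to_csv(rows):
    """Serialize the manifest as CSV text that can be shared alongside the data."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=["url", "accessed", "archive_request"],
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical URLs, for illustration only.
rows = build_manifest(
    ["https://example.org/a", " https://example.org/a ", "", "https://example.org/b"],
    accessed=date(2024, 1, 15),
)
print(manifest_to_csv(rows))
```

The resulting CSV can be deposited or attached as supplementary material, so that readers can see exactly which pages were scraped and when.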
Thanks to Sebastian Karcher of the Qualitative Data Repository for much of this advice.