Linguistic Corpora: Home

Linguistic Corpora Available through University Libraries

The following corpora (formerly known as the BYU Corpora) each have a limited-functionality public interface. University Libraries has also purchased the data files, which come in three formats: database, word/lemma/PoS, and linear text. Onyen authentication is required to download them.

Corpus del Español

Corpus of Contemporary American English (COCA)

Corpus of Historical American English (COHA)

The Movie Corpus

News on the Web Corpus (NOW)

Corpus of American Soap Operas (SOAP Corpus)

TV Corpus

Wikipedia Corpus

Corpus of Global Web-Based English (GloWbE)

iWeb Corpus

Kirill Tolpygo

My pronouns are: he/him.
118 Davis Library, CB#3918
(919) 962-8044
  • Last Updated: Jan 6, 2021 4:18 PM