Corpus details

Corpus of Contemporary American English

Official name:Corpus of Contemporary American English
Common name:COCA
Language:English (American English)
Language type:written; spoken
Corpus type:general / reference
Period:1990 - 2011
Size:425 million words (not fixed)
Description:The Corpus of Contemporary American English is the largest freely available corpus of English and the only large, balanced corpus of American English. It contains more than 425 million words of text, divided equally among five genres: spoken, fiction, popular magazines, newspapers, and academic texts. It includes 20 million words for each year from 1990 to 2011 and is updated once or twice a year (the most recent texts are from March 2011). Because of this design, it is perhaps the only corpus of English suitable for studying current, ongoing changes in the language.
For a more detailed description, see Mark Davies (2010). The Corpus of Contemporary American English as the first reliable monitor corpus of English. Literary and Linguistic Computing 25 (4): 447-65.
Exploration:Online search interface
Annotation:part of speech; inflection; lemma
Origin:Mark Davies, Brigham Young University (USA).
Reference:Davies, Mark. (2008-) The Corpus of Contemporary American English: 425 million words, 1990-present. Available online at
Details:After a number of queries, (free) registration is required.
Contact:Eric Akkerman
See Also: 
Name:Word and Phrase
Description:This site is based on corpus data from COCA and gives information about frequencies, collocates and synonyms of words and phrases.
