Corpus details

A Standard Corpus of Present-Day Edited American English

Official name:A Standard Corpus of Present-Day Edited American English
Common name:Brown Corpus
Language:English (American English)
Language type:written
Corpus type:general / reference
Size:approx. 1,000,000 words
Description:The Brown Corpus consists of 1,014,312 words of running text of edited English prose printed in the United States during the calendar year 1961. The corpus is divided into 500 samples of 2000+ words each. The samples represent a wide range of styles and varieties of prose, e.g. press, religion, skills and hobbies, popular lore, learned prose and fiction.
Exploration:The corpus consists of plain text files and can be explored with standard exploration software like WordSmith and Windows Grep.
Annotation:part of speech; inflection
Fragmentation:text fragments
Example:SA01:1 the_AT Fulton_NP County_NN Grand_JJ Jury_NN said_VBD Friday_NR an_AT investigation_NN of_IN Atlanta's_NP$ recent_JJ primary_NN election_NN produced_VBD no_AT evidence_NN that_CS any_DTI irregularities_NNS took_VBD place_NN ._.
Location:Faculty network, folder G:\LET\Data\Corpora\Engels\Brown
Details:A number of English-language corpora match the Brown Corpus in size, structure and composition. This makes it possible to compare varieties of English across the same range of printed genres, or to study language change over time:
Corpus                 Variety              Date
Brown Corpus           American English     1961
LOB Corpus             British English      1961
Kolhapur Corpus        Indian English       1978
ACE Corpus             Australian English   1986
Wellington Corpus      New Zealand English  1986-1990
Freiburg Brown Corpus  American English     early 1990s
Freiburg LOB Corpus    British English      early 1990s
See Also: 
Name:Brown Corpus Manual
Description:Information manual to the Brown Corpus, including a list of grammatical tags and their meaning.
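
Reading the tagged format: the Example field above shows a reference label followed by word_TAG tokens. For users who want to process the files with a script instead of WordSmith or Windows Grep, the following is a minimal Python sketch of how such a line could be split into (word, tag) pairs. It assumes that every line has the same layout as the example (a label such as SA01:1 first, then whitespace-separated tokens with the tag after the last underscore); the actual files may differ, and the function name parse_tagged_line is only illustrative.

```python
from collections import Counter

# Sample line copied from the Example field of this record.
LINE = (
    "SA01:1 the_AT Fulton_NP County_NN Grand_JJ Jury_NN said_VBD "
    "Friday_NR an_AT investigation_NN of_IN Atlanta's_NP$ recent_JJ "
    "primary_NN election_NN produced_VBD no_AT evidence_NN that_CS "
    "any_DTI irregularities_NNS took_VBD place_NN ._."
)


def parse_tagged_line(line):
    """Split one line into (reference label, list of (word, tag) pairs).

    Assumption: the first whitespace-separated field is the reference
    label and every later field has the form word_TAG.
    """
    ref, *tokens = line.split()
    pairs = []
    for token in tokens:
        # Split on the LAST underscore, so a word that itself contains
        # an underscore keeps it; "._." becomes (".", ".").
        word, _, tag = token.rpartition("_")
        pairs.append((word, tag))
    return ref, pairs


ref, pairs = parse_tagged_line(LINE)

# Frequency of each part-of-speech tag in the line.
tag_counts = Counter(tag for _, tag in pairs)

print(ref)
print(pairs[:3])
print(tag_counts.most_common(3))
```

The meaning of each tag (AT, NP, NN and so on) is listed in the Brown Corpus Manual. The same function can be applied line by line to a whole file to build tag or word frequency lists.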
