Corpus details

ICAME corpora

Official name: ICAME corpora
Language: English (different language varieties)
Language type: written
Corpus type: dialect; general / reference
Size: each corpus consists of 1,000,000 words
Description: ICAME distributes a set of English language corpora that are comparable in size, structure and constituency. This enables interdialectal comparisons across a comparable range of printed genres, as well as the study of language change:
Brown Corpus            American English       1961
LOB Corpus              British English        1961
Kolhapur Corpus         Indian English         1978
ACE Corpus              Australian English     1986
Wellington Corpus       New Zealand English    1986-1990
Freiburg Brown Corpus   American English       early 1990s
Freiburg LOB Corpus     British English        early 1990s
The Brown Corpus and the LOB Corpus have separate entries in this overview (their names serve as links).
Exploration: The corpora consist of plain text files and can be explored with standard exploration software such as WordSmith and Windows Grep.
Location: Faculty network, folder G:\LET\Data\Corpora\English
Details: Only the Wellington Corpus is annotated for part of speech; the other corpora are not annotated. In some of these corpora, a sentence is spread over more than one line, each with its own line number. This may affect the results of searches for word combinations, since a combination that is split across a line break can be missed by a line-based search.
Manuals for these corpora can be found on the faculty network, in the folder G:\LET\Data\Corpora\Engels\MANUALS.
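The effect of line numbering on searches for word combinations can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that each line starts with a text code and a line number (for example "A01 0010") followed by the text; the actual layout differs per corpus and should be checked in the manuals. The names LINE_PREFIX, join_lines and count_combination are hypothetical helpers, not part of any corpus tool.

```python
import re

# Assumed line layout (hypothetical; check the corpus manual):
# a text code, a line number, then the text, e.g. "A01 0010 The jury said".
LINE_PREFIX = re.compile(r"^[A-Z]\d{2}\s+\d+\s+")


def join_lines(lines):
    """Strip the assumed reference prefix from each line and join the rest."""
    stripped = (LINE_PREFIX.sub("", line.strip()) for line in lines)
    return " ".join(part for part in stripped if part)


def count_combination(lines, phrase):
    """Count case-insensitive matches of a word combination, ignoring line breaks."""
    words = [re.escape(word) for word in phrase.split()]
    pattern = re.compile(r"\b" + r"\s+".join(words) + r"\b", re.IGNORECASE)
    return len(pattern.findall(join_lines(lines)))


# Two made-up lines in the assumed layout; "took place" straddles the break.
sample = [
    "A01 0010 The jury said that it took",
    "A01 0020 place in the city hall.",
]

# A naive line-by-line search misses the combination ...
found_per_line = any("took place" in line for line in sample)
# ... while the joined text contains it.
found_joined = count_combination(sample, "took place")
```

With the sample above, the line-by-line check finds nothing, whereas the search over the joined text finds the combination once. If a corpus uses a different reference format, only the LINE_PREFIX pattern would need to be adapted.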
See Also: 
Name: ICAME Corpus Manuals
Description: Access to the manuals of all ICAME corpora
