Corpus details

Lancaster-Oslo/Bergen Corpus

Official name: Lancaster-Oslo/Bergen Corpus
Common name: LOB Corpus
Language: English (British English)
Language type: written
Corpus type: general / reference
Size: 1,000,000 words
Description: The Lancaster-Oslo/Bergen Corpus contains 500 printed texts of about 2,000 words each, or about a million running words in all. The year of publication (1961) and the sampling principles are identical to those of the Brown Corpus, though there were necessarily some differences in text selection. The coding systems of the two corpora differ in many respects, however; the main difference is the greater delicacy (finer granularity) of the coding in the LOB Corpus.
Exploration: The corpus consists of plain text files and can be explored with standard corpus software such as WordSmith and Windows Grep.
Annotation: part of speech; inflection
Fragmentation: text fragments
Example: A01{4}_p a_AT move_NN to_TO stop_VB \0Mr_NPT Gaitskell_NP from_IN nominating_VBG any_DTI more_AP labour_NN life_NN peers_NNS is_BEZ to_TO be_BE made_VBN at_IN a_AT meeting_NN of_IN labour_NN \0MPs_NPTS tomorrow_NR ._.
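The word_TAG layout of the example line can be read with a few lines of Python. This is a minimal sketch, not part of the corpus distribution: the function name parse_tagged_line is hypothetical, and it assumes that tokens are separated by whitespace and that the tag follows the last underscore of each token, as in the example above. The first token (A01{4}_p) appears to be a text reference rather than a word; the sketch keeps it as an ordinary pair and leaves markers such as \0 untouched.

```python
from collections import Counter


def parse_tagged_line(line):
    """Split one tagged line into (word, tag) pairs.

    Assumption: every token has the form word_TAG and the tag
    follows the LAST underscore in the token.
    """
    pairs = []
    for token in line.split():
        word, sep, tag = token.rpartition("_")
        if not sep:
            # No underscore found: not a word_TAG token, so skip it.
            continue
        pairs.append((word, tag))
    return pairs


# The example line from this entry, as a raw string so that "\0" stays literal.
sample = (
    r"A01{4}_p a_AT move_NN to_TO stop_VB \0Mr_NPT Gaitskell_NP from_IN "
    r"nominating_VBG any_DTI more_AP labour_NN life_NN peers_NNS is_BEZ "
    r"to_TO be_BE made_VBN at_IN a_AT meeting_NN of_IN labour_NN "
    r"\0MPs_NPTS tomorrow_NR ._."
)

pairs = parse_tagged_line(sample)

# Frequency of each tag in the line, e.g. tag_counts["NN"] for singular nouns.
tag_counts = Counter(tag for _, tag in pairs)
```

Splitting on the last underscore rather than the first keeps any word that itself contains an underscore intact. The meaning of each tag is listed in the Tagged LOB manual.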
Location: Faculty network, folder G:\LET\Data\Corpora\Engels\LOB
Details: A number of English language corpora are comparable in size, structure and constituency, thus enabling interdialectal comparisons on a comparable range of printed genres or study of language change:
Corpus                   Variety                Year(s)
Brown Corpus             American English       1961
LOB Corpus               British English        1961
Kolhapur Corpus          Indian English         1978
ACE Corpus               Australian English     1986
Wellington Corpus        New Zealand English    1986-1990
Freiburg Brown Corpus    American English       early 1990s
Freiburg LOB Corpus      British English        early 1990s
See Also: 
Name: LOB Corpus manual
Description: General information manual for the LOB Corpus.
Name: Tagged LOB manual
Description: Information manual for the tagged LOB Corpus, including a list of grammatical tags and their meanings.
