Corpus details

Corpus Hedendaags Nederlands

Official name:Corpus Hedendaags Nederlands
Common name:CHN
Language type:mainly written; some news broadcast transcripts
Corpus type:general / reference; genre specific
Period:1814 - 2013
Size:More than 80,000 texts
Description:The current release of the Corpus Hedendaags Nederlands is a first step towards a monitor corpus for contemporary Dutch. The corpus is based on material from INL's 5 Miljoen Woorden Corpus 1994, 27 Miljoen Woorden Krantencorpus 1995, 38 Miljoen Woorden Corpus 1996 and the Dutch Parole Corpus. For the first release (17 January 2014), a considerable amount of more recent material, up to June 2013, was added from two newspapers: NRC Handelsblad and De Standaard. For the second release (June 2014), further material from these two newspapers covering July to December 2013 was added, along with other sources from Suriname and the Netherlands Antilles, such as newspapers, material published on the internet (blogs, websites) and books written by Surinamese authors.
Exploration:Online access with VUnet-id via SURFconext. Information about logging in via SURFconext
Annotation:lemma; part of speech
Origin:Instituut voor Nederlandse Lexicologie (INL).
Details:The composition of the corpus makes it questionable whether it is representative of Dutch in general with respect to genre representation and balance. Keep this in mind if you want to extrapolate from your corpus findings using inferential statistics.
See Also: 
Name:About the corpus
Description:Information about the composition of the corpus and the online search facility. (Log in with VUnet-id via SURFconext.)
Name:5 Miljoen Woorden Corpus
Description:Detailed content description of the 5 Miljoen Woorden Corpus (which is part of the CHN).
Name:38 Miljoen Woorden Corpus
Description:Detailed content description of the 38 Miljoen Woorden Corpus (which is part of the CHN).
