Overview of text corpora in the Faculty of Humanities

Overview of Dutch corpora

Click on the name of a corpus to view details.
A globe icon indicates that the corpus can be accessed through the Internet.

Child Language Data Exchange System
Corpus material related to first language acquisition.
Corpus Gesproken Nederlands
Large corpus of spoken Dutch with various types of annotation.
Corpus Hedendaags Nederlands
CHN is a monitor corpus for contemporary Dutch.
Corpus Hermans
W.F. Hermans' novella "Het Behouden Huis", analysed and coded to facilitate research into word order in Dutch.
Corpus Oudnederlands
All known Old Dutch textual material from the period 475-1200.
Corpus Renkema
Language of civil servants ("ambtelijke taal").
Corpus Uit den Boogaart
(Eindhoven corpus)
Written and spoken Dutch produced between 1960 and 1973.
Dutch Parallel Corpus
DPC is a parallel corpus of 10 million words containing the language pairs Dutch-English and Dutch-French.
Dutch PAROLE Distributable Corpus
Written Dutch corpus consisting of various text types.
ESF Corpus
Spontaneous second language acquisition data of adult immigrant workers (Arabic > Dutch and Turkish > Dutch).
Stevin Nederlandstalig Referentiecorpus
(SoNaR Corpus)
SoNaR is a 500-million-word reference corpus of contemporary written Dutch.
TalkBank
TalkBank is a multilingual corpus containing sample databases from within several subfields of communication.
VU Chatcorpus
(ChatIG Corpus)
Dutch corpus consisting of controlled chat sessions by secondary school pupils of different age groups.