Overview of text corpora in the Faculty of Humanities

Overview of multilingual corpora

Click on the name of a corpus to view details.
A globe icon indicates that the corpus can be accessed through the Internet.

Name | Description
C-Oral-Rom | Multilingual corpus of spontaneous speech for four Romance languages, including French and Spanish.
Child Language Data Exchange System (CHILDES) | Corpus material related to first language acquisition.
Dutch Parallel Corpus (DPC) | Parallel corpus of 10 million words containing the language pairs Dutch-English and Dutch-French.
ESF Corpus | Spontaneous second language acquisition data of adult immigrant workers (Arabic > Dutch and Turkish > Dutch).
TalkBank | Multilingual corpus containing sample databases from several subfields of communication.