Overview of text corpora in the Faculty of Humanities

VU University Amsterdam


A text corpus is a large collection of language data, carefully composed with the purpose of providing empirical data for a wide range of linguistic investigations. Corpora, which may consist of written language, transcriptions of spoken language, or a combination of both, can be used for qualitative as well as quantitative linguistic research.

This website provides information about the text corpora that are available to staff and students of the Faculty of Humanities of VU University Amsterdam, and about the software that can, or in some cases must, be used to explore them. Most corpora are accessible only via the faculty network. A growing number, however, can be accessed via the Internet.

Corpus Gesproken Nederlands
British National Corpus
Corpus of Contemporary American English
Dutch Parallel Corpus
International Corpus of English - British component
ICAME Corpora
SoNaR Corpus
Corpus de Referencia del Español Actual
CHILDES

This website is maintained by the Department of Humanities Computing, Faculty of Humanities, VU University Amsterdam. Please send comments to Eric Akkerman.

