Corpus details

Stevin Nederlandstalig Referentiecorpus

Official name:Stevin Nederlandstalig Referentiecorpus
Common name:SoNaR Corpus
Language type:written
Corpus type:general / reference
Period:1954 - 2011
Size:247 million words
Description:The STEVIN project SoNaR aims to build a 500-million-word balanced reference corpus of contemporary (1954-present) written Dutch. Besides comprising no fewer than 38 text types, the corpus will be balanced according to the number of speakers in the Dutch-speaking regions, with one-third of the texts coming from Flanders and two-thirds from the Netherlands. Texts will be gathered not only from conventional text types such as newspapers and reports, but also from new media such as chat, SMS, internet forums and email.
At present, the 2nd intermediate release of the SoNaR corpus, containing more than 247 million words, is available at our faculty.
Exploration:There is no dedicated exploration software for SoNaR. Please contact Eric Akkerman for further details.
Origin:Radboud Universiteit Nijmegen, Universiteit Tilburg, Universiteit Twente, Hogeschool Gent, Katholieke Universiteit Leuven, Universiteit Utrecht.
Edition:2nd intermediate release (2011)
Location:Faculty network (on request)
Details:SoNaR documentation can be found on the Faculty network in the folder
See Also:
Description:SoNaR project site.
Name:D-Coi project site
Description:D-Coi was a preparatory project for the SoNaR project. The D-Coi material is incorporated in the SoNaR corpus.
