Text analysis: introduction

Researchers in the humanities use a variety of ICT-based techniques to analyse texts. Unfortunately, the terminology is not consistent and can differ between fields. This Workbench uses the following concepts to distinguish between seven structurally different approaches. Because there is not necessarily one 'correct' term for every technique, each one is accompanied by a brief description.

  • Basic text analysis
    Analysis by means of searching for words, word patterns and any annotations in a text or a collection of texts, using software that also offers information about other word-related aspects, such as the frequency and the distribution of words in a text.
  • Qualitative analysis
    Content-based analysis through close reading of a single text or a limited number of texts, which involves adding annotations to the text so that it can then be analysed.
  • Content analysis
    Techniques used to investigate the communicative aspects of one or more textual expressions on the basis of objective and systematic analysis. Note: This term has an especially wide range of interpretations. This workbench uses the same definition that is used in Communication Studies.
  • Corpus analysis
    Corpus analysis is a research strategy that is widely used in language research and that relies on text corpora of authentic language material. A corpus is a digital collection of texts, text fragments and/or transcripts (of spoken language), selected in such a way that they form the best possible representation of a particular language, dialect or text type, making the collection as a whole a reliable source for linguistic research.
  • Sentiment analysis
    Sentiment analysis aims to systematically identify, extract and characterize the emotions and attitudes expressed or implied in a text.
  • Text mining
    The process of using different digital techniques to automatically retrieve valuable information from large quantities of text material.
  • Stylometry
    Quantitative analysis of the stylistic features of one or more texts.
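
To make the first approach in the list more concrete, the sketch below shows what counting word frequencies and locating a word's distribution in a text might look like in Python. It is only an illustration: the function names (tokenize, word_frequencies, word_positions), the sample sentence and the very simple tokenisation rule are assumptions made for this example and are not part of any tool described in this Workbench.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (a deliberately simple rule)."""
    return re.findall(r"\w+", text.lower())


def word_frequencies(text):
    """Count how often each word occurs in the text."""
    return Counter(tokenize(text))


def word_positions(text, word):
    """Return the token indices at which the word occurs, as a rough view of its distribution."""
    target = word.lower()
    return [i for i, token in enumerate(tokenize(text)) if token == target]


# Hypothetical sample text, used only for illustration.
sample = "The cat sat on the mat. The dog sat on the log."

freqs = word_frequencies(sample)
print(freqs.most_common(3))           # the three most frequent words with their counts
print(word_positions(sample, "sat"))  # where "sat" occurs in the token sequence
```

Dedicated text analysis software typically adds more careful tokenisation, support for annotations and views such as keyword-in-context on top of these basic counts.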