Functions of text analysis software
The current generation of text analysis programs has a range of functions,
of which the following are the most common. Note: in what follows, 'text'
may also be read as 'collection of texts'.
-
The production of frequency lists
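A frequency list
is a list of the words in a text together with the number of times each word
occurs. These overviews can be presented in different ways: ascending or
descending by frequency, alphabetically, or as a
retrograde word list (sorted alphabetically from the last letter backwards, so
that words with the same ending appear together).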
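The orderings described above can be sketched in a few lines of Python. This
is a minimal illustration, not how any particular program works: the
tokenizer (lowercasing, letters and apostrophes only) is a simplifying
assumption, and the names tokenize and frequency_list are hypothetical.

```python
import re
from collections import Counter

def tokenize(text):
    """Assumed tokenizer: lowercase the text, keep runs of letters/apostrophes."""
    return re.findall(r"[a-z']+", text.lower())

def frequency_list(text, order="descending"):
    """Return (word, count) pairs in one of the four orderings named above."""
    items = list(Counter(tokenize(text)).items())
    if order == "descending":
        # Most frequent first; ties broken alphabetically.
        return sorted(items, key=lambda wc: (-wc[1], wc[0]))
    if order == "ascending":
        return sorted(items, key=lambda wc: (wc[1], wc[0]))
    if order == "alphabetical":
        return sorted(items)
    if order == "retrograde":
        # Sort on the reversed spelling, so shared endings group together.
        return sorted(items, key=lambda wc: wc[0][::-1])
    raise ValueError(f"unknown order: {order}")

sample = "The cat sat on the mat. The mat was flat."
for word, count in frequency_list(sample, "retrograde"):
    print(f"{word:>6} {count}")
```

In the retrograde ordering, cat, flat, mat and sat end up next to each other
because they share the ending -at.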
-
The production of concordances
A concordance
is an overview of all the words in a text, or a selection of words, which also
provides the location and the immediate context of every word. The size of the
context can usually be set by the user, and the concordance can also be ordered
by the context to the left or to the right of the word.
Example: a concordance of the word forms grow, grew and grown in the novel
Alice in Wonderland, ordered by right context.
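A keyword-in-context concordance of this kind can be approximated as follows.
This is a rough sketch under stated assumptions: context is measured in
characters (real programs may count words instead), so context edges may cut
a word in half, and the function name concordance and its parameters are
hypothetical.

```python
import re

def concordance(text, targets, width=30, sort_by=None):
    """Return (position, left context, keyword, right context) tuples."""
    wanted = {t.lower() for t in targets}
    lines = []
    for m in re.finditer(r"[A-Za-z']+", text):
        if m.group().lower() not in wanted:
            continue
        # Take `width` characters on either side and normalise whitespace.
        left = " ".join(text[max(0, m.start() - width):m.start()].split())
        right = " ".join(text[m.end():m.end() + width].split())
        lines.append((m.start(), left, m.group(), right))
    if sort_by == "right":
        lines.sort(key=lambda l: l[3].lower())
    elif sort_by == "left":
        # Compare left contexts word by word, nearest word first.
        lines.sort(key=lambda l: l[1].lower().split()[::-1])
    return lines

text = "It has grown wild. Plants grow fast. She grew tall."
for pos, left, kw, right in concordance(text, ["grow", "grew", "grown"],
                                        width=10, sort_by="right"):
    print(f"{pos:>4} {left:>10} [{kw}] {right}")
```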
-
Searching for words and phrases
There are several ways to search for words or phrases in a text. It is often
possible to use so-called wildcards to search for words that start or
end with certain letters (e.g. all words that start with love
or end with ness). In addition, it is
often possible to require or exclude combinations with certain other words, as
well as to specify multiple words of interest (alternation: 'word A' or
'word B' or 'word C'). The output of the search query is usually shown in the
form of a concordance.
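The search options above (wildcards, alternation, required and excluded
neighbouring words) might be combined as in the sketch below. It uses shell
style wildcards from Python's standard fnmatch module, which is only one
possible syntax; the function search, its parameters and the five-word window
are assumptions made for illustration.

```python
import re
from fnmatch import fnmatchcase

def search(text, patterns, require=(), exclude=(), window=5):
    """Return (token index, token) for every token matching any pattern.

    patterns: wildcard patterns such as 'love*' or '*ness'; giving several
              patterns acts as alternation (A or B or C).
    require / exclude: words that must / must not occur within `window`
              tokens on either side of the hit.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    hits = []
    for i, tok in enumerate(tokens):
        if not any(fnmatchcase(tok, p) for p in patterns):
            continue
        context = tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]
        if all(r in context for r in require) and not any(e in context for e in exclude):
            hits.append((i, tok))
    return hits

text = ("Her loveliness and kindness were loved by all, "
        "but love without happiness fades.")
print(search(text, ["love*"]))
print(search(text, ["*ness"]))
print(search(text, ["love*"], exclude=["without"], window=2))
```

The token indices returned here could then be used to build concordance lines
for each hit.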
-
Plotting words
This displays a graphical overview of the places
in which a word or phrase occurs in the text, thus showing how it is
distributed across the text.
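A very simple text-based version of such a plot is sketched below. It assumes
the text is divided into equally sized segments and marks each segment in
which the word occurs; the segment count and the names dispersion and
plot_line are hypothetical choices, and real programs draw this graphically.

```python
import re

def dispersion(text, word, bins=40):
    """Count occurrences of `word` in each of `bins` equal segments of the text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = [0] * bins
    for i, tok in enumerate(tokens):
        if tok == word.lower():
            # Map the token position onto a segment number 0..bins-1.
            counts[i * bins // len(tokens)] += 1
    return counts

def plot_line(counts):
    """Render one line: '|' where the word occurs in a segment, '.' elsewhere."""
    return "".join("|" if c else "." for c in counts)

text = "a b a c d e f a"
print(plot_line(dispersion(text, "a", bins=4)))
```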
-
Analysing word combinations
Here, software analyses which
other words a certain word is typically combined with. This can involve simply
counting common phrases, but also establishing collocations. We speak of a
collocation if two or more words occur together more often than might be
statistically expected.
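One common way to quantify 'more often than expected' is pointwise mutual
information (PMI), which compares how often two words occur next to each other
with how often they would do so if they were independent. The sketch below
assumes adjacent word pairs only and a minimum pair frequency; text analysis
programs may use other measures (for example log-likelihood or t-score) and
wider windows. The function name collocations is hypothetical.

```python
import math
import re
from collections import Counter

def collocations(text, min_count=2):
    """Return ((w1, w2), pair count, PMI) for adjacent pairs, highest PMI first."""
    tokens = re.findall(r"[a-z']+", text.lower())
    n = len(tokens)
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    scored = []
    for (w1, w2), c in bigrams.items():
        if c < min_count:
            continue  # rare pairs give unreliable scores
        # PMI = log2( P(w1, w2) / (P(w1) * P(w2)) ), estimated from counts.
        pmi = math.log2(c * n / (unigrams[w1] * unigrams[w2]))
        scored.append(((w1, w2), c, pmi))
    return sorted(scored, key=lambda s: -s[2])

text = ("the white rabbit ran and the white rabbit hid "
        "and the cat saw the dog")
for pair, count, pmi in collocations(text):
    print(pair, count, round(pmi, 2))
```

A PMI above zero means the pair occurs more often than independence would
predict; the minimum count keeps pairs seen only once from dominating the list.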
-
Investigating text-specific vocabulary
This involves investigating which words are specific to a text: words that
occur only in that text, or markedly more often in it than elsewhere. This is
usually done by means of a statistical comparison of the frequency list of the
text in question with a frequency list that is based on a large collection of
other texts (the so-called reference file).
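The comparison with a reference file can be sketched with the log-likelihood
statistic, a measure often used for this purpose; whether a given program uses
it, or another test such as chi-square, is not stated here and is an
assumption. The function name keywords and the tiny example texts are
hypothetical.

```python
import math
import re
from collections import Counter

def tokenize(text):
    return re.findall(r"[a-z']+", text.lower())

def keywords(target, reference, top=10):
    """Return (word, freq in target, freq in reference, log-likelihood)
    for words that are relatively more frequent in the target text."""
    t, r = Counter(tokenize(target)), Counter(tokenize(reference))
    c, d = sum(t.values()), sum(r.values())  # sizes of the two texts
    if c == 0 or d == 0:
        raise ValueError("both texts must contain at least one word")
    results = []
    for word, a in t.items():
        b = r[word]
        # Expected frequencies if the word were equally common in both texts.
        e1 = c * (a + b) / (c + d)
        e2 = d * (a + b) / (c + d)
        g2 = 2 * (a * math.log(a / e1) + (b * math.log(b / e2) if b else 0.0))
        if a / c > b / d:  # keep only words overused in the target
            results.append((word, a, b, g2))
    return sorted(results, key=lambda x: -x[3])[:top]

target = "alice saw the rabbit and the rabbit saw alice"
reference = "the dog saw the cat and the cat saw the bird and the bird sang"
for word, a, b, g2 in keywords(target, reference):
    print(f"{word:<8} {a} {b} {g2:.2f}")
```

Words that occur only in the target text, such as alice and rabbit here, get
the highest scores, while common words like the drop out because they are not
overused relative to the reference file.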
-
Visualization
Although plotting words (see above)
has long been a standard function of text analysis software, modern programs
also have all kinds of extra functions that allow one to visualize various aspects of
word usage in a text, for example by means of word clouds, bubble lines,
scatter plots and networks. Voyant
Tools is an example of a program that has a lot of functionality for text
analysis, including many visualization features.