Digital Humanities Workbench
Stylometry

Stylometry is the study of measurable features of style, such as word and sentence length, various frequencies (of words, word lengths, word forms, etc.), vocabulary richness, use of punctuation, use of certain expressions and preferences for certain spelling variants. Statistical analysis has always been an important pillar of stylometry, and techniques from the field of artificial intelligence are now also being employed.

An important application of stylometry is authorship attribution, in which individual style elements of one or more texts are examined in order to determine who created the text or texts. Authorship attribution tries to help answer questions such as:
Authorship attribution also plays an important role in forensic linguistics.

An often-cited application of stylometry is determining the authorship of the "Federalist Papers", a series of articles published in 1787-88 with the aim of promoting the ratification of the new United States Constitution. They were written by three authors, Jay, Hamilton and Madison, under the pseudonym “Publius”. The author of some articles was known, but the authorship of others remained under debate. In the early 1960s, researchers Mosteller and Wallace used stylometric methods in an attempt to resolve this uncertainty.

An interesting example of a relatively simple program in this area is Signature, which was developed to support stylometric analysis and text comparison, with special attention to authorship attribution. More advanced options for stylometric analysis are the statistical environment R and stylo, a package that runs in R. However, these are more complex to install and use.
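To make the idea of "measurable features of style" concrete, the following is a minimal Python sketch that computes a few of the features mentioned above: average word length, average sentence length, a simple vocabulary richness measure (type-token ratio), comma usage and the relative frequency of some common words. The function name `stylometric_profile` and the short `FUNCTION_WORDS` list are assumptions made for this illustration. The sketch does not reproduce the method of Signature, stylo, or Mosteller and Wallace, and its sentence splitting is deliberately naive.

```python
import re
from collections import Counter

# Hypothetical, illustrative word list; not taken from any published study.
FUNCTION_WORDS = ("the", "of", "and", "to", "in", "upon", "by", "while")


def stylometric_profile(text):
    """Return a few simple style measurements for an English text."""
    # Naive sentence split on runs of '.', '!' or '?'.
    # Abbreviations such as "Dr." will be counted as sentence ends.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Words are runs of letters, optionally with one internal apostrophe.
    words = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    if not words or not sentences:
        raise ValueError("text contains no words")

    counts = Counter(words)
    n_words = len(words)
    return {
        "avg_word_length": sum(len(w) for w in words) / n_words,
        "avg_sentence_length": n_words / len(sentences),
        # Vocabulary richness: distinct words divided by total words.
        "type_token_ratio": len(counts) / n_words,
        "commas_per_100_words": 100 * text.count(",") / n_words,
        # Relative frequency per 1000 words for each listed word.
        "function_words_per_1000": {
            w: 1000 * counts[w] / n_words for w in FUNCTION_WORDS
        },
    }


if __name__ == "__main__":
    sample = "The cat sat on the mat. The dog, upon seeing it, barked!"
    for name, value in stylometric_profile(sample).items():
        print(name, value)
```

Profiles like this only become informative when they are computed for several texts of reasonable length and then compared. Measures such as the type-token ratio depend strongly on text length, so texts of very different sizes should not be compared directly.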
More information

Grieve, J.W. (2005). Quantitative authorship attribution: A history and an evaluation of techniques. Master's thesis, Dept. of Linguistics, Simon Fraser University. http://summit.sfu.ca/system/files/iritems1/8840/etd1721.pdf (consulted on 15-3-2016).