Digital Humanities Workbench
Transcription of text

If you want to use the computer to analyse textual sources, the digital images of those sources must be converted to computer-readable text. For relatively recent printed documents, this can often be achieved with optical character recognition (OCR). For printed historical documents, however, OCR often does not produce satisfactory results, although progress has certainly been made in this area over the last decade (see the section about digitisation for more information about OCR). For handwritten documents (such as historical manuscripts, letters and children's writing), OCR is usually very problematic, if possible at all.
Collaboration and crowdsourcing

As with many modern applications, transcription can be done online, which enables groups of students and/or scholars to work together on the transcription of a single (larger) document or a collection of documents. In a growing number of larger transcription projects (usually conducted by academic departments, libraries or digital archives), participation is not restricted to the research group: all interested individuals are invited to take part. Examples of such crowdsourcing transcription projects are Transcribe Bentham (a double award-winning collaborative transcription initiative, which is digitising the unpublished manuscripts of the philosopher and reformer Jeremy Bentham and making digital images of them available through a platform known as the Transcription Desk), Making History - Transcribe (Virginia Memory) and Smithsonian Digital Volunteers, and there are now many more projects of this kind. This transcription method usually implies a workflow in which all participants may be involved both in transcribing and in reviewing the work of others, followed by a final check and approval by the project team.
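The review workflow described above can be pictured as a small state model. The following Python sketch is only an illustration under stated assumptions: the class `PageTranscription`, the `Status` values and the rule that approval requires a prior review are hypothetical and are not taken from any of the projects mentioned.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional, Set


class Status(Enum):
    TRANSCRIBED = "transcribed"
    REVIEWED = "reviewed"
    APPROVED = "approved"


@dataclass
class PageTranscription:
    """One page in a crowdsourced transcription project (hypothetical model)."""
    page_id: str
    text: str
    transcriber: str
    status: Status = Status.TRANSCRIBED
    reviewers: List[str] = field(default_factory=list)

    def review(self, reviewer: str, corrected_text: Optional[str] = None) -> None:
        # Participants review the work of others, not their own.
        if reviewer == self.transcriber:
            raise ValueError("a participant cannot review their own transcription")
        if self.status is Status.APPROVED:
            raise ValueError("page is already approved")
        if corrected_text is not None:
            self.text = corrected_text
        self.reviewers.append(reviewer)
        self.status = Status.REVIEWED

    def approve(self, team_member: str, project_team: Set[str]) -> None:
        # The final check and approval are reserved for the project team.
        if team_member not in project_team:
            raise PermissionError("only the project team can approve a page")
        if self.status is not Status.REVIEWED:
            raise ValueError("page must be reviewed before approval")
        self.status = Status.APPROVED


# Made-up example data: one volunteer transcribes, another reviews,
# and a project team member gives final approval.
page = PageTranscription("page-1", "Teh greatest happiness", transcriber="alice")
page.review("bob", corrected_text="The greatest happiness")
page.approve("carol", project_team={"carol"})
print(page.status.value)  # approved
```

Real platforms typically allow several rounds of review and keep a revision history; this sketch leaves that out to keep the three stages visible.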
Tools

You can make a transcription of a document by opening two windows: one in which the digital image is displayed and one in which you transcribe it with an editor (as txt, HTML, XML or rtf / docx). However, a number of dedicated tools are available to support the transcription process.
Transcript
Transkribus
Further reading