Transcription practices
The following practices aim to incorporate transcription best practices, particularly those provided by the Smithsonian Institution’s General Instructions for Transcription and Review and the Library of Congress By the People How to Transcribe, with practices that also foster digital accessibility. We have outlined our practices and the rationale behind our decisions on this page and expect to update them periodically based on user feedback and evolving standards. Please contact Archives@Perkins.org with questions or suggestions related to our transcription practices.
Abbreviations
- Practice: Abbreviations may be spelled out in published text if there is no uncertainty about the word, while manuscripts will retain any abbreviated text as is. When an abbreviation is kept, a note in brackets will be included after it. Commonplace abbreviations, such as Mr., will be left as is.
- Reasoning: Spelling out known abbreviations provides clarity for more readers and a better user experience for text-to-speech users. Doing so may help users who have difficulty decoding words, have difficulty using context to aid understanding, have limited memory, or rely on screen magnifiers (magnification may reduce contextual clues).
- Examples:
- Perkins Annual Report: “Perkins Institution” instead of “Perkins Inst.”
- Correspondence to Dr. Howe: “Perkins Inst. [Institution]” instead of “Perkins Inst.”
Acronyms
- Practice: Acronyms whose meaning is certain will be explained in brackets the first time they appear. Acronyms that may be read as words will include periods to make them more accessible to users relying on text-to-speech software.
- Reasoning: Doing so may help users who have difficulty decoding words, have difficulty using context to aid understanding, have limited memory, or rely on screen magnifiers that may hide contextual clues.
- Examples:
- “AFB [American Foundation for the Blind]”
- “I.T.S. [ITS]” instead of “ITS”
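The two acronym conventions above can be sketched in a few lines of Python. This is only an illustration of the bracket and period formats shown in the examples. The function names `explain_acronym` and `dot_acronym` are hypothetical and are not part of any tool used by the Archives.

```python
def explain_acronym(acronym: str, expansion: str) -> str:
    """Follow an acronym with its meaning in brackets (first occurrence only)."""
    return f"{acronym} [{expansion}]"


def dot_acronym(acronym: str) -> str:
    """Add periods so text-to-speech reads letters, keeping the original in brackets."""
    if not (acronym.isalpha() and acronym.isupper()):
        raise ValueError(f"expected an all-uppercase acronym, got {acronym!r}")
    dotted = "".join(f"{letter}." for letter in acronym)
    return f"{dotted} [{acronym}]"


print(explain_acronym("AFB", "American Foundation for the Blind"))
print(dot_acronym("ITS"))
```

Deciding whether an acronym "may read as a word" is still a judgment call for the transcriber. The sketch only formats the result once that decision is made.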
All caps and small caps
- Practice: All caps in titles or text that are not being used as emphasis are changed to sentence case.
- Reasoning: All caps and small caps reduce readability for all users but are particularly difficult for users with dyslexia. Text-to-speech software may also read each letter out loud instead of reading the word.
- Example:
- “Perkins School for the Blind” instead of “PERKINS SCHOOL FOR THE BLIND”
Ampersand symbol
- Practice: Ampersands in published materials will be spelled out (and) unless they are part of an official name. In unpublished work, the ampersand will be spelled out and followed by a note in brackets.
- Reasoning: Not all screen readers will read an ampersand correctly and some readers with cognitive impairments may find the symbol more difficult to read.
- Examples:
- “On the corner of Fifth and Main” instead of “On the corner of 5th & Main”
- “Office of White & Berry”
- “I hope and [ampersand] pray for your safely”
Bold, italicized, or underlined text
- Practice: Transcriptions cannot be styled with bold, italicized, or underlined text. To identify text that is styled in this way, [emphasized] will be placed after the word. If more than one word is styled, [start of emphasis] and [end of emphasis] will be used to mark the group.
- Reasoning: Text that conveys emphasis is included to preserve the accuracy of the historical record. Including it in the transcription ensures that those who rely solely on the transcription have access to it.
- Examples:
- “you may not [emphasized] borrow the book”
- “the finances from [start of emphasis] July to October are incomplete [end of emphasis]”
Brackets [ ]
- Practice: Brackets are used to convey uncertainty or a note from the cataloger. Notes include the original text that follows abbreviated or explained text. Brackets can also be used to describe an image in the materials.
- Reasoning: Brackets provide the user with a best guess as determined by those familiar with the materials, historical context, handwriting, and more. Brackets can also be used to provide helpful historical context or clues so researchers can interpret the materials with additional information. In balancing accessibility and clarity with transcription best practices, we have decided that abbreviated words in publications can be spelled out when they are certain. Less certain text, and any text in a manuscript, will include the original text in brackets following any spelled-out or added words.
- Examples:
- Perkins Inst. [Perkins Institution]
- January 1 188[8]
- I will [bring] the book
- [note written in a different handwriting]
- [handwritten note: 1888]
- Gabrielle Farrell [signature]
Currency symbols
- Practice: Money symbols are left as is. If there is no money symbol, brackets noting the currency and country of origin will be included at the first instance.
- Reasoning: Without a currency symbol, assistive technology will read a financial amount as a plain number.
- Examples:
- $1,272.69
- 1,272.69 [US dollars]
Dates
- Practice: In published materials, abbreviated days of the week and months are written out to provide a better user experience for those relying on text-to-speech software and to avoid confusion for all users. In unpublished material, a note in brackets with the spelled-out date will be included.
- Reasoning: Screen readers read dates written in slash format using the convention associated with the language of the voice, so those dates will be spelled out (for British and American users, for example, the month and day are swapped).
- Examples:
- 04/03/2019 [April 3, 2019]
- Mon. June 6 [Monday June 6]
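As a rough illustration of the bracketed-date convention for slash-format dates, the sketch below appends a spelled-out date after each match. It assumes the source uses the American month/day order, as in the example above. The function name `annotate_slash_dates` and the `month_first` flag are hypothetical, and a transcriber would still need to confirm the order from context.

```python
import calendar
import re

# Matches dates such as 04/03/2019 (one or two digits, one or two digits, four-digit year).
SLASH_DATE = re.compile(r"\b(\d{1,2})/(\d{1,2})/(\d{4})\b")


def annotate_slash_dates(text: str, month_first: bool = True) -> str:
    """Append a spelled-out date in brackets after each slash-format date."""

    def _annotate(match: re.Match) -> str:
        first, second, year = (int(group) for group in match.groups())
        month, day = (first, second) if month_first else (second, first)
        # Leave anything that is not a valid calendar date untouched.
        if not 1 <= month <= 12 or not 1 <= day <= calendar.monthrange(year, month)[1]:
            return match.group(0)
        return f"{match.group(0)} [{calendar.month_name[month]} {day}, {year}]"

    return SLASH_DATE.sub(_annotate, text)


print(annotate_slash_dates("Received 04/03/2019"))
print(annotate_slash_dates("Received 04/03/2019", month_first=False))
```

The second call shows why the slash format is ambiguous: the same digits produce a different spelled-out date when the day comes first.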
Deletions
- Practice: Crossed-out or otherwise deleted text is provided within square brackets with a note about the intention to exclude it. Words that are unintelligible will be noted as such.
- Reasoning: Strikethrough text can be difficult for users with cognitive or visual disabilities to access. Text-to-speech software may not be able to interpret the visual styling, which could lead to an incomplete or inaccurate representation of the content.
- Examples:
- “I have always loved ["vanilla" crossed out] coffee ice cream.”
- “He approached me on [unintelligible words crossed out] in Boston”
Insertions
- Practice: When text has been inserted over a line or otherwise added later, but should be read as part of a sentence, it is included in the original text in the order it was intended to be read. If the text is in another handwriting, a note in brackets indicates this.
- Reasoning: Special characters may not be identified correctly or at all by text-to-speech software.
- Examples:
- “He went to Boston yesterday [“July 4”]” instead of “he went to Boston yesterday ^ July 4”
- Mrs Smith [Janet inserted in another hand] went to Washington
Legibility
- Practice: Words that aren't legible at all are noted as [illegible]. Single letters or numbers that are uncertain are noted in brackets.
- Reasoning: These notes let researchers know that all or part of a word or sentence is illegible. Brackets with words or dates filled in indicate what the transcriber thinks the word or number might be. This is often based on contextual knowledge of a whole collection.
- Examples:
- Next Tuesday July [illegible] I’m going on a picnic
- July 4, 188[4]
Line breaks
- Practice: Breaks in the text are included when they are part of the structure of the document but may be ignored if they are purely decorative, overused, or complicate the reading experience. Line breaks around titles, headings, and paragraphs are preserved with a hard return. Hyphenated words that were broken up due to space in the original layout are spelled out as one word. Hyphenated words that were broken up because they stretch across two pages will appear as a single word on the first page only.
- Reasoning: The information is prioritized over the visual design, as that is most likely what researchers are interested in. Keeping the text as intended rather than as constrained by the layout will make navigating the text easier for more readers. Soft returns are not often read by screen readers so aren't used to replicate any of the layouts being transcribed.
- Examples:
- “Perkins Institution” instead of “Perkins Inst-itution”
- “Boston Representative for the Wm. [William] Bourne & Son Pianos” instead of “Boston Representative [line break] for the Wm. [William] Bourne & Son [line break] Pianos”
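The line-break practice above can be approximated in code. The sketch below is a minimal illustration, not a description of any tool the Archives uses: it rejoins words hyphenated at a line end, folds layout-only line breaks into spaces, and keeps blank lines as paragraph breaks. The function name `unwrap_layout_breaks` is hypothetical. It cannot tell a layout hyphen from a genuine compound broken at a line end (for example "text-to-" followed by "speech"), so its output would still need review.

```python
import re

# A hyphen at the end of a line with word characters on both sides.
LINE_END_HYPHEN = re.compile(r"(\w)-\n(\w)")


def unwrap_layout_breaks(text: str) -> str:
    """Rejoin hyphenated words and drop layout-only line breaks within paragraphs."""
    text = LINE_END_HYPHEN.sub(r"\1\2", text)
    # Blank lines mark structural breaks (titles, headings, paragraphs) and are kept.
    paragraphs = re.split(r"\n\s*\n", text)
    return "\n\n".join(" ".join(paragraph.split()) for paragraph in paragraphs)


print(unwrap_layout_breaks("Perkins Inst-\nitution"))
print(unwrap_layout_breaks("Boston Representative\nfor the Wm. Bourne & Son\nPianos"))
```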
Marginalia
- Practice: Handwritten notes added in the margins of a document (marginalia) are included in a note at the end of the document. If the author of the marginalia is known, that is indicated in the note.
- Reasoning: Notes are included to provide access to handwriting and other parts of an asset, in an effort to provide more equitable access to the resource and to offer contextual clues, or potential contextual clues, to a wider group of users.
- Examples:
- [Notes in marginalia likely written by Polly Thomson read, “...”]
- [Notes in the marginalia written in a different hand read, “1896” and “Mr. Anagnos”]
Non-English languages, characters, and translation
- Practice: The language is presented as is. Include accents only if included in original text. English translations are provided in brackets. Non-English language materials are transcribed in the language with English translations being provided below that copy or in a transcription metadata field if the content is entirely in a language other than English.
- Reasoning: Translations provide access to the information and appear in English searches, while the text remains available in its original language to native readers.
- Examples:
- école
- école [school]
Non-text features
- Practice: Notations about the record are made in brackets after the original text or in a place that makes sense when read aloud. They are located in areas that favor understanding over duplication of exact location. Purely decorative elements that break up a page, such as lines, are not described. Images are described.
- Reasoning: Purely decorative elements are not included in transcription because they have no informational value. Letterhead, stamps, doodles, or other imagery will be described so that all users can have access to at least some of that information and decide for themselves whether it is relevant. More details about images can be requested from Archives staff.
- Examples:
- [Stamp postmarked Jul 5 1888]
- [American Printing House for the Blind letterhead featuring the logo and illustration of the headquarters]
Parentheses ( )
- Practice: Parentheses are used at the end of a name to provide birth and death dates. Parentheses are used to provide a maiden name. Date ranges that are more informally presented on the site will be spelled out.
- Reasoning: Parentheses are a standard form of indicating both birth and death dates after names and for providing maiden names in genealogy research.
- Examples:
- Laura E. (Howe) Richards (1850-1943)
- Samuel Gridley Howe (1801-1876) was director of Perkins from 1829 to 1876
Roman numerals
- Practice: Roman numerals are written as Arabic numerals in published and unpublished materials. Unpublished materials will include a note in brackets to let researchers know Roman numerals were originally used.
- Reasoning: Roman numerals can be problematic for users relying on screen readers. Screen readers often have trouble differentiating between letters and numbers, which increases the likelihood that the numbers will be read incorrectly by text-to-speech tools. In manuscripts or correspondence, the note [in Roman numerals] after the number lets researchers know Roman numerals were originally used.
- Examples:
- “5 [in Roman numerals]” not “V”
- “Hamlet Act 4, Scenes 5 and 6” not “Hamlet Act IV: Scenes V & VI”
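The conversion described above follows the usual subtractive rule for Roman numerals, and a short sketch may make the convention concrete. The function names `roman_to_int` and `transcribe_roman` and the `unpublished` flag are hypothetical, chosen only for this illustration. The sketch does not check that a numeral is well formed, so a string such as "IIII" would still be converted.

```python
ROMAN_VALUES = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}


def roman_to_int(numeral: str) -> int:
    """Convert a Roman numeral to an integer using the subtractive rule."""
    total = 0
    largest_seen = 0
    # Read right to left: a smaller value before a larger one is subtracted.
    for letter in reversed(numeral.upper()):
        if letter not in ROMAN_VALUES:
            raise ValueError(f"not a Roman numeral: {numeral!r}")
        value = ROMAN_VALUES[letter]
        if value < largest_seen:
            total -= value
        else:
            total += value
            largest_seen = value
    return total


def transcribe_roman(numeral: str, unpublished: bool) -> str:
    """Return the Arabic number, with a bracketed note for unpublished material."""
    number = roman_to_int(numeral)
    return f"{number} [in Roman numerals]" if unpublished else str(number)


print(transcribe_roman("V", unpublished=True))
print(transcribe_roman("IV", unpublished=False))
```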
Spelling and punctuation
- Practices: Original spelling, grammar, punctuation, and word order are preserved even if they are incorrect. The correct spelling of a word is included in brackets next to the incorrectly spelled word if it is a name or may otherwise be integral to searches. The full names of women will be included if known.
- Reasoning: Incorrect spellings, particularly of names, hide them in searches. Historically, married women were recorded as “Mrs.” followed by the first and last name of their husband. In order to undo this erasure, the Archives strives to provide the full names of these women whenever possible.
- Examples:
- Anagnous [Anagnos]
- Mrs. John Smith [Jane Sullivan Smith]
Symbols and special characters
- Practice: Symbols and other special characters are spelled out. Consult the Ampersand symbol and Currency symbols entries for practices involving those symbols.
- Reasoning: Special characters and symbols can make web content difficult to read for users with disabilities, especially those who use assistive technologies.
- Examples:
- “et cetera” instead of “etc.”
- “Section 14” instead of “§ 14”
- “Edward Bradley [footnote states, ‘Only one in two studies’]” instead of “Edward Bradley*” with a footnote at the bottom of the page.
- “only one in two studies [footnote reads, "located on ]” instead of “only one in two studies³”
- “company [trademarked]” instead of “company™”
- “Second” instead of “2nd”
Tables
- Practice: Tables are transcribed in a manner that conveys the information rather than the original design. If noting information in a column is not necessary, it won't be included. Line breaks, rather than symbols such as the pipe symbol (|) or slashes (/), will be used.
- Reasoning: Tables are designed to convey information, and doing so in the clearest way possible provides clarity for more readers and a better user experience for text-to-speech users. Doing so may help users who have difficulty decoding words, have difficulty using context to aid understanding, have limited memory, or rely on screen magnifiers (magnification may reduce contextual clues).
- Example:
- [First column] To paid Committee's orders, $4,998.04 [US dollars]
- [sum] 61,279.43 To balance carried over, 214.01
- [Second column] By balance brought forward from old account, $1,272.69 [US dollars]
Text order
- Practice: Transcribed text is generally in the order it appears on the page. Ease of reading takes precedence over maintaining design elements or matching the visual layout exactly. This is most commonly applied when transcribing columns or advertisements.
- Reasoning: Doing so provides clarity for more readers and a better user experience for text-to-speech users. Doing so may help users who have difficulty decoding words, have difficulty using context to aid understanding, have limited memory, or rely on screen magnifiers (magnification may reduce contextual clues).