The Linguist's Shoebox
Integrated data management and analysis for the field linguist
Tip
Text-based linguistics is a solid foundation for understanding a language and culture.
Shoebox integrates the text, lexical, and cultural data produced by text-based analysis (e.g., through interlinearizing, jumping and data links, word lists, and concordances). While this should not be your only strategy for building a good lexical database, it is highly productive and reliable as a primary source of words, for checking senses, and for investigating semantic and grammatical collocations. A caution if you build a lexicon primarily through interlinearizing texts: morpheme-level glossing tends to encourage researchers to ignore compounds and phrasal lexemes and to overlook sense discrimination. After interlinearizing a text (parsing it by morphemes), you should pass through the text a second time to identify polymorphemic words, compounds, and phrases. Enter them into the lexical database as separate headwords. For more information, read page 8 and section 4.4 in Making Dictionaries.
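The second pass can be partly mechanized outside Shoebox. The following Python sketch is only an illustration, not a Shoebox feature: it assumes you have already exported the interlinear text into a list of (word, morphemes) pairs and the lexicon's headwords into a set. The function name second_pass_candidates, the data layout, and the sample words are all hypothetical.

```python
from collections import Counter

def second_pass_candidates(parsed_words, headwords):
    """Count polymorphemic words that are not yet headwords in the lexicon.

    parsed_words: iterable of (word, morphemes) pairs (assumed export format).
    headwords: set of lowercase headwords already in the lexical database.
    """
    counts = Counter()
    for word, morphemes in parsed_words:
        key = word.lower()
        # More than one morpheme means the word was split during parsing,
        # so it may be a compound or derived form worth its own entry.
        if len(morphemes) > 1 and key not in headwords:
            counts[key] += 1
    return counts

# Invented sample data, for illustration only.
parsed = [
    ("blackbird", ["black", "bird"]),
    ("sings", ["sing", "-s"]),
    ("the", ["the"]),
    ("blackbird", ["black", "bird"]),
]
headwords = {"black", "bird", "sing", "the"}

candidates = second_pass_candidates(parsed, headwords)
for word, n in candidates.most_common():
    print(word, n)
```

The output is a review list, not a list of new entries: it will include regularly inflected forms such as "sings" alongside true compounds such as "blackbird", and it cannot detect multi-word phrasal lexemes at all. Deciding which candidates deserve separate headwords, and finding the phrases, still requires reading through the text.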
Index of tips: data analysis; interlinear text; lexicography; text files
