Rapid Word Collection: An updated approach to dictionary creation

Speakers of the Buli language recently participated in a Rapid Word Collection workshop in Sandema, Ghana.

Ask someone for his word for “roof” and you will collect one word. Ask him, “What are the parts of a house?” and you will get “roof, wall, door, window, eaves” and other words that you wouldn’t expect. In one language there is a word for the area under the eaves of a house (a nice place to be when it is raining). An outsider would never think to ask for such a word.

- Ron Moe

(May 2012) Far from being a minor by-product of a language development program, a dictionary is a powerful tool for supporting every aspect of language development. However, the time and effort required have often made dictionary publication a daunting prospect, causing it to be relegated to secondary status. SIL linguist Ron Moe has devoted years of study to this problem and developed a method known as the Dictionary Development Process (DDP). Eleven years after DDP was first introduced, a new research group has been formed to evaluate its successes, refine its methodology and promote its use.

DDP makes lexicography (dictionary development) accessible not only to researchers, but also to community members without extensive training in linguistics. To make this already promising method even more effective, Moe and his colleagues focused on one particular aspect of DDP: the Rapid Word Collection (RWC) workshop. During this carefully structured workshop, community members (mother tongue speakers of the language) collect words from their language, prompted by questions related to a selection of semantic domains (families of closely related words). The nearly 1,800 domains can be grouped into nine main categories:

  1. Universe, creation
  2. Person
  3. Language and thought
  4. Social behavior
  5. Daily life
  6. Work and occupation
  7. Physical actions
  8. States
  9. Grammar

This focused effort to collect words is one of DDP’s most notable strengths. In the past, a researcher might make note of a few new words each day while engaged in language learning or language development activities. Moe contrasts the results of the traditional model and the Rapid Word Collection component of DDP:

The traditional method of word collection resulted in an average of 2.5 words being collected per working day. So, in two weeks a person might collect 25 words. In a year, he might collect 650. It would take 30 years to collect 15,000… Now we are routinely collecting over 10,000 in two weeks and sometimes as many as 20,000. What’s more, we can do this at the beginning of a language development project.
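The figures in Moe's comparison can be checked with a few lines of arithmetic. The sketch below is illustrative only: it assumes a five-day working week, a 52-week year, and a two-week workshop of ten working days, none of which the article states outright. Under those assumptions the two-week and one-year totals match the quote, and 15,000 words at 650 per year works out to roughly 23 years, so the quoted 30 years reads as a round estimate.

```python
# Assumptions (not stated in the article): 5 working days per week,
# 52 working weeks per year, and a workshop lasting 10 working days.
WORDS_PER_DAY = 2.5      # traditional collection rate, from the quote
DAYS_PER_WEEK = 5
WEEKS_PER_YEAR = 52


def words_collected(working_days, rate=WORDS_PER_DAY):
    """Words gathered over a number of working days at a steady rate."""
    return working_days * rate


two_weeks = words_collected(2 * DAYS_PER_WEEK)
one_year = words_collected(WEEKS_PER_YEAR * DAYS_PER_WEEK)
years_for_15000 = 15_000 / one_year

# Workshop figure from the quote: over 10,000 words in two weeks.
workshop_rate = 10_000 / (2 * DAYS_PER_WEEK)
speedup = workshop_rate / WORDS_PER_DAY

print(f"Two weeks, traditional: {two_weeks:.0f} words")
print(f"One year, traditional:  {one_year:.0f} words")
print(f"Years to reach 15,000:  {years_for_15000:.1f}")
print(f"Workshop rate is about {speedup:.0f} times the traditional rate")
```

On these assumptions, a workshop that gathers 10,000 words in two weeks is collecting about 1,000 words per working day.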

A Rapid Word Collection workshop is the first step in the Dictionary Development Process. The workshop begins with a day of training and practice. Each workshop participant has an assigned role. Data entry (using SIL’s WeSay and FieldWorks software packages) is built into the process so that when the workshop ends, each entry is glossed and classified by semantic domain. After the workshop, the data is further refined and the entire database is uploaded to the Internet where it is accessible for continued research and development. The whole process, from training of the participants to transmission of the data, requires only three to four weeks. The community can continue adding to the dictionary and work toward publishing it in a format that meets their needs.

In early 2012, Moe and the RWC Research Group led a workshop for the Buli language of Ghana in cooperation with community members and the Ghana Institute of Linguistics, Literacy and Bible Translation. Approximately 15,000 words and senses (different meanings of a word) were collected, resulting in a database of nearly 10,000 unique entries. The researchers were able to observe the process in action and implement proposed adjustments to the method. The RWC Research Group met again in April to review the experience and discuss ways in which the system might be further improved.

At least 200 language communities have used the Rapid Word Collection method, resulting in a number of dictionaries. Moe and his colleagues hope to see this system used to launch dictionary projects in many of the world's approximately 3,000 to 4,000 languages that have not yet been documented in the form of a dictionary. In turn, these new dictionaries will be available to support comprehensive language development efforts, making dictionary development a strategic investment of time and resources.


Above: Buli workshop participants devised an innovative visual to track progress: each guinea corn graphic added to the grain silo represents one hundred words.

