An analysis of ISO 639: preparing the way for advancements in language identification standards

Constable, Peter and Gary F. Simons
Part of Series: SIL Electronic Working Papers 2002-004
22 pages

Globalisation has led to interest in an increasingly diverse variety of languages. Across industry, academic, and government sectors, there is a felt need for language identification standards that go beyond what is currently available. In response to these needs, ISO TC 37/SC 2 has resolved to begin a new work initiative to extend the ISO 639 family of standards.

In a revised version of a paper presented at IUC 17, we describe issues that had not been adequately addressed in existing standards, including ISO 639. These issues include the need for operational definitions of "language" and of the other types of category being represented, and the need for adequate documentation of what each identifier denotes. These issues remain, and they are of even greater importance for any attempt to create a more comprehensive standard. Accordingly, it is essential that new work on ISO 639 include refinements to the existing standard in these regards.

To this end, we have carefully analysed the existing identifiers in ISO 639 and have proposed a mapping of these identifiers to languages listed in the SIL Ethnologue. Inasmuch as the mapping has been integrated into the web edition of the Ethnologue, it documents the denotation of the ISO 639 identifiers with a degree of detail that has previously been lacking.

This paper presents the results of our analysis. The mapping from ISO identifiers to Ethnologue languages is not straightforward, so the paper describes the problems we encountered and the principles we developed as a basis for decision making. It also describes issues of definition that remain to be resolved and explores potential implications for the use of existing ISO identifiers within an extended standard.

Language classification
Computer programs
language identification
ISO 639
internationalization (I18N)
information technology (IT)
