Building an electronic database for endangered languages

Following the explosive development of IT, a technological revolution is taking place on the international linguistic scene: the use of technology to save the world's endangered languages. This is probably why the National Science Foundation and the National Endowment for the Humanities in the US jointly sponsored the establishment of the Electronic Meta-structure for Endangered Language Data Project (EMELD) a few months ago. Dr Randy LaPolla, Associate Professor of CityU's Department of Chinese, Translation and Linguistics, has been invited to be a consultant for the project. As he explained, "In recent years, linguists all over the world have become increasingly aware of the urgent need to study and record languages facing extinction and make the information widely available. Building databases for the use of all interested parties will surely enhance the study and understanding of lesser-known languages. The idea of meta-structure is to have a standardized way of making the data accessible." Dr LaPolla was also recently invited to participate in the large-scale project on Endangered Languages of the Pacific Rim sponsored by the Japanese Ministry of Education (Monbusho).

Emphasis on fieldwork

A conservative estimate of the number of existing languages in the world would be 6,000, and fewer than half of these will be effectively passed on to the next generation. As a result, most of the languages on earth will disappear over the next two generations. The cultural heritage of the human race is enormous and incredibly rich, and yet we know very little of it, because only a tiny percentage of the world's languages have been recorded. The best way for linguists to understand the great diversity of languages is to be on the front line, engaging in fieldwork: collecting, recording, and analyzing languages, often ones that are completely alien to them.

Perhaps because of the traditional emphasis on fieldwork at the University of California at Berkeley, Dr LaPolla developed an interest in linguistic fieldwork in 1983, when he was still a doctoral student. After that, there was no turning back, he said. "I am interested in everything related to languages, such as history, culture, physics, and human physiology. However, the most challenging thing is to investigate an entirely alien language. The experience is like putting the pieces of a puzzle together, trying to come up with a comprehensive and cohesive picture of the language. At the same time, you learn about a new culture, and a new way of thinking."

Although some of his early fieldwork experience was with Cambodian and Vietnamese immigrants in California, Dr LaPolla's interest has largely been in the Sino-Tibetan languages, a huge and complex family that includes two big sub-families, Chinese and Tibeto-Burman. "The Tibeto-Burman sub-family of the Sino-Tibetan language family consists of approximately 250 to 300 languages, many of which have never been recorded. The majority of these are endangered languages," he remarked. Because of this, he has focused much of his energy on recording endangered Tibeto-Burman languages, such as Qiang, Dulong, and Rawang.

"The Tibeto-Burman language family has been relatively neglected by the international linguistic community. Until recently, there weren't many scholars working on these languages and little information was available. This was largely because most communities speaking the Tibeto-Burman languages are located in remote areas which are relatively sensitive politically. They are hard to reach and research is difficult," he remarked. For example, Qiang is spoken mainly in the north of Sichuan, Dulong in the northwest of Yunnan (near Burma and Tibet), and Rawang primarily in Northern Burma. In the past, it was extremely difficult for foreigners to obtain permission to do fieldwork in these places. From 1985, Dr LaPolla applied repeatedly to do fieldwork but was not able to go until 1994, when certain political factors changed.

Building a database for endangered languages

At a micro level, Dr LaPolla's work primarily involves describing individual languages. This includes analyzing the phonetics, phonology, grammar, and discourse structure of each language, as well as its sociolinguistic situation. At a macro level, it attempts to elucidate the entire linguistic situation of the Sino-Tibetan languages from the perspectives of history, culture, and ethnic migration. He said, "I've built a large database with data on 165 languages I've collected over the last decade. This includes data on their history, morphology, and syntax." Dr LaPolla intends to use this database to write a book on the syntax, morphology, current situation, and historical development of the Sino-Tibetan languages.

Currently, he is doing the final proofreading for his books Rawang Texts with Grammatical Analysis and Rawang-English-Chinese Glossary, the fruits of his many years of fieldwork in Burma. He is also building an on-line "dialect map" and database for the Rawang, Dulong and Anong languages and cultures. It will include information on the languages (phonology, grammar, texts, language usage) and on the cultures, including pictures and video clips. It will also have a bibliography and a download site for academic papers on these languages.

Last year, upon completion of A Grammar of the Qiang Language with Annotated Texts and Glossary, Dr LaPolla wanted to put aside work on the Qiang language and refocus on Rawang and Dulong. However, when he returned to the Aba Tibetan and Qiang Autonomous Prefecture in Sichuan for what was meant to be his final round of fieldwork there, he was again reminded of the impact of modern technology and other cultures on ethnic minorities, and felt there was more he should do for this language.

The Qiang nationality is spread out over a wide area, with generally only one village on each mountain, and each village often comprising only 30 families. Because they are so dispersed, they are highly susceptible to the influence of other ethnic groups. Dr LaPolla said, "When I began work on the Qiang language in 1994, the villages off the main road were still largely monolingual in Qiang. But in the few years since then, most villages have electricity and many families now have televisions, and all television programmes are in Chinese. Now almost everyone, even small children, can speak Chinese. There is more access now to education in Chinese, and many more Qiang people are leaving the mountains and settling in the Han areas. Given these circumstances, I fear the Qiang language will soon disappear."

This insight not only made him rethink his earlier decision to halt fieldwork in the area but also led him to apply to CityU and the Research Grants Council for funding to launch a large-scale survey of the Qiang dialects. He plans to conduct a thorough survey of 90 villages within three years. The purpose is to "take a snapshot" of these dialects before they disappear. He also hopes to turn the data he has laboriously collected into an easily searchable "dialect map" on the Internet for use by anyone who is interested. "There are currently 80,000 people speaking this language. The project aims to record it as completely as possible before it disappears. It will also allow us to observe how a language slowly vanishes through contact with other languages. If we wait till the youth in the villages can no longer speak it, we'll be too late," he pointed out.

Interest in purely theoretical linguistic research

Dr LaPolla is also committed to purely theoretical linguistic research. He thinks that fieldwork and the development of linguistic theory complement each other. In 1997 he published, jointly with another linguist, a theoretical book entitled Syntax: Structure, Meaning, and Function, which sets out some of his views on linguistics, many of them gained from fieldwork. He is also very interested in the familial relationships of languages and the nature of communication. His future plans include writing about the nature of linguistic communication.

As a linguist who engages in both fieldwork and theory, and who gives equal weight to micro and macro points of view, Dr LaPolla is of the view that typology is the foundation of linguistics. Although linguistics is a very broad discipline, he said, all areas of research that come under it require very comprehensive knowledge of languages. "The more languages a linguist is familiar with, the better equipped for analysis he/she will be when analyzing a language. This is especially pertinent when conducting studies related to syntactic theories. Without typology, it would not even be possible."


Fieldwork is the best way to learn about languages; Dr LaPolla emphasized the importance of a linguist engaging in fieldwork in addition to his/her theoretical research. "It is more difficult studying an unfamiliar language, but it will teach you things you wouldn't otherwise have learned."


