UofA Lab Develops Tools to Foster Minority Languages

Spotlight On:

Antti Arppe
University of Alberta
Quantitative Linguistics

Many people take for granted the use of tools such as spell-checkers, electronic dictionaries and language learning software. These applications have become integrated into our daily lives, helping complete tasks from sending text messages to reading the newspaper to traveling to another country. However, while these technologies are available for majority languages (e.g. English, French, Chinese), they have so far been created for only a few minority languages.

Antti Arppe, Assistant Professor in Quantitative Linguistics at the University of Alberta, is hoping to rectify this by developing a series of technical language tools for minority languages, using Plains Cree as the spearhead language. Arppe founded the Alberta Language Technology Laboratory (ALT-LAB), which is leading a research project titled 21st Century Tools for Indigenous Languages.

“This research is focused on supporting the revitalization and sustained day-to-day use of indigenous languages in all spheres of life by developing modern language technology for these languages in partnerships with their respective communities, having Plains Cree (Nêhiyawêwin) as the spearhead example case,” says Arppe.

The project also involves the collection and digitization of textual resources into corpora, both as a means of tool-testing and as a research objective of its own, using the latest computational models to extract the most information out of the data. The ALT-LAB relies on the Advanced Research Computing (ARC) resources of WestGrid and Compute / Calcul Canada for the management and storage of this data.

“Compute / Calcul Canada infrastructure is used to provide long-term back-up storage for the invaluable linguistic resources, for instance recordings of spoken Indigenous language use, which we are gathering and creating throughout the duration of the project and beyond,” says Arppe.

By compiling text collections, creating linguistic tools for their analysis, and amassing new experimental evidence on a number of diverse indigenous languages, Arppe hopes to apply the most recent advances in statistical and computational methods to hitherto understudied material. The study of such extensive new data, representing multiple sources of linguistic behavior, has the potential to substantially alter how language is understood, testing the general validity of current linguistic theories and possibly even revising them.

“My ultimate goal is to play a part in advancing our understanding of language as the multi-faceted and multidimensional phenomenon that it is,” says Arppe.

Arppe also hopes that providing minority language speakers with these tools will facilitate and help grow the use of these languages in all spheres of life by the members of these communities.

“The retention of native languages is integral to the empowerment, cultural vibrancy and prosperity of Aboriginal communities, allowing for the continuation of traditions of thought and experience developed among indigenous peoples,” says Arppe.

Arppe notes the importance that improvements in technology and access to ARC resources have had for his research.

“This research would not be possible without Compute / Calcul Canada infrastructure due to the need for proper archiving and back-up storage at this large scale and time frame,” says Arppe.

The project is being carried out in collaboration with Giellatekno and Divvun (University of Tromsø, Norway), the Cree Literacy Network, scholars at the First Nations University of Canada and other academic institutions in Canada, and Cree-speaking communities in Alberta.