Artificial intelligence is not just reshaping industries; it’s also becoming pivotal for preserving languages endangered by globalization and technological change. Recently, collaborations between companies like Orange, OpenAI, and Meta have ignited interest around leveraging AI technologies to save the world’s vanishing languages.
Orange, the French telecommunications giant, announced on November 26, 2024, its partnership with OpenAI and Meta to bolster AI models for African regional languages, which are currently unsupported by generative AI. The initiative aims to use AI as a tool for greater digital inclusion, particularly for populations where language barriers are prevalent.
Orange plans to focus initially on languages such as Wolof and Pulaar, which are spoken by over 22 million people across West Africa. These languages play significant cultural roles, and preserving them becomes all the more important as global societies grow increasingly interconnected.
OpenAI, known for its advanced AI systems, including its large language models (LLMs), is working to fine-tune its technology for these under-resourced languages. The collaboration, part of Orange’s wider commitment to social responsibility, aims to nurture conversational capabilities for potential customer interactions, educational tools, and more.
But the work to preserve endangered languages doesn’t stop there. Researchers and linguists are recognizing the unique opportunity AI affords for revitalizing these languages by digitizing them and making them more accessible. David Adelani, from McGill University, highlights this connection: “If your language doesn’t have a lot of text online, it will be less represented,” he explains, underscoring how a limited online presence leaves many endangered languages without the data that AI models depend on.
Adelani’s comments highlight the crux of the matter: many of the Earth’s languages lack the data needed to train AI models effectively. With estimates indicating that half of the world’s languages are inadequately represented online, the task appears Herculean. The problem is particularly grave for languages such as Amharic, where attempts to train AI models have received zero accuracy scores, pointing to the urgent need for data standardization.
Despite these challenges, efforts like Google’s 1,000 Languages Initiative and the endearing Woolaroo app offer hope. Launched to support the 1,000 most spoken languages globally, the 1,000 Languages Initiative seeks to create AI models capable of supporting underrepresented languages and enriching cultural identities.
Woolaroo allows users to identify objects through their smartphones and retrieve spoken translations in multiple Indigenous languages. The app aims to engage younger generations with their linguistic heritage and strengthen their emotional connection to their roots.
While many believe the race to save endangered languages may be unwinnable, AI design built on community engagement and technical innovation shows promise. Uche Okonkwo, Responsible AI program manager at Google, emphasizes the importance of working with communities to gather relevant language data. “We’ve been focused on how to factor these findings throughout the model development lifecycle,” she said.
The ability to digitize languages, create educational materials, and develop AI applications could hinge on successful collaborations between technologists and linguistic communities, with both sides working to find new models that can be trained on less data.
For companies like Devnagri AI, this means addressing language barriers effectively through new multilingual conversational AI technologies capable of effortlessly integrating across various digital platforms. Devnagri has launched its multilingual AI, promoting engagement and communication across more than 40 local and international languages.
“Language should never be a barrier,” said Devnagri co-founder Nakul Kundra, expressing the company's goal to empower brands to connect with diverse customer bases. The AI models are finely tuned to reflect local terminologies and nuances, delivering human-like interactions.
This development not only reshapes customer engagement strategies but also helps sustain cultures at risk of fading. The gravity and potential of these initiatives lie not merely in the languages themselves, but also in the identities that people tie to them.
Global collaborations signal forward thinking. The potential AI holds for saving endangered languages is extensive, but challenges remain. For each ambitious innovation, the looming question remains: will it be fully accessible to those most affected?
From developing models refined by domain-specific data to garnering community involvement, the groundwork is being laid. Initiatives focused on AI for endangered languages offer ways not only to preserve communication but also to create educational opportunities that engage potential future speakers.
Whether through the academic world, government initiatives, or tech developments, combining traditional conservation methods with modern technology might beat the odds of extinction for languages once believed to be lost causes. The questions posed by the AI age are clear, and the answers will rely not just on the technology itself but on the communities who speak these languages.
A confluence of passions drives the preservation of languages, making technology just one piece of the puzzle. The future may yet hold innovative solutions to keep these languages alive, as the race to develop those solutions, word by word, becomes as much about people as it is about the AI tools created to serve them.
With the AI revolution underway, the question of how it can bridge language gaps and revive what was once on the verge of fading is more pertinent than ever. Such endeavors depend heavily on partnerships, data ownership, and, most critically, community engagement to guide the research, development, and implementation of these transformative technologies.