From Amharic to Zulu: Cohere’s new multilingual AI model supports more than 70 languages

Toronto scaleup says Tiny Aya can run in regions where large-scale infrastructure isn’t always available.

Toronto AI company Cohere has released a suite of new multilingual AI models that support more than 70 languages on any device, even offline. 

The base Tiny Aya model (Tiny Aya-Base) contains 3.35 billion parameters (the settings that control an AI model’s output and behavior) and covers languages such as Amharic, German, Latvian, Tagalog, and Zulu. That is a small count compared to well-known large language models such as those behind ChatGPT, which have hundreds of billions of parameters.

Tiny Aya-Base powers the instruction-tuned Tiny Aya-Global model, which Cohere released on Tuesday alongside several models specialized for particular regions of the world. Tiny Aya-Earth is strongest for African and West Asian languages; Tiny Aya-Fire is strongest for South Asian languages; and Tiny Aya-Water is strongest for Asia-Pacific and European languages. 

Cohere said in a blog post that this approach “allows each model to develop stronger linguistic grounding and cultural nuance,” resulting in systems that “feel more natural and reliable for the communities they are meant to serve.”

“The future of multilingual AI will not be one giant model,” the blog post reads. “It will be a vibrant ecosystem of many models, shaped by many voices.” 

Cohere said Tiny Aya is designed to run on local devices, in classrooms, in community labs, and in regions where large-scale infrastructure isn’t always available, with the intent of bringing “high-quality AI” closer to researchers working on underrepresented languages and developers building locally. The company said the model could be used by a university lab as an offline translation or AI education tool in classrooms and community settings, without having to rely on cloud APIs. 

RELATED: As Google and Cohere expand multilingual AI offerings, experts warn of “plausible BS”

Cohere has tried to set itself apart from its competitors with a focus on the multilingual capabilities of its models. Then-Cohere Labs head Sara Hooker told BetaKit in 2024 that the lab’s research aims to improve the quality of LLMs in languages other than English, which lag behind partly due to a lack of high-quality training data.

Cohere has “seen a lot of attraction” for models that are adept at various languages, chief AI officer Joelle Pineau told media at a dinner BetaKit attended earlier this month. The company has struck multiple international partnerships, such as with Japanese firm Fujitsu, and opened new offices around the world over the past year. 

“We have quite a few customers in Asia, in Korea, in Europe, who really value the fact that the model is actually competent in their local language,” Pineau said at the dinner. 

Founded in 2019 by former Google researchers, Cohere builds the LLMs that power chatbots and other AI applications for companies and government agencies. Cohere has raised $600 million USD from investors, including Nvidia, and last year hit a valuation of $7 billion USD. 

The company also reportedly hit $240 million USD in annual recurring revenue last year, well above its earlier projection of $200 million USD. At a Bloomberg Tech conference in October, CEO Aidan Gomez said Cohere might go public “soon.”

The Canadian government, in its quest to support domestic AI companies, has touted Cohere as an AI “champion” and has partnered with the company to use its AI services. 

With files from Madison McLauchlan and Josh Scott. 

Feature image courtesy Cohere.
