July 29, 2023, 1:30 a.m.

Brief news summary

AI tools such as ChatGPT and Google Translate remain inaccessible to billions of people in the Global South who do not use Western languages, but researchers and startups in Africa and other parts of the world are working to change this. Mekdes Gebrewold, founder of the Ashagari consultancy in Addis Ababa, explained that machine translation is not a workable option for Amharic, her language, so people like her must hire professional translators instead of relying on tools like Google Translate. The language barrier limits a wide range of AI-powered applications, including autocomplete, transcription services, voice assistants, and content moderation on social media.

Modern AI tools operate as advanced autocomplete systems, using training data to predict the most probable answers. This training data, such as the Common Crawl dataset containing billions of web pages, is crucial for building AI models. The lack of data in many languages poses a challenge because the internet is dominated by a few languages, primarily English. Low-resource languages, including African, American, and Oceanian languages, account for less than 0.1% of the Common Crawl dataset. This affects billions of people globally who speak these languages, including major ones like Hindi, Arabic, and Bengali. European languages are significantly overrepresented compared with most Asian and African languages. For example, Dutch appears almost 700 times more often than Amharic in the Common Crawl dataset, although the two are spoken by a similar number of people. However, researchers worldwide, not only in Silicon Valley, are developing AI-powered tools for their own languages.
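The "advanced autocomplete" description can be illustrated with a toy next-word predictor. This is only a minimal sketch, not how ChatGPT or Google Translate are built: it counts which word follows which in a tiny, made-up English corpus and predicts the most frequent follower. The function names and the corpus are hypothetical. It also shows the data problem in miniature: a word that never appears in the training text yields no prediction at all.

```python
from collections import Counter, defaultdict


def train_bigrams(corpus):
    """Count, for each word, which words follow it in the training text."""
    following = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for current, nxt in zip(words, words[1:]):
            following[current][nxt] += 1
    return following


def predict_next(following, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    candidates = following.get(word.lower())
    if not candidates:
        return None
    return candidates.most_common(1)[0][0]


# Hypothetical toy corpus, standing in for web-scale training data.
corpus = [
    "the cat sat on the mat",
    "the cat ate the fish",
    "the cat slept",
]
model = train_bigrams(corpus)

print(predict_next(model, "the"))    # a word seen in training
print(predict_next(model, "selam"))  # a word absent from training
```

The first call returns the follower seen most often in the corpus, while the second returns `None` because the model has no data for that word.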

Asmelash Teka Hadgu, co-founder of Lesan, a startup focusing on machine translation and speech technology for Ethiopian languages, described the company's approach. To overcome the lack of online resources, Lesan collaborates directly with the community, particularly students, who are passionate about their language. They curate and translate high-quality datasets from sources like books and newspapers, aligning the original and translated versions sentence by sentence to guide the machine learning process. Although Lesan cannot match the vast amount of English content available, it has already surpassed Google Translate's performance in Amharic and Tigrinya. Similar successful projects are under way worldwide for languages with smaller digital footprints.

Ethnologue lists Amharic among the languages with "Vital" language support, indicating the availability of machine translation tools, spellcheck, and speech processing. Thousands of other languages, including some with millions of users, have even fewer digital tools and less content. African AI pioneers like Asmelash Teka Hadgu are part of networks such as the Distributed AI Research Institute, GhanaNLP, and the Masakhane grassroots collective, which aim to enable local ownership of these technologies. Beyond Africa, researchers are working on languages such as Jamaican Patois, Catalan, Sudanese, and Māori. While some tech giants keep their models proprietary, initiatives like Hugging Face's global AI collective promote the sharing of insights and AI models, which empowers researchers worldwide to create solutions for their languages. As Teka Hadgu emphasized, "Talent is everywhere, opportunity is not." By embracing and supporting researchers within their own communities, the benefits of AI technology can be harnessed and returned to those communities.
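The sentence-by-sentence alignment mentioned above can be sketched as follows. This is an illustrative assumption, not Lesan's actual pipeline: the function name, the `max_length_ratio` parameter, and the filtering rule are hypothetical, and the English-Dutch sentences are placeholders standing in for any language pair. The sketch assumes the translators produced exactly one translated sentence per source sentence, in the same order.

```python
def align_sentences(source_sentences, target_sentences, max_length_ratio=3.0):
    """Pair each source sentence with its translation, dropping suspect pairs.

    Assumes both lists are in the same order with one translation per
    source sentence. Pairs with an empty side, or with very different
    lengths, are skipped as likely alignment errors.
    """
    if len(source_sentences) != len(target_sentences):
        raise ValueError("expected one translation per source sentence")

    pairs = []
    for src, tgt in zip(source_sentences, target_sentences):
        src, tgt = src.strip(), tgt.strip()
        if not src or not tgt:
            continue  # nothing to learn from an empty side
        ratio = max(len(src), len(tgt)) / min(len(src), len(tgt))
        if ratio > max_length_ratio:
            continue  # lengths differ too much to be a plausible pair
        pairs.append((src, tgt))
    return pairs


# Placeholder data: the third source sentence is missing, so that pair is dropped.
english = ["Good morning.", "Thank you.", ""]
dutch = ["Goedemorgen.", "Dank u.", "Tot ziens."]

for src, tgt in align_sentences(english, dutch):
    print(f"{src}\t{tgt}")
```

Each surviving pair is one training example for a translation model. A simple length check like this one is a common first filter, though real projects typically add human review.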

