AI tools such as ChatGPT and Google Translate remain out of reach for billions of people in the Global South who do not use Western languages. Researchers and startups from Africa and other parts of the world are working to change this. Mekdes Gebrewold, founder of the Ashagari consultancy in Addis Ababa, explained that even machine translation is not workable in Amharic, her language. As a result, people like her must hire professional translators instead of relying on tools like Google Translate. The language barrier limits a wide range of AI-powered applications, including autocomplete, transcription services, voice assistants, and content moderation on social media.
Modern AI tools operate as advanced autocomplete systems: they use training data to predict the most probable answer. Training data, such as the Common Crawl dataset with its billions of web pages, is therefore crucial for building AI models. The lack of data in certain languages poses a challenge, because the internet is dominated by a few languages, primarily English. Low-resource languages, including African, American, and Oceanian languages, account for less than 0.1% of the Common Crawl dataset. This affects billions of people globally who speak low-resource languages, including major ones like Hindi, Arabic, and Bengali. There is a clear bias towards European languages, which are significantly overrepresented compared to most Asian and African languages. For example, Dutch appears almost 700 times more often in the Common Crawl dataset than Amharic, even though the two languages have a similar number of speakers. Even so, researchers worldwide, not only in Silicon Valley, are developing AI-powered tools for their own languages.
Asmelash Teka Hadgu, co-founder of Lesan, a startup focusing on machine translation and speech technology for Ethiopian languages, described the company's approach. To overcome the lack of online resources, Lesan collaborates directly with the community, particularly students, who are passionate about their language. Together they curate and translate high-quality datasets from sources like books and newspapers, aligning the original and translated versions sentence by sentence to guide the machine learning process. Although Lesan cannot match the vast amount of English content available, it has already surpassed Google Translate's performance in Amharic and Tigrinya. Similar projects are succeeding worldwide for languages with smaller digital footprints. Ethnologue lists Amharic among the languages with "Vital" language support, indicating the availability of machine translation tools, spellcheck, and speech processing. Thousands of other languages, some with millions of speakers, have even fewer digital tools and less content.
African AI pioneers like Teka Hadgu are part of networks such as the Distributed AI Research Institute, GhanaNLP, and the Masakhane grassroots collective, which aim to enable local ownership of these technologies. Beyond Africa, researchers are working on languages such as Jamaican Patois, Catalan, Sudanese, and Māori. While some tech giants keep their models proprietary, initiatives like Hugging Face's global AI collective promote the sharing of insights and AI models, which empowers researchers worldwide to create solutions for their own languages. As Teka Hadgu emphasized, "Talent is everywhere, opportunity is not." By embracing and supporting researchers within their own communities, the benefits of AI technology can be harnessed and returned to those communities.