Search Engines: Closing the Gap for Minority Languages
Thousands of the world’s languages are spoken by fewer than 100,000 people. At COSM 2021, Phil Parker outlined a plan for giving them access to information.
We’ve all consulted “Dr. Google” about a health ailment, to find a recipe, or to learn how to fix something. Sometimes it helps, sometimes not. But what if you asked Google something and it didn’t even recognize your language?
Phil Parker, speaking at COSM 2021, told the story of a woman in Ethiopia searching for “lump in breast” in one of the more than 80 languages or dialects spoken in the region. Her language was one of thousands spoken by only a comparatively small population.
The search engine did not recognize her input and returned no hits. She tried her query in Swahili but found nothing informative about “breast lumps.” Finally she searched in English, and multiple hits appeared. In effect, she had to understand English to access content on the web.
More than a third of the world’s perhaps 6,000 languages are spoken by fewer than 100,000 people (UNESCO), and twenty-five percent are spoken by fewer than 1,000 people (BBC). Added up, that’s a lot of people who may be shut out of vital information online.
Is there a way for this woman to gain access to information in her own language?
Parker, INSEAD Chaired Professor of Management Science and founder of Botipedia, presented his solution at the COSM 2021 conference last week. His system combines a decentralized “content engine” with natural language understanding.
In previous work, Parker developed an automated system that translates textbooks and other educational materials into languages spoken by small populations. This work caught the eye of the Bill & Melinda Gates Foundation, which funded a project that applied Parker’s automated authoring processes to translating agricultural and local weather information. About ninety countries now have access to his weather platform.
In his presentation, Parker pitched his idea for algorithmically filling the translation gap for all the other content on the web. Botipedia uses natural language learning and algorithmic search engine sifting to create what he calls a “content engine.” He predicted that this kind of decentralized and privatized system will be the future of the internet after Google: “But the hypothesis is that there will be content engines that write search engines in real time that you own, and you have full privacy because no log file will exist.”