Offering over 100 languages, Stepes is betting on the development of a major machine translation infrastructure based on myriads of human interpreters
The translation industry is undergoing massive changes. In the age of big data and automated, algorithmic analysis, entrepreneurs are in an arms race to create the most authoritative and globally encompassing machine learning tool for translation, one that connects individuals and businesses around the planet.
It is a massive intellectual and data challenge, as popular tools like Google Translate (103 languages) or Microsoft Translator (52) rely on terabytes of data to intelligently produce machine translations (MT). Despite their massive market share, they are generally considered inferior to the machine learning platforms that startups are bringing into the industry to improve an AI’s ability to translate from one language to another.
Stepes fancies itself an “Uber for translation” where freelancers can earn some pocket money or substantial cash offering professional services, similar to the setup of Israeli content network Fiverr.
Targeting online retailers like Amazon and Flipkart, Stepes offers item description and customer review translation to better serve sellers.
Founder Carl Yao brags to Geektime about Stepes’ mobile platform, explaining that the chat interface makes it easier for both translators and users to have a more natural-feeling back and forth, similar to how one would converse over WhatsApp or another chat service. That ease of use gives translators a fast and easy way to put some cash in their pockets, especially compared with a desktop browser-based service that is harder to interact with on the go.
“Customer reviews are crucial; you want it to be translated by human translators. You can’t put out subpar-quality content.”
The human element to improving machine translations
Stepes is an outgrowth of five-year-old company TermWiki, which translates professional vocabulary and technical terms into over 100 different languages according to a list provided to Geektime. The same army of interpreters from TermWiki will now service Stepes, bringing together some 50,000 bilingual subject experts and translators from around the world.
The global machine translation market could be worth $1 billion by 2022 according to Grand View Research, and Stepes is not the first service to step into it. But unlike Portugal-based Unbabel, which starts every project with a machine translation and lets humans edit the result, Stepes works in reverse: human translators come first. Not every language has the benefit of MT, largely because very little online content exists from which algorithms can assemble translations.
“How many people speak two or more languages? More than half of the world,” says Yao in a claim backed by the Encyclopedia of Bilingualism and Bilingual Education. “There is a huge pool of language talent that isn’t being utilized. We’re giving them a mechanism to make translation accessible and [more] easily performed.”
That is why Yao and Stepes are doubling down on bilingual subject-matter experts to work on projects between widely spoken languages like English and far more obscure languages from less connected areas of the world. They are building a basis for future machine translations in languages which currently have less digitized content.
“We have two ways of leveraging machine intelligence (MI): MT and translation memory (TM). Every time you translate a sentence, that sentence and its translation are saved into the database. The next time that same sentence comes up, you just pull it up. It 1) saves costs, 2) preserves consistency, 3) shortens the translation cycle.”
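The translation memory Yao describes can be illustrated with a minimal sketch. This is not Stepes' implementation, which the company did not detail; it assumes an exact-match lookup keyed on language pair and sentence, and the names TranslationMemory, translate and human_translate are hypothetical, chosen only for illustration.

```python
from typing import Callable, Dict, Tuple

# Key: (source language, target language, normalized source sentence)
Key = Tuple[str, str, str]


class TranslationMemory:
    """Exact-match translation memory (illustrative sketch, hypothetical API)."""

    def __init__(self) -> None:
        self._store: Dict[Key, str] = {}
        self.hits = 0    # sentences served from memory
        self.misses = 0  # sentences that needed a translator

    @staticmethod
    def _normalize(sentence: str) -> str:
        # Assumption: collapsing whitespace is enough to treat two
        # sentences as "the same"; real systems often use fuzzy matching.
        return " ".join(sentence.split())

    def translate(
        self,
        sentence: str,
        src: str,
        tgt: str,
        human_translate: Callable[[str], str],
    ) -> str:
        key = (src, tgt, self._normalize(sentence))
        if key in self._store:
            # Seen before: reuse the stored translation.
            self.hits += 1
            return self._store[key]
        # New sentence: ask the translator, then save the pair for next time.
        self.misses += 1
        translation = human_translate(sentence)
        self._store[key] = translation
        return translation


# Usage: the stand-in translator below is only called for the first request.
calls = []


def fake_translator(text: str) -> str:
    calls.append(text)
    return "Hola, mundo"


tm = TranslationMemory()
first = tm.translate("Hello,  world", "en", "es", fake_translator)
second = tm.translate("Hello, world", "en", "es", fake_translator)
```

The second request never reaches the translator, which is where the three benefits in the quote come from: no repeated cost, an identical translation every time, and no waiting on a person for text that has already been handled.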
This approach lets them offer services in languages that Google Translate and Microsoft Translator don’t offer, like Cantonese (Hong Kong Chinese – ZH), Guarani, Frisian, Kirundi, Lingala and Luganda. They also provide for regional variants of Portuguese (Brazil), Spanish (Latin America), French (Quebec) and English (U.S., U.K., Australia). While Stepes has made a lot of progress, it is still missing some languages that its machine-based competitors do offer, like Afrikaans, Haitian Creole, Kyrgyz, Sinhala and Sindhi.
“There is a severe shortage of qualified translators. There aren’t enough quality translators for gaming/entertainment, so they often use general purpose translators,” according to Yao. He points to the medical industry as another field that lacks specialization, noting that a “general purpose translator doesn’t necessarily have expertise in medical terminology.”
Stepes vs. Google Translate: just a few smartphones to get off the ground
Stepes still faces the same lack of digital content from the internet that plagues Google’s ability to bring some languages online. While some 40 million people speak Odia in India, there is no MT for it, likely because of the lack of connectivity in areas where the language is spoken. If you’re aiming to be an Uber for translation, you need to have internet and mobile penetration to bring in a strong corps of interpreters to make your machine learning authoritative.
Yao acknowledged this was an issue with the business model, but argued the simple messenger-chat-based platform will make it easier for the app to penetrate low-income areas, which will likely only have simpler cell phone models. Stepes isn’t prepared to launch a massive nanosatellite internet project like Facebook, but they’re optimistic.
“But mobile messaging, talk about tech penetration, it is the best area where tech has made advancement in even the poor parts of the world.”
But beyond that, Stepes will have another major tool in its repertoire that could give it a major boost in the industry: speech recognition.
“And with the mobile approach, speech recognition has become very advanced . . . let translators talk directly into the phone. This type of mobile-based translation solution allows thousands to translate in real time.”
This is not exactly a no-brainer. Creating speech recognition tools is a massive undertaking, much more so for languages beyond the world’s more standard fare. That’s probably why the speech recognition market is valued at $6.21 billion. Instead of fumbling with their thumbs, translators can go by the spoken word if they wish. But this author can see the benefits of pervasive text-to-speech translations for niche projects, such as preserving dying languages like Mandaic or creating language learning tools for the same tongues.
“Compared to Google Translate I think we’re ahead of the curve on those languages, because Google Translate takes a long time to develop a translation engine because they need to gather materials to extrapolate a translation. As long as we have translators with smart phones, they can immediately do quality translation.”
Yao keeps the names of most clients close to his chest, but says they include a major Eastern bank, a gaming company contracted to get round-the-clock translation services, and the White House project Let Girls Learn.
“Even though it’s the most commonly used, Google Translate still can’t do it accurately enough for business,” he tells Geektime, adding that, “For informal translations, I think Google Translate is great, but [businesses] simply can’t afford to rely on it.”
Stepes currently has 15 full-time employees and claims a legion of 50,000 translators for about 200 languages. Their main offices are in San Francisco, Beijing and Ahmedabad.
Referring to the industry, Yao says that, “It’s been using the same software and we think it’s time a disruptive technology can bring the industry to the modern world. It’s overdue for change. We think this mobile translation solution will do that, make it much bigger and mainstream instead of this cottage industry.”