10 July 2015

Tech.pinions: “The Voice UI”

As interesting as voice is as an interaction layer to most of us in the developed world, it may evolve to become central to those in the third world, particularly with things like smartphones. One of the primary problems, besides economics, to connecting the next billion humans to the internet is a lack of technical literacy and often the lack of literacy at all. There are massive pockets of humans who live in villages with maybe one TV and radio. Which brings up the interesting question of how would they use a smartphone even if they could afford one and the data plan attached to it? This is where things like voice as a user-interface may provide a solution.

Ben Bajarin

I’m constantly amused by these ‘analyst’ types who can extrapolate everything without a second thought. People in the third world can’t read, so it must be easier for them to talk to a smart device, right? Well, if the voice recognition only understands English and a couple of other languages, the device will be equally useless in their hands. It’s rather a vicious circle: without better education, people in less developed countries won’t be able to read or speak foreign languages, and thus won’t have access to better technology and tools. On the other hand, voice recognition is very expensive, both in development costs and in the system resources needed to run it live on devices, so big tech companies are unlikely to invest in voice recognition for exotic languages and for small countries with little chance of future revenue. Or is the author proposing that every country should abandon its cultural heritage and just adopt English?

Some of the recognition algorithms don’t even run on the device, but require access to servers and a relatively fast connection. That would make the ‘Voice UI’ next to unusable in remote areas with poor coverage, or even in cities for people who can’t afford first-class Internet. A much better solution would be a gesture-and-symbol-driven interface that can run on low-powered devices and be taught quickly to people unfamiliar with technology.

As it stands now, Siri supports about a dozen languages, Google Now fewer than 10 (surprising, as I would have expected it to be ahead in this area, although Google Voice Search partially supports around 50 languages) and Cortana about six.
