Ten years ago, the most exciting thing about speech technology was the ability to dictate text without needing a secretary to transcribe it. But even then, the results were inconsistent. You undertook the task fully aware that you’d spend as much time proofing and editing as if you’d just typed the document in the first place. And if you happened to have any kind of accent, you really hadn’t a hope without significant training.
Since then, voice technology has become more mature and more mainstream, but it is still relatively limited in its application and its availability to ordinary developers. We have seen vast improvements in accuracy coming out of research done by the likes of Facebook and Google, but for a developer, the cost of transcribing anything beyond small amounts of audio remains very high.
And, crucially, it is not always possible to manage the security of speech that is being processed in the cloud. This has even led to some companies, like Google, offering differential pricing if you allow them to use your data for training. But as a consumer, how much protection would this offer you if a cash-strapped start-up used the cheapest option without your knowledge?
So, what can we expect to see from speech tech in the short to medium term? And why does security matter?
The potential for the application of speech technology has broadened significantly in the past few years, particularly in the wake of the COVID-19 pandemic. Here are three key areas likely to experience the most significant uptake of the tech in the short to medium term: