Look around any office in most large companies. If you spontaneously detonated all of the Apple-manufactured computing equipment in a 100-yard radius, there is little chance of any damage being done. Loosen your definition to include smartphones, and maybe you will hear a few pops, although each one is probably a personal iPhone going off while the company Blackberry remains unscathed.
It’s true that the iPhone has made inroads into the corporate marketplace, but the Blackberry still has its evil claws deep in the IT psyche, and in certain fields like Financial Services, where voice recording of mobile phones is a requirement, you won’t see a single iPhone.
So how can I justify my bold title about Apple’s new-found ability to morph the corporate world? It all comes down to Siri. Siri, as you will almost certainly know, is the voice assistant built into iOS 7, which allows you to shout annoying commands at your phone, and in many cases get some useful action: “Text my wife and tell her I am running late” is a useful 10-second time saver while running for a bus.
It’s what we call in the trade a “command and control” system, albeit a very sophisticated one. As a designer you try to limit the types of things people can say, the words they can use, and, hopefully, how they say them (enunciating like a BBC presenter has been known to help). A basic version of this has been built into the iPhone itself for a while (known as Voice Control), where the system looks for a simple first command (eg “Play Album”), and then knows that the next phrase has to be limited to a small vocabulary, ie one of the albums on the device. The fewer albums, the more accurate it will be.
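To make the two-stage idea concrete, here is a minimal sketch in Python. It is emphatically not how Voice Control is implemented: it works on already-transcribed text rather than audio, the function name `interpret` and the `cutoff` value are my own inventions, and fuzzy string matching from the standard library's `difflib` merely stands in for the acoustic scoring a real recogniser would do.

```python
import difflib

# Assumption: a single fixed first-stage command, as in the "Play Album" example.
COMMAND = "play album"


def interpret(utterance, albums, cutoff=0.6):
    """Two-stage matching: spot the command, then match what follows
    against the small vocabulary of album names on the device.

    Returns (command, album) or None if nothing matches well enough.
    """
    text = utterance.lower().strip()

    # Stage 1: look for the simple first command.
    if not text.startswith(COMMAND):
        return None

    # Stage 2: the rest of the phrase must be one of the known albums.
    rest = text[len(COMMAND):].strip()
    by_lowercase = {album.lower(): album for album in albums}
    matches = difflib.get_close_matches(
        rest, list(by_lowercase), n=1, cutoff=cutoff
    )
    if not matches:
        return None
    return (COMMAND, by_lowercase[matches[0]])


albums = ["Abbey Road", "Revolver", "Thriller"]
print(interpret("Play album abbey rode", albums))
```

Because the second stage only ever chooses between the albums it has been given, a slightly mangled phrase can still land on the right one, and the shorter that list, the fewer near-misses there are to confuse it with.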
Siri’s requests are more sophisticated, and the potential vocabulary is much larger, so the iPhone does not have the capacity or processing power to deal with the voice recognition and the subsequent request itself, which is why you need Internet connectivity to use Siri. The voice commands are sent to Siri Central, and either a set of search results is returned, or a set of commands for the phone to act upon.
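For the shape of that round trip, a rough sketch in Python follows. Everything in it is hypothetical: I have no knowledge of the real protocol, so `SearchResults`, `DeviceCommands`, `fake_server` and `handle_reply` are illustrative names only, and the stand-in server receives text rather than audio. The only point it makes is that the phone gets back one of two kinds of reply and treats each differently.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class SearchResults:
    """Hypothetical reply: things for the phone to display."""
    items: List[str]


@dataclass
class DeviceCommands:
    """Hypothetical reply: actions for the phone to carry out."""
    actions: List[str]


ServerReply = Union[SearchResults, DeviceCommands]


def fake_server(transcript: str) -> ServerReply:
    """Stand-in for the remote service (assumption: it sees text, not audio)."""
    text = transcript.lower()
    if text.startswith("text my wife"):
        # A request the phone can act on directly.
        return DeviceCommands(actions=["compose_sms", "send_sms"])
    # Anything else falls back to a search.
    return SearchResults(items=["web results for: " + transcript])


def handle_reply(reply: ServerReply) -> str:
    """On the phone: either run the commands or show the results."""
    if isinstance(reply, DeviceCommands):
        return "run: " + ", ".join(reply.actions)
    return "show: " + ", ".join(reply.items)


print(handle_reply(fake_server("Text my wife and tell her I am running late")))
print(handle_reply(fake_server("Best pizza nearby")))
```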
On balance, I find it bloody irritating shouting into my phone, although probably not as annoying as I find it when other people shout into theirs. Siri undoubtedly has its place (and in accessibility terms alone, I think Siri is an amazing achievement), but in its current form it’s just a step along the way to the “always on” phone that you can, in effect, have a conversation with, and which is always available to use (the Moto X is the first phone to feature this). It very much crosses over into the Glasses concept, where you have a display that you can always see and so need a Human Interface Device that can be driven entirely by voice.
Siri is not, as a tool, changing the corporate environment one jot. What has changed, though, is that people are suddenly waking up to the fact that voice can be a useful tool. It certainly hasn’t reawakened a demand for voice dictation à la Dragon Dictate, but it has opened users’ eyes to the possibilities of what voice could do.
I only notice this because I spend my entire life talking to people about taking phone calls and turning them into some sort of meaningful text. You always have a number of hurdles to get over when approaching new customers, and a major one used to be the belief that it was not technologically possible to get a computer to understand what is being said on a telephone call. Since people have seen and heard about Siri, that objection comes up distinctly less often. People often refer to Siri when I am giving presentations, citing it as an example of what I am explaining (thank you, Steve Jobs).
All of a sudden, voice is seen as a rich source of data, in the same way we now look at email, as opposed to something ephemeral that disappears once it is said (or worse, something stored in a prehistoric call recording system that has to be resurrected at great cost if a dispute arises). And in ten years, every phone call we make will be stored for us as text, almost instantaneously, and kept alongside our emails.
The irony is that the technology that powers Siri is poles apart from that which transcribes telephone calls. While the basic maths is the same, pretty much every other element of the system is different, starting with speech quality (telephone speech is recorded in a 1972-era codec) and moving all the way up to the size of the vocabulary and the way people speak.
But I am not going to let that deter me. Hats off to Apple (so long as they don’t come and try to take a bite out of my market…)