Language Matters: How to Speak to Your Virtual Personal Assistant

It’s 2018 and virtual assistants are in full effect. It started with Apple’s Siri, but a few years later there’s Amazon’s Alexa, Google’s, well, Google Assistant, and, for those of us who survived all the upgrades to make Windows 10 work, Microsoft’s Cortana. Recognisable to most, these intelligent personal assistants have practically become regular residents in many of our homes, and for some, even a potential date.


But how do you speak to your Amazon Echo, Google Home or Sonos One? And what is the technology behind these devices that makes the rise of intelligent personal assistants even possible? Let’s take a look!

Ah, the acronyms

Because behind every great piece of technology, there is always an acronym, or TLA (three-letter abbreviation), to learn. Here are the basics: NLP, or Natural Language Processing, forms the crux of what Siri et al. are all about. It involves a combination of different techniques, depending on what you want to do, which is where some of the other acronyms come in.

First, here’s an old faithful we all know: AI, or artificial intelligence, the theory and development of computer systems that can perform tasks like speech recognition, language translation, decision-making, and visual perception in place of an actual human being.

There’s NLG — Natural Language Generation — which allows those assistants on which so many of us have come to depend to speak back to us. Technically, Alexa isn’t speaking to you: she’s software that produces narratives and reports in easy-to-read language, which is then read aloud to you. It just doesn’t feel like that.
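To make the idea concrete, here’s a toy sketch of the simplest form of NLG: template filling. Structured data goes in, a readable sentence comes out. The function name and template are made up for illustration; real assistants use far more sophisticated generation, but the in-and-out shape is the same.

```python
# Toy template-based Natural Language Generation (NLG):
# structured facts in, a spoken-style sentence out.

def generate_weather_report(data):
    """Turn a dict of structured facts into a readable sentence."""
    template = "Right now in {city} it's {temp} degrees with {condition}."
    return template.format(**data)

report = generate_weather_report(
    {"city": "London", "temp": 14, "condition": "light rain"}
)
print(report)
# Right now in London it's 14 degrees with light rain.
```

That final string is exactly what a speech synthesis engine would then read aloud to you.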

There’s also NLU — Natural Language Understanding — which, again, is fairly important when what we’re trying to do with our machines is communicate. NLU is software that extracts information from written text, the kind of thing we see in text mining and text analytics.

Getting past the acronyms of NLP, we also have speech recognition, software that understands or transcribes spoken language, and speech synthesis, software that speaks, or reads text aloud. Siri and her friends all use a combination of these to give us the answers we need; sometimes even to talk amongst themselves.
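The NLU step can be sketched just as simply: pull an intent and an entity out of raw text. The keyword rules below are invented for illustration; production systems use trained models rather than hand-written rules, but the task has the same shape.

```python
# Minimal sketch of Natural Language Understanding (NLU):
# map raw text to an intent plus an extracted entity, using crude rules.

def understand(utterance):
    text = utterance.lower()
    if "weather" in text:
        intent = "get_weather"
    elif "play" in text:
        intent = "play_music"
    else:
        intent = "unknown"
    # Crude entity extraction: grab the word after "in", if there is one
    words = text.rstrip("?.!").split()
    entity = words[words.index("in") + 1] if "in" in words else None
    return {"intent": intent, "entity": entity}

print(understand("What's the weather in Paris?"))
# {'intent': 'get_weather', 'entity': 'paris'}
```

Everything downstream, fetching the forecast, generating the reply, depends on getting this structured reading of your words right first.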


A user interface

A UI is really all that language is: the tool that connects our thoughts with our actions, how we express what’s going on in that grey matter inside our skulls, more often than not to get us what we want. Babies cry so that their parents know they need something, even if that something isn’t always clear; the way we speak to our phones, tablets, and laptops can feel fairly similar at times.


No one really thinks too hard when they ask Google to play their “doing adulting things” playlist while paying their bills, so it’s worth remembering what it used to be like trying to get what we wanted from our computers, to see how far we’ve really come. Back in the late 1990s, Ask Jeeves became the search engine of choice for many because a user could ask real questions, with the, and, if, and everything, without having to type in any particular jargon. It made it feel like you were speaking to an actual person, in this case an exceptionally intelligent butler, without having to think too hard about how to phrase the question. In a period when how-to pages were filled with suggestions on the best language to use in search engines to get the answers you really wanted, Ask Jeeves was ahead of the game.


Algorithms

These are everything, because really, that’s all these devices are doing: running a sequence of algorithms to give us our answers. The NLG systems we rely on act as a router that understands what information we seek and delivers it to us; it’s like having your very own team of data scientists sifting through detailed information so you don’t have to. The better and more tightly refined these algorithms become, along with everything else that’s needed to understand the average human shouting at their screen, the more it feels like we’re having a real, two-way conversation with our devices.
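That “router” idea can be sketched in a few lines: each intent maps to a handler, an algorithm that fetches or computes the answer. The handlers below are stand-ins invented for illustration; in a real assistant each one would be a serious piece of data-crunching machinery.

```python
# Sketch of the intent router: look up the right algorithm for a request
# and fall back gracefully when nothing matches.

HANDLERS = {
    "get_weather": lambda: "It's 14 degrees and raining.",
    "play_music": lambda: "Playing your playlist now.",
}

def route(intent):
    handler = HANDLERS.get(intent, lambda: "Sorry, I didn't catch that.")
    return handler()

print(route("get_weather"))   # It's 14 degrees and raining.
print(route("make_coffee"))   # Sorry, I didn't catch that.
```

Refining the assistant then largely means adding more handlers and making each one smarter, which is why these devices seem to improve so steadily.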


To the future?

Virtual doctors that we can speak to, who will diagnose our ailments and stop us Googling our symptoms to make sure we’re not dying? Pocket-sized personal shoppers who will absolutely never let us leave our homes without getting our outfits just right? Who knows? The advancement of NLP and NLG technology makes it feel like nothing is impossible.