Experiences with designing for voice interactions

Joeri Van Cauteren
Oct 21, 2020

Interest in voice-driven devices and voice interfaces has evidently increased over the last few years, with many companies producing their idea of a virtual assistant. Most recently, Apple announced the launch of its new HomePod mini — a smaller version of the original HomePod with a smaller price.

A voice experience differs from a traditional user experience because we’re invited to talk to, and interact vocally with, a non-human. It’s a huge shift from ordinary human interaction and demands a lot from the device. We’ve been experimenting with voice for quite some time now and bought a Google Nest Mini a while back to tinker with. Our exploits with it made us wonder what we could do with voice assistants and what the future of voice experience could be.

The bad press

A quick internet search will reveal multiple news reports and horror stories about companies listening in on people’s lives through voice-driven devices. These devices are allegedly recording more than they should and have sparked paranoia over the infringement of privacy. As writer Adam Clark Estes puts it: “By buying a smart speaker, you’re effectively paying money to let a huge tech company surveil you.”

Everyone knows someone who’s had an experience like the following:

“Last week, I was at [someone’s] house and they have a [voice assistant]. We were talking about travelling to [a very nice country] but the flights were too expensive. Suddenly, I received a [email/telephone call/message/advertisement] offering discounted flights to [destination], which I of course ignored. As the discussion continued, I mentioned that I would book a flight if the price was [a specific amount] or less. Suddenly, I received a new [email/telephone call/message/advertisement] offering a flight at my desired price. Freaky. We hadn’t searched anything online. We’d only talked about it.”

It’s usually worth questioning the credibility of these stories since, more often than not, people simply fail to recall that they did perform a quick search on the specific topic a while back. Especially in a context with friends, it’s normal to look up similar interests that have come up in conversation, which leaves a record. Whether it’s true or not, such stories frighten people.

Despite the bad press and horror stories, the market for voice assistants keeps growing steadily.

Our experience

So, what about our exploits with voice experience? We’ve tried a lot: ordering things, checking prices, annoying other people, asking a voice assistant to read a dictionary and then leaving the room… You name it, we’ve probably tried it.

After testing quite an extensive variety of activities and functions with voice assistants, we’ve noticed a few things:

1. Humans are inherently bad at listening

Humans have a short attention span, especially when it comes to listening. Long answers from voice assistants are lost on us and inevitably useless. Short and simple answers work best.

2. Voice is only a nice gimmick

Voice assistant devices are marketed to the masses as life-changing gadgets that can turn your home into a futuristic voice-controlled paradise. In reality, they’re a gimmick that lets you show off in front of your guests for a few fleeting moments. Although voice is a feature that can add to traditional experiences, it will seldom live up to expectations as a standalone experience.

3. An inadequate ecosystem

Ideally, a smart home ecosystem can be created by placing and using multiple voice assistant devices in various parts of a home. However, these virtual assistants often fail to support you with everything as they’re usually incompatible with each other. Many of these devices tend to run on different platforms, which is rather annoying since the one thing you need might not be connected to the voice assistant.

4. Long numbers don’t work

In general, people are terrible at remembering numbers. Although memory training has taught some people to remember more and longer numbers, these people are exceptions and not the average user. In addition, the written and spoken forms of a number often differ. Furthermore, a lot of languages don’t read numbers from left to right. In Dutch and German, for example, the number 98 is not spoken as ninety-eight but as eight-and-ninety (achtennegentig, achtundneunzig). These discrepancies and variations often confuse voice assistants, and long numbers simply don’t work.
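To make the word-order point concrete, here is a minimal Python sketch that spells out a two-digit number the English way and the Dutch way. The function names and word lists are our own illustration, not part of any voice platform, and they only cover 21 to 99 (excluding multiples of ten).

```python
# Illustrative only: hypothetical helpers, limited to 21-99 without round tens.
EN_UNITS = ["", "one", "two", "three", "four", "five", "six", "seven", "eight", "nine"]
EN_TENS = {2: "twenty", 3: "thirty", 4: "forty", 5: "fifty",
           6: "sixty", 7: "seventy", 8: "eighty", 9: "ninety"}
NL_UNITS = ["", "een", "twee", "drie", "vier", "vijf", "zes", "zeven", "acht", "negen"]
NL_TENS = {2: "twintig", 3: "dertig", 4: "veertig", 5: "vijftig",
           6: "zestig", 7: "zeventig", 8: "tachtig", 9: "negentig"}


def _split(n: int) -> tuple[int, int]:
    """Return (tens, units) for the small range this sketch supports."""
    if not 21 <= n <= 99 or n % 10 == 0:
        raise ValueError("this sketch only handles 21-99, excluding round tens")
    return divmod(n, 10)


def english_words(n: int) -> str:
    """English says the tens first: 98 -> ninety-eight."""
    tens, units = _split(n)
    return f"{EN_TENS[tens]}-{EN_UNITS[units]}"


def dutch_words(n: int) -> str:
    """Dutch says the units first: 98 -> acht + en + negentig."""
    tens, units = _split(n)
    unit_word = NL_UNITS[units]
    # After a unit ending in "e" (twee, drie) Dutch spelling uses "ën".
    joiner = "ën" if unit_word.endswith("e") else "en"
    return f"{unit_word}{joiner}{NL_TENS[tens]}"


print(english_words(98))  # tens before units
print(dutch_words(98))    # units before tens
```

A recogniser that transcribes digits in the order it hears them would turn the Dutch form into 89 rather than 98, which is the kind of mix-up described above.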

5. They’re not for financial services

Voice assistants aren’t for financial services. Have you ever trusted your voice assistant with financial instructions or information? Ever asked what your current account balance is in the midst of a workout? We haven’t either. Most of us don’t and won’t. Furthermore, following on from the previous point, giving financial instructions with long account numbers almost always ends with the device being destroyed. If someone could be convicted of the attempted murder of a voice assistant, we would’ve been in a lot of trouble.

6. Ideal for news and sports

Voice assistants are pretty much ideal for catching up on news and sports. The information we receive is short and simple, and we’re tuned in to what the voice is saying because we’re interested. In the chaotic modern world, where there never seems to be enough time or hands, spending the time and effort to actually read news headlines and sports articles just doesn’t seem feasible. Listening is an easy and very welcome alternative. In addition, many of us are in the habit of listening to news and sports on the radio or in podcasts, so this experience feels familiar.

Our advice

If you’re considering working with or designing a voice experience, our advice is:

1. Question its necessity

Ask yourself: do we really need it or is it just a gimmick? If the answer is the latter, focus on the more important aspects first before turning your attention to this fun feature.

2. Adapt to your target audience

Check who your target audience is and adapt to them. Consider the characteristics of the users who are likely to use it most frequently: age group, attitude, use of language.

3. Work around numbers and complicated instructions

As mentioned above, numbers and complex instructions are quite a challenge for voice assistants, so avoid them or chunk them up. Design the voice assistant so that it asks for step-by-step confirmation on a device that users can read and physically interact with (e.g. smartwatches, phones, tablets). If this is beyond the voice’s capability, ensure that it can hand off the complex instructions to another device.
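As a rough sketch of what chunking with step-by-step confirmation could look like, the Python below splits a long number into short groups and builds one confirmation prompt per group. The function names, the group size of four and the prompt wording are all assumptions made for illustration. How the prompts get spoken or shown on a screen is left to whatever platform you use.

```python
# Illustrative only: hypothetical helpers, not tied to any assistant SDK.

def chunk_digits(number: str, size: int = 4) -> list[str]:
    """Split the digits of `number` into groups of `size`, ignoring spaces and other separators."""
    if size < 1:
        raise ValueError("size must be at least 1")
    digits = [ch for ch in number if ch.isdigit()]
    return ["".join(digits[i:i + size]) for i in range(0, len(digits), size)]


def confirmation_prompts(number: str, size: int = 4) -> list[str]:
    """Build one short prompt per group, with digits spaced out so they are read one by one."""
    chunks = chunk_digits(number, size)
    return [
        f"Part {index} of {len(chunks)}: {' '.join(chunk)}. Is that correct?"
        for index, chunk in enumerate(chunks, start=1)
    ]


for prompt in confirmation_prompts("1234 5678 9012"):
    print(prompt)
```

Each prompt stays short enough to follow by ear, and the same list could just as well be sent to a phone or smartwatch for the user to confirm by tapping.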

4. Consider your ecosystem

Watch out for the voice’s compatibility with other platforms and devices. If you design it to support as many platforms and devices as possible, users can use the voice assistant seamlessly regardless of which ones they own.

5. Protect user privacy

Be clear and obvious about how users’ privacy is protected, even when they provide sensitive information. Voice experiences differ from traditional ones in that they can ask users to confirm sensitive information. Adding this friction helps users trust that you haven’t been listening in. Whether this is actually true or not is another matter we won’t address here.

Don’t listen in unless it’s been made clear to the user and they have given explicit consent!

6. Avoid critical decisions

We don’t recommend using a voice assistant for critical decisions unless it’s been specifically designed for that purpose. In the worst-case scenario, a voice assistant’s misinterpretation of instructions can result in death. That’s why humans pilot airplanes and why there are so many different buttons on the control panel.

7. Prototype, validate and iterate

This is the only way to develop a voice assistant that is used frequently and actually matters.

Conclusion

Voice experience has definitely found its place in our ecosystem of interaction. However, it’s not an interaction that lends itself to an exact copy of what we are accustomed to designing. Voice experience requires a tailored and considered approach for the medium itself. By taking that into account, voice interactions can be taken to a whole new level. We hope our experience and advice can help you to design a voice experience that truly engages users and adds value.
