When a robot rings your phone, you can usually tell right away. Its voice is melodic, it rarely stumbles, and it’s unnaturally efficient. The voice betrays its origin before it even has the chance to tell you that you qualify for a free loan, your mortgage payment is overdue, or that your input would really be valuable for a customer survey. Knowing it’s a robot also makes it easy to hang up.
The minds behind Google Duplex are in the process of changing that paradigm, for better or worse. Announced Tuesday at Google I/O, the company’s annual developer conference, Duplex is new technology that enables Google’s machine intelligence–powered virtual assistant to conduct a natural conversation with a human over the phone, mimicking the chit-chattiness of human speech as it completes simple real-world tasks.
It was shown off during the keynote event, and even though the onstage demo was prerecorded, seeing and hearing the concept in action floored the audience. In the first demo, a woman calls a hair salon, where another woman answers the phone; the two go back and forth for approximately a minute before they figure out a time that works for a hair appointment. In the second demo, also about a minute, a man calls a restaurant to book a reservation; the woman on the receiving end has a heavy accent and isn’t offering the best information, so the caller pivots to make a new request.
The big reveal was that neither of the voices that initiated the calls belonged to a human. They were bots, dispatched through Google Assistant and activated through a back-end system. But they sounded human: They said “Um” and “Ohh, I gotcha” and ended query statements with the raised pitch of a question mark. And, for the purpose of the demo, they completed tasks that normally fall to us mere mortals, whether that meant making a hair appointment or determining whether it would be better to just walk into a restaurant and take a gamble on a table.
For Google, Duplex marks the next big step in natural-sounding, fully autonomous robot conversations. For the rest of us, it straddles a fine line between being enormously convenient and eerily deceptive. Google still hasn’t launched this feature, which will work in Assistant on phones and compatible smart speakers. The company plans to begin testing Duplex publicly this summer. In the meantime, there are at least a few features it needs to consider, including how the Assistant will announce itself to unsuspecting humans on the other end.
Mr. Roboto Calling
Duplex was first launched as an experiment several years ago, Google says, and was started by principal engineer Yaniv Leviathan and Yossi Matias, vice president of engineering. (One person within the company indicated that it started as a 20-percent project, though a Google spokesperson declined to say whether it fell within those parameters.) Duplex brings together natural language processing, deep learning, and text-to-speech technology into one service. The part that resonates most, though, is the “natural” bit—the engineers have trained the Duplex model to match expectations around latency, like pauses after someone says “Hello?”, and to change intonation depending on how the conversation flows. In other words, to react the way humans do when speaking on the phone.
It’s a reversal of the familiar bot dynamic of a human calling a vendor, like a bank, and having to deal with a computer on the other end.
“Usually when people talk to a computer, they have a goal and they’re basically willing to do it the computer’s way,” says Alexander Rudnicky, who researches human-computer speech interaction at Carnegie Mellon University. “In this way, it’s turning it around. It’s a computer going out and trying to convince a human they should try to talk to them.”
“The technology is remarkable,” says John Havens, executive director of the IEEE Global Initiative on Ethics of Autonomous and Intelligent Systems. “But I showed [the demo video] to my wife and she said, ‘Which one’s real?’ And there lies the rub.”
One of the things that was obviously missing from the Google I/O demo was any sort of announcement on the part of the Google Assistant that it was, in fact, a virtual assistant and not a human. The phone calls from Duplex will originate from Google’s back-end system, not from your own phone number, which might be on file with some of the businesses or services where you’re a regular customer. And that’s where problems arise, Rudnicky says. “It should be saying, ‘This is so-and-so’s Google Assistant,’ or something else that clearly identifies it as a machine, assistant, or human,” he says.
Google did not immediately respond to WIRED’s questions on how this tip-off will work, but did tell CNET the Assistant will “likely tell the person on the other end of the line that he or she is talking to a digital personal assistant.” It didn’t offer details about exactly how that will happen.
When dealing with bot phone calls, there’s not only the question of ethics, but also of etiquette. A recording of the Google Assistant-made call won’t be available to you, the human, after it’s been placed, so there’s no way of knowing if the automated call went off the rails in some way. (Maybe it becomes apparent when your hair salon or favorite restaurant suddenly blacklists you.)
And what happens if humans start to outsource their most uncomfortable calls to Google Assistant? Right now, Google says it is limiting the Duplex technology to very specific domains, but Havens sees potential for expansion of those limits. “Pretty soon it’s not going to be hard for someone to type in the words to have a virtual assistant break up with their boyfriend,” he says. “Or, ‘call my elderly mom this weekend.’ I’m being a little hyperbolic, but we’re actually here.”
And any kind of automated call system could lend itself to abuse. It’s easy to imagine someone trying to program the Assistant to spam-call a business, for example. Google said it’s ensuring that a single user won’t be able to place more than a certain number of calls per day, nor will they be allowed to make several calls to the same business, though the company declined to say what that call limit is. It also said it’s “looking at patterns” to spot anything spammy.
It’s clear there are still many unanswered questions about Duplex and how it will work, questions that even the most sophisticated of virtual assistants can’t answer just yet. But Rudnicky points out that these kinds of systems have been in the works for more than a decade in verticals like health care, and that there are benefits in this kind of on-the-fly customization from an AI. “[It] can have a much broader interaction with the person,” he says.
Rudnicky also says that, broadly speaking, technology like this is something people are only going to become more aware of. He cites an anecdote of when ATMs were first introduced and he observed someone talking to one as though it were a replacement for a human bank teller.
“If you don’t know the way it works just yet, you react to it in ways that you’re used to, like it’s a person,” he says. “But I bet you the very same person today would have no problem dealing with an ATM. We just assimilate this interaction into our culture. And I think the same thing’s going to happen with this stuff.”