From ghostwriters to speechwriters to the insane things that come out of Stewie Griffin’s face, talented scribes have been putting words into people’s mouths for ages.
However, in the era of algorithms and automated everything, one might wonder: Who are the people putting the words, phrases and witty retorts into the “mouths” of robots? More specifically, who writes for today’s virtual personal assistants such as Amazon Alexa, Apple’s Siri and Microsoft Cortana? After my fireside chat with Alexa and pitting Siri vs. Alexa in a productivity duel, my mission became clear: to track down someone who oversees writing for these matriarchal automatons of the wireless world. My hope was to discover who these puppet masters of prose are and to find out how they ultimately choose the words that surface in the world of AI.
To this end, I was lucky to score an interview with Deborah Harrison, one of the original architects for the personality of Microsoft’s Cortana. As a Senior Content Experience Manager at Microsoft, Harrison has been one of the most visible, thoughtful and articulate advocates out there for digital home assistant technology, making appearances at events such as RE-WORK Summit and in a piece by Mashable. She currently leads the team that comes up with the responses for Cortana, launched in 2013, after being the solo scribe asked to craft Cortana’s core principles and to define her approach to communication.
In our revealing chat, Harrison gives an insightful peek behind the curtain to show what it takes to craft the conversation between humans and machines. She also offers up a glimpse at the considerations her team makes in the realm of transparency, and what writers can expect from a gig that is unlike any before it.
Here’s what Harrison had to say about writing for robots…
So tell us, how is writing for a virtual personal assistant different from other kinds of writing?
We fall under the “technical writer” banner generally speaking, but the areas of expertise for the people on my team tend much more toward the creative than might be the case in other writing disciplines in tech. Being able to write successfully for a digital assistant that has a clearly articulated and defined persona requires the ability to write dialogue. We receive one half of the equation in the form of queries and then we provide the other half: the response to the query.
You’ve been part of Cortana’s growth from the start… How did it work at the beginning?
Early on, even before we started writing at all, there was no “we”: It was me, a writer, on this one project. We had to figure out what Cortana would sound like, how Cortana talked. And how would that be different from how Microsoft talked, how Windows talked.
So the first thing that I worked on were these “task areas,” setting an alarm or creating a reminder. More back-and-forth type dialogue where you say one thing and they say another. But the area that we’re most known for [now] — the part where we’re most prominent — is the area of more casual conversation where you’re not necessarily trying to accomplish something. Defining and expressing character is really a key part of what we do.
What is the most fun part of your job?
The fact that I get to write for a living at all. I love that that’s my job… The team gets together three times a week for an hour to an hour-and-a-half at a time and we go through queries and decide how we’re going to respond to them. Even when we’re neck deep into something really challenging or difficult or heady or heavy, we explore the boundaries of what’s going to work. And to do that, we have a writer’s room environment like on a TV show or something. We throw out so much stuff to see what sticks… and consequently, we laugh our asses off all the time.
How many different responses do you have to write for any given search inquiry?
Only one. But sometimes we do more. We want to be able to have an answer for anything that anyone is likely to ask, but we don’t have any kind of minimum quota or anything. There are times when we think variation is valuable… or fun. Like “Tell me a joke” for example. It’s one of the things people ask most often and have since the beginning. We want that to be fresh. We know that people’s taste in jokes is pretty subjective, and we’re not going to nail everybody’s sense of humor every time. But we want to have a variety out there so we have dozens of jokes and many joke categories. Whereas if you say, “Are you a ninja?” One is probably enough…
Does the writing side push the technology forward, or vice versa?
Yeah, totally. It’s a closed-loop collaboration with the tools team. Sometimes they ask us for stuff because they say we created the ability for you to do this. And other times we’ll say we want the ability to do this. It’s a two-way street.
How do you factor in human behavior when you’re writing responses for Cortana? Are there studies involved or is it natural flow from what comes out of the writer’s room?
We’re very, very principled about it. There are situations where we are so steeped in Cortana’s core values and principles that it requires very little conversation to know that we’ve landed on the right thing. But in any circumstance where we have any question about how Cortana should and would respond, those converge ultimately even though they may start out as separate questions.
We have a meeting every week that is only for principles. We typically don’t do a lot of writing in that meeting. We just work through the mechanics and the value structure around a given set of circumstances that we need to consider.
Do algorithms create their own unique sentences that haven’t been written at all?
We’re exploring that kind of technology in general at Microsoft, but all of Cortana’s responses are handwritten so far.
How far away are we from truly conversational virtual personal assistants?
There are conversational bots out there all over the place, that’s a thing. As far as digital assistants are concerned, I will speak for me and not for Microsoft and say that I suspect it will be a long-to-indefinite time before no human interaction is required to have a satisfying interactive experience…
In terms of making sure that the conversation experience is truly helpful, I think humans will have a role to play in that for quite some time.
Where do you sense the future is headed in terms of brands sponsoring answers that these virtual personal assistants provide?
I know people are developing skills all the time. And of course each of those skills could be branded. If you go now, there’s a Domino’s skill. So if you say, “Cortana, order me a pizza” and you’ve set up the Domino’s skill, then it would go through Domino’s. Is that sponsored content or a different category?
How far away would you say we are from the day when answers provided are backed by brands and maybe aren’t indicated as “sponsored”?
I don’t see any reason why someone couldn’t do that now… I bang this drum a lot, but transparency is a really big deal. It’s tempting to say it’s an even bigger deal now after the revelations of the last two weeks. But I’ll say this, and I hope it isn’t couched in smugness: That was a day-one thing for me. In the original personality brief I had for high-level qualities that I wanted Cortana to be imbued with… that was one of them.
Do you think the Cambridge Analytica scandal will impact the future of virtual assistants? In terms of how data gets collected and shared?
I would imagine yes, but it’s a thing that needed attention before the revelations came to light as well… Speaking for my team, we’ve never not taken that incredibly seriously. Our purview is around establishing and maintaining trust, which is so fragile for people. So to be in a position to try to help establish trust is something that my team feels very privileged about.
Putting people in the position of having a sense that what they’ve asked for and what they’re getting is what they would expect based on what we told them before. If we maintain our level of trust even in situations where the stakes are incredibly low, then we can have more credibility in the places where the stakes are high. That’s the hope.
What kind of person would you say is an ideal hire to write for a virtual personal assistant such as Cortana?
We look for people that have comfort in the creative space. We’ve got playwrights and poets and novelists and some screenwriters… People who have free time in their careers gravitate toward this kind of work. Another quality that I think is important is the ability to sit long enough with the concept even to the point that it might feel uncomfortable, so that you can see things as much as possible from various points of view… In the room, we need to be quite free to disagree. Having a voice that’s advocating for a point of view that may be sensitive or marginalized tends to have the stronger claim to this table. Not because we bow to sensitivity for its own sake, but that’s how we create a dialogue that feels compassionate.
Is there anything you’d want or need to see in a portfolio for a writer going out for a job in this realm?
We do like to see examples of something that’s creative. Personally I’ve seen a lot of creative stuff in technical writing as well. It’s not an uncreative medium by any means, but things outside the software universe can be helpful to see. The other thing we like to see is examples where you can show that you thought about the voice. It doesn’t need to be a personified humanoid creature type of situation. But even for people who’ve worked at a nonprofit for example, there may be a style guide of sorts, and you’re expected to write within a certain tone (sort of like a brand). So writers who’ve had the experience of embodying the voice of an entity as distinct and clear, and can articulate what that looks like, that’s something that tends to perk up our ears when we see it.
If you had to pick one skill that is the most important talent for this type of writing, what would it be?
Can I do two? If so, I’ll go with courage and open-mindedness. It takes a lot of courage to be self-reflexive enough to explore different points of view with honesty and sincerity in particular and to know your own limitations. But if I had to pick just one, it would be a sense of wonder at how people communicate… That would be pretty important.
Final question. On a scale of 1 to 10, where would you say we are on the spectrum of how evolved the cultural conversation is between man/woman and machine?
Tip of the iceberg. I think we’re just starting to find our feet here. There’s so much interesting stuff that is just ready to be explored. I’m so eager to be a part of it. It’s fascinating intellectually, right? How do you break down human interaction into something that is curate-able? What component can be distilled in that way? That in and of itself is a fascinating process.