Event Recording

Mark Stephen Meadows - That Robot Overlord Is in Your Kitchen. Her Name's Alexa

If we look under Alexa’s hood and read between the technologies we find a disturbing reflection of our own identities and personal data. In your home Alexa is always listening and influencing your options. In your company’s product deployment Alexa is influencing your brand, your customers, and your user data.  We will discuss why this represents a geo-political shift more significant than the rise of social media. As a previous developer of Alexa skills and other AI systems I will share with you my lessons learned.

And we will examine alternatives.

Good evening and thank you for coming. Hi Ruben. It's great to see some familiar faces. I'm so excited to come back here. This is an amazing conference that's got so much going on, with so many different aspects of identity being explored, and this really great community with all these new perspectives. So I'm gonna be talking a little bit about Alexa and, more largely I guess, surveillance. And the dog that we have in this fight is that at Botanic Technologies we build chatbots (excuse me, let me adjust this). We build chatbots, conversational assistants, like voice assistants, like Alexa, and also real-time 3D avatars that you can talk with, that can see you, that can read your expressions and your emotions, measure them. And if we could turn... sorry about this, guys, I hate when I'm breathing like this there. And what we discovered, in particular with health and wellness use cases, is that the amount of data that we're collecting from the end user is vast.
And the surveillance of these systems in the case of, for example, health and wellness is certainly expected and beneficial, because you want to be watched. But the prerogatives of a helicopter mother or of a tyrant state may not be what we want to have in all of our homes all of the time. At Botanic, as we develop these systems, we've discovered more and more that people do not recognize how surveillant these systems are. And even when we go and we look at a EULA or a terms of service document, we don't get all the information.
So it was last November, and I was in Barcelona at this AI conference, and we had the pleasure of running into these guys from Intel. And they had developed the Sue Creek processor. The Sue Creek processor is a processor that's managing all eight microphones of the Alexa device. And so we went up and we asked them, "Guys, what really is going on with this thing?" 'Cause we know that the wake word is working. I'll come back to that in a second. But really, how much of the data that the system's picking up is being processed and sent to Amazon? We'll never forget this one engineer, a really round guy with really sharp eyes, leaned forward, and he said, "All of it. All of it." He was delighted. And it turns out that actually, according to him, there's a delta of control that Amazon has, which is to say that the system can determine whether it's sending one second or 60 seconds, and that delta can be changed, but it's always sending data up. Well, this of course made my head explode, because I'm thinking, that's not what they've been telling us. And even on the privacy notice and the Echo privacy page, on page two, the question "Is Alexa recording all my conversations?" is answered: "No. By default, Echo devices are designed to detect only your chosen wake word."
Well, Amazon. It's very hard right now to get away from the system, very similar to Google and Facebook. It's almost like a thing that, whether it's a meme or a gene, feels like I'm always being somehow touched by it: a 175 billion marketplace, 5% of all online purchases, 49% of the American e-commerce industry, and 6.3 million Echo devices. So let's start with the wake word. Similar to a key that you would use to unlock a door, a wake word needs to be distinct from other commonly spoken words. It's best if it's two or three consonants, it needs to have a glottal stop, and it needs to end in a vowel. These are common characteristics. So we can see in things like Alexa, Siri, Cortana, Bixby, these aren't words that you would normally, or sounds you would normally, hear. So the idea is that with a wake word, the system begins recording.
When it begins recording, there are five API services that are used. Now forgive me while we burrow down into this rabbit hole, but it's an important one, because what happens when I say "What's the weather in Munich?" That sound [unclear] gets compared to a library. The sound is then broken down into particular words: "what's the weather." The second is a natural language understanding, or an intent recognition, of the text. We then have kind of regressed into what's a chatbot. That text is then sent up to the weather service. The weather service returns and says "partly cloudy." And then the Alexa responds with a text-to-speech API call that says it's probably slim. And I can understand it.
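The stages named in the weather example can be sketched as a chain of function calls. This is a minimal illustration only: every function below is a hypothetical local stand-in, not an Amazon or Alexa API, and the "audio" is simply UTF-8 text so the example stays self-contained. In a real deployment each stage would typically be a separate network service.

```python
from dataclasses import dataclass, field


@dataclass
class Intent:
    """Result of intent recognition: an intent name plus extracted slots."""
    name: str
    slots: dict = field(default_factory=dict)


def speech_to_text(audio: bytes) -> str:
    # Hypothetical stand-in: a real recognizer compares the sound against
    # acoustic and language models. Here the "audio" is already text.
    return audio.decode("utf-8")


def recognize_intent(text: str) -> Intent:
    # Hypothetical stand-in for natural language understanding:
    # a single hard-coded rule for weather questions.
    words = text.lower().rstrip("?").split()
    if "weather" in words and "in" in words:
        city = " ".join(words[words.index("in") + 1:])
        return Intent("GetWeather", {"city": city})
    return Intent("Unknown")


def call_weather_service(city: str) -> str:
    # Hypothetical stand-in for a third-party weather web service.
    canned = {"munich": "partly cloudy"}
    return canned.get(city, "unknown")


def text_to_speech(text: str) -> bytes:
    # Hypothetical stand-in: a real service would return synthesized audio.
    return text.encode("utf-8")


def handle_utterance(audio: bytes) -> bytes:
    """Run one utterance through every stage and return the spoken reply."""
    text = speech_to_text(audio)
    intent = recognize_intent(text)
    if intent.name == "GetWeather":
        city = intent.slots["city"]
        reply = f"The weather in {city.title()} is {call_weather_service(city)}."
    else:
        reply = "Sorry, I did not understand that."
    return text_to_speech(reply)


print(handle_utterance(b"What's the weather in Munich?").decode("utf-8"))
```

The point of the sketch is that a single spoken question crosses several service boundaries, and the user's words are handed on at each one.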
It takes, you know, 200, 300 milliseconds, depending on the network speed you have. So what really is Alexa doing with our data? At this point, Alexa's been sending chats from one person to another person. It's been recording American murder mysteries in the bathroom. It's been recording full conversations between couples as they talk about their wood floors and finances. And there's a really great one, and I went to some effort here. On the left, c't and Heise did this great article on how there was a guy here in Germany. He didn't even have an Alexa or a Dot. And by coincidence, he discovered that there were seven, well, it's 17,000, excuse me, WAV files that had been sent to another person that did own an Echo. So in that same way that he was being touched, in a way, there was something about the system that, even though he didn't own it, he still had to participate in it. It was this random, like, rampant collection of information. Well, he contacted Amazon and said, "Wait a minute, what are you doing with my data? Where did you get this in the first place?" They didn't answer back. c't did the article on it. They posted it on Reuters. Three days later, then, Amazon responded to this GDPR complaint. And as an apology, they sent him an Alexa.
Yeah, but there's something about the wake word which is curious, because if you say it, then that means it's gonna be listening at some point. In order for it to just recognize the wake word, it has to be processing. We'll come back to this local problem in a minute. So this is in London. You can now get these Amazons for free if you sign up for a health program. You're exchanging your data for the convenience of using this multimillion-dollar AI system. Now, if somebody comes up and they offer you, let's say it's drugs, and they're gonna give you the drugs for free, you can just have 'em, just use 'em, I think you should ask yourself whether that's an appropriate deal to enter into. Well, sure, there might be benefits, health benefits. There might be side effects. But ultimately, what is the motivation of that individual in offering you those drugs?
I think that we need to look at the same problem through a similar lens with a multimillion-dollar AI system. There's a lot of data that Alexa collects. If we think of privacy and publicity a bit like an onion skin, we can understand why Alexa's recording device is right in the middle of our home. Outside, it's public, on the front of my street. Let's say I live in, like, a nice little suburban neighborhood here, just down the street. And in front there's a road, and people drive through, and cars are there, and it's public. And I expect maybe some kids will come into the yard, and that's not a big deal. I'll throw a shoe at 'em or something. And if there's somebody on the porch, I expect to interact directly with them. And if they come into the foyer or into my living room, well, then I wanna know who they are.
And if I find them in my bed, then there's something very weird going on, because at that point they've gotten into the very center of my home. What is at the middle is what's most private. And so by putting microphones that listen to us in the middle of our homes, Amazon's doing a fantastic job of collecting the very most valuable and most private data: log files, words, place, and time. The face is now being included, 'cause the new versions of Alexa come with the camera. Affect detection: if I'm like, "Yay!", they can tell I feel good. If I'm like... that also tells how I feel about products, services, and goods, because ultimately that's what Alexa's providing. Their job isn't just to provide those services and goods, but to know what I want and to be able to anticipate even what I'm gonna want later, hence the behavior with the app and the way that we speak.
So what do these guys in Barcelona say? It was like, wait a minute, this can't be true. Now, I confess this is the first time I've given this talk, tonight, and I would welcome any of you to please prove us wrong. We went to Intel's GitHub. We looked through the repository, we looked all the way down the code. We identified this particular enable. If the enable is turned on, then that means that if you say "Alexa," the wake word is functioning. But if by default it is left off, which is the way that it's shipped, it's collecting everything. So here we are looking at the lines of code that determine this on the Sue Creek processor GitHub repository. It's listening all the time. Now, why don't they just process it locally? Because there are systems like Snips air chips that allow you to basically do limited amounts of processing locally, and then you can wake, and then you can send the data. So they had an option, and it looked as though, from the different forks that had been provided in the GitHub account, at one point it was provided, but then later they just took the enable off.
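The difference between the two settings described above can be illustrated with a small sketch. This is not the code from the repository mentioned in the talk: the function name, the `wake_word_enabled` flag, and the `window` parameter are all hypothetical, and short text strings stand in for audio frames. It only models the claim as stated, that with the enable on nothing leaves the device until the wake word is heard, and with it off every frame goes upstream.

```python
from typing import Iterable, List

WAKE_WORD = "alexa"  # assumed wake word for the illustration


def frames_sent_upstream(
    frames: Iterable[str],
    wake_word_enabled: bool,
    window: int = 3,
) -> List[str]:
    """Return the frames that would leave the device.

    frames: text stand-ins for successive audio frames.
    wake_word_enabled: hypothetical flag modelling the "enable" in the talk.
    window: hypothetical number of frames forwarded after each wake word.
    """
    sent: List[str] = []
    remaining = 0
    for frame in frames:
        if not wake_word_enabled:
            # Gating is off: every frame is forwarded.
            sent.append(frame)
            continue
        if WAKE_WORD in frame.lower():
            # Wake word detected locally: open a short forwarding window.
            remaining = window
            continue
        if remaining > 0:
            sent.append(frame)
            remaining -= 1
        # Otherwise the frame is dropped on the device.
    return sent


conversation = [
    "we talk about finances",
    "Alexa",
    "what's the weather",
    "in Munich",
    "private chat",
]
print(frames_sent_upstream(conversation, wake_word_enabled=True, window=2))
print(frames_sent_upstream(conversation, wake_word_enabled=False))
```

With gating on, only the two frames following the wake word are forwarded; with gating off, the whole conversation is. Local processing of this kind is what the on-device alternatives mentioned in the talk aim for.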
So the problem is that all of the data that we say is valuable to Alexa, and Amazon in particular. There are many business models, and it seems to me as though they're willing to take legal risk and have their hands slapped and say, "We're gonna completely go against GDPR policies. We don't care if we're non-compliant, because the value of the data that we're collecting merits a 47 million bill or whatever they're gonna get." They also, I think, have a problem, because profits versus privacy is now one of the key challenges that these executives need to make. And in making these decisions: do we leave this enable off by default? Well, if we're processing it all locally and we only give the data that is directly relevant to asking the weather or playing a song (by the way, 85% of Alexa use is just to play music), that goes against the web services model. The ecosystem of Alexa skill developers, what they do is they basically set it up so that API services are running back and forth. Those web services are great for the Amazon Web Services business model. So the more they restrict the processing calls that are made, the more they're reducing their ability to actually earn revenues in the long-term economy they're trying to build. So these executives have to say, "All right, are we going to respect people's privacy? Or are we going to try to make money?"
Now, Google, by the way, is doing this same damn thing. And they were slapped with a 57 million fine a couple of months ago, because the Nest thermostat that Google had been shipping just kind of happened to have a microphone built into it. Like, they said, "Oh, sorry, this is a mistake. Some engineers made a mistake, sorry." But you can't manufacture a thermostat with a microphone on it and ship it if just, like, a couple of random engineers are making a mistake. So this term "surveillance capitalism" is being pushed around a lot, and I'm glad to see that. As somebody that professionally grew up in San Francisco, I'm very much opposed to a lot of the rampant collection of data that's going on and the invasion of our privacy.
So Alexa seems to me to be an evolved version of Facebook in this Silicon Valley of American capitalism. We're seeing a system now which is not only just collecting all of our personal data, but as developers join the skills groups and deploy their skills for the Amazon ecosystem, Amazon has an ability to see all of the data and all of the machine learning models that they're using. And as soon as one of those things becomes really aggressive, or grows, or offers any kind of a threat to Alexa, they can either squash it or acquire it. And so it seems as though, not only for those of us at home that might have an Alexa, we're trading convenience for privacy, but it also seems as though now social media is gonna be eclipsed, because now it's not just gonna be individuals following other individuals and then systems like DeepFace running, but we're looking at a much deeper level of artificial intelligence that'll be assessing our every word.
So the problem with this though, and the overlord joke, has to do with the fact that if Alexa's successful, and if it's able to not only allow us to order pizzas and taxis, but also interface with our news and with our appliances and with other members of our family by scheduling things, then that means that there's basically one pipeline of all communication. And that single pipeline of communication, which becomes so precious, isn't only the means of determining what you want and providing it to you. It's also the means of selling and buying everything. But there's another side to this too, which I find slightly stranger. With about 20 hours of your voice data, we can use a convolutional neural net to make a bot that uses a text-to-speech version, a library, that sounds just like your voice. So certainly Amazon now has 20 hours of most people's voices. I don't think that Amazon will go and make a version of me to call my wife and try to have our credit card sent. They're gonna already have the credit card information. But that data does still reside on a server, and Alexa skills developers can still build these systems that are able to collect that vocal data. Somebody else can use those.
So we provide the tools to deploy conversational AI characters, and these chatbots and voice assistants and conversational avatars are highly surveillant. We are trying our best to make sure that these systems have license plates. Bots need to be authenticated. And we need to know, based on open source available code, whether they're respecting our privacy or not. We also need to make sure that these systems are GDPR compliant, AML compliant, and all of the different range of responsible characteristics that we now see so many different government agencies carrying forward. We also, as developers, need to make sure that we are adhering too. So that is why we have a dog in this fight. We're trying to take a sharp line and provide the best conversational systems we can. But as we journey down that road, we're also noticing that there are other, larger entities that are walking around that may not be protecting our best interests. I'm down to one minute. I think that will provide me either with a question or not, Martin.
Yes. Thank you. And as I've said at the beginning, I don't have an Alexa, so I don't care, so to speak. No, I care, but at least it doesn't affect me, hopefully, unless I'm the one who tried this anyway, sent to someone else. Is there also a sleep word in Alexa, saying, "Alexa, stop listening"?
Alexa off.
Okay. But
You can say "Alexa, bark," and you can say "Alexa, snore" and "Alexa, meow" as well. Yeah. "Alexa, off," though, of course, is supposed to work the next time you say "Alexa," which doesn't really mean it's off.
Okay. So thank you for these deep insights. A little bit scary towards the end of the day. Yep. So thank you very much. And again, applause to thank you, Mark Stephen.
