Hacker News
Amazon can't yet completely delete Alexa voice transcriptions (www.zdnet.com)
99 points by FrancesFinTech 8 days ago | 96 comments

It's a company with the time, resources, and customer awareness to do this. They can. They have chosen not to.

Why are people acting surprised about this? To me it has always been more than obvious that whatever you say to any of the internet-connected “assistants” will be stored and kept for as long as the company that owns the assistant likes.

There's a difference between "it's obvious that they're doing it" (which is only true for people above a certain paranoia line) and "here, look at this proof that they're doing it." There are plenty of things that are obvious to me based on my idea of people's motivations and past behavior, but if I talk to someone lower down on the paranoia spectrum they don't believe me - unless I can point to evidence that they're actually doing it. (A fair thing to ask for, after all.)

This reminds me of the announcement that Google will let you delete your history.

Yeah, right, when pigs fly. Your history powers their "AI". Ain't no unlearning that.

It seems like you're talking about the difference between suspicion and proof? Saying "it's obvious they're doing it" when you mean you suspect they're doing it is just going to start an argument.

I was using it in the sense of the parent comment, i.e. if you put $100 in a room unsupervised with a career thief it's "obvious" that he'll walk out with it. Another way of phrasing it would be, "anyone who isn't suspicious appears negligent."

It's not obvious from interacting with the device that your voice data leaves, or that there's any permanent record made of your commands. That's something you have to know ahead of time.

You don't think it's obvious given that the device is completely non-functional without internet access?

I think this is obvious; however, I also understand that most people aren't that clever/interested in these things and would not realize it. This creates a dilemma of how to message this information in a way that is at once concise (so as not to waste people's time), informative, and not alarmist, since alarmism freaks people out and is counterproductive not only to having a viable product but also to the general adoption of technology.

What's obvious to tech workers is not obvious to the general population. Most people don't give a second thought to putting all of their communications with friends and family on Facebook's unencrypted messenger app (apparently the new version uses E2E encryption, but those millions of person-years of chat logs aren't going anywhere).

Exactly what I thought. I imagined it was clear that anything this thing hears can be kept. I bet the terms of use explicitly say that they can, in readable English.

It's not surprising at all. It is an absolute disgrace though.

Just yet another reminder that GDPR is the bare minimum for something like the internet to be tolerable.

So like probably the majority of useful text boxes everyone types into on the internet. Great news, thanks for the info

This is why we are building 100% offline and private-by-design Voice AI at https://snips.ai. It is free for makers and works in English, French, German, Japanese, Spanish, and Italian, with more coming!

I don't see any guarantee that if the platform gets popular and the company acquired, privacy would stay the same. I have seen the same pattern too many times...

Thank you for that. Voice recognition makes for awesome devices, but we don't need every gadget connected to the net sending our words elsewhere. Keep up the good work!

Somewhat related: is there a lib or common api for these IFTTT type actions at this point or is everyone building them from scratch?

This is exactly what I was looking for and was my only qualm about writing a voice assistant! Will check this out

Which voice engine do you use?

Thanks to the sacrifice of these clueless users (or at least a good part of them are) the era of offline assistants is near looking at what Google has shown recently.

> Thanks to the sacrifice of these clueless users (or at least a good part of them are) the era of offline assistants is near looking at what Google has shown recently.

I think I'm being clueless, but I can't figure out what this sentence means. Is there a typo in it?

I think gvand means that these users are training the AI models to eventually provide a similar level of service offline. The "cluelessness" is just a value judgment of the users.

I was referring, in a convoluted way, to the fact that all the data collected has been/will be used to train the models that will allow offline voice recognition (like the one Google showed at I/O last week).

I might be mistaken, but the reason we don't see offline recognition (amongst other things) is hardware limitations, not the lack of training data. The small onboard chip doesn't have that much compute power, so they offload to more powerful Amazon/Google servers that can run the inference.

> amongst other things

I think that this is an important point. Obviously there's more computing power available in Apple/Google/whoever's data centres than on my device, and I'm sure that is, or at least was, a concern; but I also don't believe that they are indifferent to the utility of sitting on such a huge volume of user-submitted, real-world data.

Which internet company isn't perma-storing any and all 'valuable' bits that happen to be routed through their network?

In theory DuckDuckGo. Just because someone can, does not make it good/right.

Is there a law regarding whether the presence of these bugs should be disclosed when you are a guest on business premises or in someone's home?

I can’t believe that people actually pay for these things... Alexa has always seemed like a gimmick to me. I have an Ecobee smart thermostat that has integrated Alexa, which I promptly disabled after installation.

What's wrong with gimmicks? Voice control itself is not a bad idea, the interface to control your technology is something you are born with :)

Gimmicks are short-lived for fundamental reasons. I'm going to sound like a broken record but, as someone who has owned multiple Alexa devices and quit their job to start a voice-UI business (didn't pan out), the current generation of "smart speakers" are a joke when it comes to doing anything serious besides asking for the weather or toggling light switches. I have no doubt that they are useful to people in very limited ways, but they are still sold as being a lot smarter than they really are. For what they provide, the privacy trade-off is too great for me.

Alexa is a gimmick because it's a speech-to-text command line, and it's sold as being smart even though it's not. Since before I was a kid in the 90s, there have been many attempts to revolutionize computing with speech-to-text technology. Because speaking comes so naturally to us, it's easy to assume that voice-activated anything is better than pushing buttons. In reality, without intelligence and autonomy, lots of interfaces are made slightly worse with voice activation. For those who aren't visually impaired, the ability to use voice to turn off lights barely even makes sense. Alexa frequently gets things wrong and activates from sounds that aren't even close to the wake-word. The ability to create lists is barely practical because it so often can't understand a word, in which case the user has to go to the Alexa app and manually punch in the item.

Voice control would be great if it were revolutionized, but it's hardly in a different state than it was decades ago. The only two things that have changed are improved speech synthesis and ample cloud computing. Because of this, most people I know who own one barely use them beyond a select number of features that are hard to get wrong like "Alexa, what's the weather?" or "Alexa, what time is it?". My parents still sometimes use it to play music (which I gave up on as a music fan), but it gets requests hilariously wrong one time in five.

"Besides [a bunch of useful stuff], Alexa is useless! Therefore it's a gimmick."

I have an Echo Dot which I use almost exclusively for a few static purposes: the weather, setting alarms, turning my lights on and off, and asking what time it is.

I also ask it basic questions like whether various sports teams won or what time they're playing, which it also answers well.

Not sure why you think it is useless just because it's not a magical general AI that can do everything.

I find it extremely useful.

> For those who aren't visually impaired, the ability to use voice to turn off lights barely even makes sense

I'm not visually impaired and I use this feature all the time. It allows me to turn off the light when reading in bed without having to get up and walk to the light switch.

> most people I know who own one barely use them beyond a select number of features that are hard to get wrong

Yeah, exactly. How does this make it useless?

The point is that these few useful things can be all achieved much cheaper with much greater ergonomics.

> the weather, setting alarms, (...) and asking what time it is.

You can do that on your phone. Even assuming you wanted it hands-free, it doesn't justify an always-on microphone sending data to the cloud. We had the tech to do this level of voice recognition reliably in the 2000s.

> It allows me to turn off the light when reading in bed without having to get up and walk to the light switch.

Kids from my generation used to solder clap detectors for like $5-$10, and they're already more reliable and faster to use.

Voice is cool, it's like being in Star Trek. I get it, I built my own system to control music in the 2000s, complete with audio responses snipped from Star Trek shows. But the feeling of "living in the future" wears off pretty quick, and you're left with a ridiculously expensive and user-hostile gimmick.

> an always-on microphone sending data to the cloud.

This is a dishonest characterization of how every smart speaker in existence works. There is no continuous stream as this statement implies.

There is a continuous buffer of a couple of seconds for the device to locally catch a wake word. (You can verify this by disabling the device's internet connectivity - it will still catch the wake word and speak an apologetic message about not being able to connect). Also, changing the wake word requires a full restart, which says "firmware" to me.

After the wake word is spoken, that buffer and anything immediately after it is what gets sent up to the cloud for voice recognition.

The buffer serves a purpose in that it prevents an awkward pause between the wake word and the action requested. (So you get to do "Alexa, turn on the lights" rather than "Alexa? bong Turn on the lights.")
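
For anyone who wants the buffering described above in concrete terms, here is a rough Python sketch. It is a toy model only, not how any real smart speaker is implemented: the function name `frames_sent_to_cloud`, the `buffer_size` and `capture_size` parameters, and the idea of treating audio as a list of word-sized "frames" are all assumptions made for illustration.

```python
from collections import deque
from typing import Iterable, List


def frames_sent_to_cloud(
    frames: Iterable[str],
    wake_word: str = "alexa",
    buffer_size: int = 3,   # hypothetical: length of the local rolling buffer
    capture_size: int = 4,  # hypothetical: frames captured after the wake word
) -> List[str]:
    """Return only the frames that this toy model would upload."""
    ring = deque(maxlen=buffer_size)  # old frames silently fall off the front
    uploaded: List[str] = []
    remaining = 0  # frames still to capture after a wake word was heard

    for frame in frames:
        if remaining > 0:
            # We are inside the post-wake-word window: upload directly.
            uploaded.append(frame)
            remaining -= 1
            continue
        ring.append(frame)
        if frame == wake_word:
            # Wake word detected locally: send the buffer, then the follow-up.
            uploaded.extend(ring)
            ring.clear()
            remaining = capture_size
    return uploaded


heard = ["tv", "noise", "chatter", "more", "alexa",
         "turn", "on", "the", "lights", "private", "talk"]
print(frames_sent_to_cloud(heard))
```

In this model, anything that scrolls out of the ring buffer before a wake word is heard is never uploaded, and neither is anything said after the capture window closes.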

The "user hostility" and "gimmick"-ness of this design is entirely subjective and quite overblown, in that "nobody will ever use Dropbox when they could just use rsync and a cronjob"-type bias that HN tends to have.

I'd say it beats the alternative from a pure functionality standpoint.

Your use case almost exactly mirrors my own. Even if weather, alarms, time & lights were all it ever did, it has 100% been worth the purchase price.

It is genuinely useful to have a no-hands-required timer in the kitchen, and being able to turn off the bedroom lights when I'm done reading for the night without having to reach for a switch is great.

I was even pleasantly surprised by how much I enjoyed Alexa's Skyrim. Sure, it's really more of a joke as it is, but it made me think that some choose-your-own-adventure skills would be a lot of fun.

(Asking Alexa to play white noise to help me sleep has been nice, too.)

The issue here is that a not too powerful local device could do that without recording everything you say in the cloud and keeping it forever with a label with your name on it.

Call me crazy, but paying 30 dollars to lose all of my privacy just so I don't have to get up to turn off the light at night seems like a poor trade. Maybe not useless, but I honestly think you can do better.

And a basic Raspberry Pi could be programmed to cycle through the weather, alarms, lights, and time. I guess the crux of doing it that way, however, is taking personal responsibility for security, which, even if you're slightly lax at it, still seems better than sharing your "house microphone" with a multi-billion dollar company with motivation to exploit it.

I already own other microphones, for example my smartphone, laptop, various earbud/microphone combos, etc.

I already trust all those manufacturers not to secretly upload everything I say to the cloud. Why is Amazon any different?

In fairness to your bedtime reading example, it’s very common for people to have bedside lights - even if they’re just little lamps that rest on a table. At least this is the norm in the UK. So for a great many of us, turning our reading light out would just be an extension of our arm.

My wife tends to prefer using the TV as her reading light. I find that rather bizarre but she likes the background noise. In any case, TVs have remote controls and sleep timers so that mitigates the need for voice control.

To be honest, I couldn’t think of a place I’d less want an Echo than the bedroom. Even the bathroom seems less inappropriate (eg you might want music when in the shower / bath).

True, but not everyone does, and for them, the Echo provides extra functionality. I've also added a lot of dimmable lights in my home, and voice control to set a light or set of lights at 50% is much much less impactful to whatever I'm doing than pulling out a phone.

Why am I getting downvoted? Do people disagree that it’s common for bedrooms to have lamps? A little context would be appreciated because my comment is 100% accurate in terms of my own experiences, yet several people seem to disagree. Genuinely interested to know why.

>the current generation of "smart speakers" are a joke when it comes to doing anything serious besides asking for the weather or toggling light switches.

The primary use of my Alexa devices is being a voice-controlled IP radio. I paid about $300 many years ago for Logitech/Slim Devices' crack at this, the Squeezebox, and I loved it.

Now, I get that same functionality for $40 shipped with more on top. Everything else is a bonus, including the smart home stuff - being able to turn the lights on with grocery bags in hand is damn futuristic.

Does it screw up sometimes? Sure. But it beats the hell out of a keyboard or dials for the use case.

I've got a setup of Google Home devices, and that's pretty much all I use 'em for - the occasional stupid question and the rest of the time it's playing music that I can control with my voice, which I find incredibly convenient.

I completely agree with every single word you’ve posted there. From the history recap to the modern era problems.

These days the only thing my Echo does which is remotely useful is setting named timers while cooking. So I can have my hands dirty with raw meat and ask Alexa to set various timers for each step of the meal. I found that particularly good when cooking meals that have large gaps in time between stages (like Sunday roasts when there can be 5 minutes or more between the cooking times of different vegetables). However, even there it sometimes becomes more trouble than it's worth when it starts mishearing names of vegetables or duration numbers when spoken.

The most disappointing thing is that I spent a few days working with Alexa’s - frankly terrible - SDK to integrate it into my existing home automation (all stuff I’ve built myself and powered by a FreeBSD server). Not only was the development process of Alexa skills amongst the most frustrating I’ve had in my ~30 years of experience writing software; but it turned out to be a complete waste of my time because Alexa is so piss poor at any interactions more complex than the very basic (as you described). It’s also very laggy at such interactions, so even when it does work it feels slow. So slow, in fact, that it ends up taking longer and being more painful using the voice control than it would have been to wake my laptop from sleep and trigger the same HTTP API endpoints Alexa would have used, but doing so manually from the command line using curl. So needless to say I very quickly gave up using Alexa for home automation.

I wasn't born knowing how to speak. I had to learn. Babies can use their fingers to point long before they can use their mouths to speak.

Also, based on every other case where I've tried voice recognition, I have to learn to speak in an extra-distinct and artificial way for a computer. (I don't think I have an unusual accent, but I can say "TRACK A PACKAGE" into the receiver all day long and not get where I want to go, even though no human would have trouble with it.) What's the point in that?

Did you write that comment with speech-to-text? Why not? If it's such a good interface, why use a keyboard to write text?

No, it's something you acquire--no one is born speaking.

The interface is sound and you can generate sound with your organs if you want to go full pedantic.

In that case every computer interface is controlled by things you’re born with.

People are also born with fingers.

It's similar to mobility scooters. Made for handicapped, but also popular among the lazy.

ha, great analogy but without the snooping long tail result of usage...

Me too, but now they are even giving them away. Google has offered me a free Google Home several times now. Thank you, I don't need your spy device in my home.

So, for me, it is a glorified cooking timer and alarm clock. If I could buy a closed unit that just did that, then I would be happy.

You mean.... your phone, right?

The timer that is part of iOS is not as good as Alexa's. With Alexa you can easily set concurrent timers and label them. Siri doesn't know how to do that.
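
To make "concurrent timers with labels" concrete, here is a small Python sketch of the idea. It says nothing about how Alexa or Siri actually implement timers; the `NamedTimers` class, its method names, and the explicit `now` argument (used instead of a real clock so the example is deterministic) are all hypothetical.

```python
import heapq
from typing import List, Tuple


class NamedTimers:
    """Toy model of several labeled timers running at once."""

    def __init__(self) -> None:
        # Min-heap of (deadline, label); the earliest deadline is always first.
        self._heap: List[Tuple[float, str]] = []

    def start(self, label: str, seconds: float, now: float) -> None:
        """Start a timer called `label` that expires `seconds` after `now`."""
        heapq.heappush(self._heap, (now + seconds, label))

    def pop_due(self, now: float) -> List[str]:
        """Remove and return the labels of all timers that expired by `now`."""
        due: List[str] = []
        while self._heap and self._heap[0][0] <= now:
            due.append(heapq.heappop(self._heap)[1])
        return due


timers = NamedTimers()
timers.start("potatoes", 300, now=0)
timers.start("carrots", 120, now=0)
print(timers.pop_due(now=120))  # only the carrots timer has expired
print(timers.pop_due(now=400))  # the potatoes timer is due by now
```

The label is what makes the voice interface workable here: when something goes off, the device can say which timer it was.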

A cellphone is pretty much the absolute nadir of "closed unit" devices (in the sense the prior poster was describing) that could be possibly imagined.

How else would you use the word "nadir"?

"Lowest point" basically.

E.g. "Political threads are the nadir of discussion on HN"

I don't want to touch the phone with raw meat juice; voice commands plus learning to cook has the potential for amazingness.

I'm going to presume that you have Siri enabled... no touching required, right? Just knuckle the thing. Curl the pinky finger and tap w/ the knuckle. 0_o

I don't have Siri, and I generally don't have my phone near me all the time. Same for the wife; it's super awesome to just be able to talk to Alexa. I share the privacy concerns, so I am hopeful that https://mycroft.ai will be a suitable replacement, as I have ordered and plan to build a better rig to handle DeepSpeech.

I use my voice controls for 3 big reasons daily: 1) Timers (laundry, cooking, etc) 2) Music 3) Smart home stuff

On the 3rd one, voice actually made the smart home easier. To use a smart light bulb you had to unlock your phone, open an app, login, and do your thing. Now I tell my Alexa/Google to do it and it's super easy.

Also, the Chromecast integration on Google Home is killer: "OK Google, play Pandora on TV" or "Play xyz on YouTube on TV".

I only use my voice controls once a day. When I come home, Siri automatically turns on my lights. When I go to bed, I shout "Hey, Siri, turn off the lights."

(10% of the time she responds with, "OK, your six a-m alarm is off." I don't know why.)

Honestly it took me a while to get to this state (aside from the fact I work with this stuff as my job). It was almost like the transition of getting a smart watch. Once you force yourself to do certain things with it, you get into a state where you realize how much easier it is to use it.

I can't believe people actually pay for mobile phones. They seem like a gimmick to me. My rotary phone works just as well.

I feel like your analogy is predicated on missing some pretty important facts. Sales and usage numbers, maybe. Obviously the fad for communicating between humans will gradually fade, as the time-proven wonders of shouting at a rock on your countertop demonstrate infinite utility.

This is, while sad and even maddening, obvious.

It's obvious to anyone who spends a moment thinking about it that some portion of what you say remains.

What's less obvious is that they store everything and most definitely index it so it can be used later against you (all it takes is one legal action - separation, police, you name it).

What's further disappointing is that Amazon stores the transcribed text. Which may be incorrect but deemed "truth".

I told Alexa to go away. It did not. It just persists in nagging me to say things other than "Alexa, go away".

Well color me surprised.

In the EU this is a violation of GDPR if true.

Not sure why you're being downvoted. Is it not a GDPR violation?

Agreed, I was about to say the same thing. If it's their data (i.e. data the customer generated and stored on the service), and there's a way to validate that it's theirs (as it exists in the data store) based on information the customer provided, I'm pretty sure the GDPR requires deletion on request.

Thanks! We'll change to that from https://www.zdnet.com/video/amazon-cant-yet-completely-delet..., which is a video.

Needs [video]

Thanks! Updated.

Cache issue? I still don't see [video] in the title.

The url was changed to a text article.

Interested to know how much Amazon is spending to store hundreds of recordings of me saying "HDMI1".

According to this video they're only keeping text transcripts, so... maybe a penny over your entire lifetime.

I hope a bug doesn’t cause your assistant to accidentally record everything while it’s listening for your prompt.
