Boffins convert typing sounds into text with 95% accuracy
Researchers in the UK claim to have translated the sound of laptop keystrokes into their corresponding letters with 95 percent accuracy in some cases.
That 95 percent figure was achieved with nothing but a nearby iPhone. Remote methods are just as dangerous: over Zoom, the accuracy of recorded keystrokes only dropped to 93 percent, while Skype calls were still 91.7 percent accurate.
In other words, this is a side channel attack with considerable accuracy, minimal technical requirements, and a ubiquitous data exfiltration point: Microphones, which are everywhere from our laptops, to our wrists, to the very rooms we work in.
https://www.theregister.com/2023/08/07/audio_keystroke_security/
New policy from the corporate office: If you are working in a public place, like a coffee shop, please scream while typing your login password.
I screamed my password and now I got hacked. Thanks for nothing!
use the onscreen keyboard
much more secure
why won't my bank stop calling me
On it boss https://youtu.be/HsvyjePPFRs?si=So4iKVWAUPXNjGVe
Here is an alternative Piped link(s):
https://piped.video/HsvyjePPFRs?si=So4iKVWAUPXNjGVe
Piped is a privacy-respecting open-source alternative frontend to YouTube.
I'm open-source; check me out at GitHub.
Use a speech to text and they won't be able to hear your keyboard strokes. I know, I'm a genius.
Who are you, who are so wise in the ways of science?
A duck.
Got any grapes?
I mean, I got lemonade…
But instead they would hear the speech and translate that to text. No need to thank me.
Quite scary considering the accuracy and how many open mics everyone is surrounded by without even realizing it. Not to mention if any content creator types their password while live streaming or recording they could get their accounts stolen.
One more reason to switch to a password manager, even though they could still find out the master password…
Probably still have some safety if you're using two-factor, or have a master key in addition to a password (e.g. 1Password).
Or use a local password safe like keepass.
Or host it yourself like the smart one you are
Only if you have to type it in to unlock your vault. Now, bear with me.
Bitwarden (maybe others) lets you set a PIN to unlock your vault. Normally, you would think this is a less secure setup, easier to crack with the method outlined in this article. Except with Bitwarden you have to set up the pin in every browser extension and every app install.
Meaning, unless they have access to your device, the PIN to unlock one instance of Bitwarden could be different from the PIN for another. They also don't have to be strictly 4-digit PINs, either. I highly recommend password managers, but for my money, Bitwarden has all my love.
Disclaimer: I am in no way affiliated with Bitwarden. But I could be if they paid me!
but then I have to remember the PIN for each one of my devices. there should be some kind of app for storing those.
Do what users at most businesses do, write it on a sticky note and put it on the underside of your keyboard!
Stick around for more tech tips with a real life sysadmin!
Password manager and the LOUDEST MECHANICAL KEYBOARD POSSIBLE you have NO idea what keys I’m pressing with my blues, bitches
That's the whole point though. The louder your keypresses the better.
I don’t think you read the article
A loud ass mech keyboard would fuck this study up
This has been a known attack vector for years, and I wonder how no livestreamer has been (publicly) attacked in this way.
I guess in large part this can be attributed to 2FA, passwords just aren't worth much by themselves anymore (well I guess if someone is quick enough they can snipe the OTP as well, but streamers are rarely entering their 2FA while streaming since they're on a trusted device).
In fact the biggest attack vector I'd worry about is the infamous SMS 2FA, which is actually 1FA for password resets, which is actually 0FA "yes dear phone operator I am indeed Mister Beast please move my phone number to this new SIM".
Obligatory XKCD
https://en.wikipedia.org/wiki/Rubber-hose_cryptanalysis
Neat, so when my friends are talking about satisfyingly clackety keyboards I can inform them it’s a security hazard.
I'll accept the risk. I need the clicky
Good luck, I have a non standard key layout
It's still vulnerable to dictionary attacks
Except it's not
??? If you can map sound to qwerty keystroke placement, then it's a simple matter of mono alphabetic substitution for other layouts to generate candidate texts. Using a dictionary attack to find more candidate layouts would absolutely work.
No, all the timings change. You can't just swap out the letters and hope it matches. Additionally I was responding to the poster claiming a dictionary attack on a password would work - only if it's in the dictionary.
The method is not based on timings. It is based on identifying the unique sound profile of each keystroke
How can you make that claim? They used deep learning, does anyone know what characteristics the AI is using?
Dvorak?
Good luck making an acoustic map of the tens of thousands of possible case, switch and key cap combinations.
Middle management will finally get rid of clacky keyboards with this weird trick
This is why I always make sure there are no boffins around before I start typing.
If there are boffins around, I start typing out the GDPR guidelines in full
What about Hornblower's, Bolger's, Took's, Sackville's or Grubb's?
Not to be a jerk, but is this actually new? I've heard of this being done at least ten years ago...
On another note, one way to beat this (to a degree) would be to use an alternate keyboard like Dvorak (though you could probably code it to be able to detect that based on what's being typed)
I think it's largely been a state actor thing. Directional microphone to record your window from across the street, spend significant tax money on crunching numbers on a supercomputer to get at your password kind of thing, I think they already could do it in the 90s. Real-time 95% accuracy on a non-specialised device is a quite different ballpark: Now every skiddie can do it.
And this is the real, serious problem. Most people are pretty unlikely to stop a state sponsored spy operation no matter how careful they are. It's barely worth worrying about unless you know for a fact you're being tapped and that you will be killed about it, and even if you do know this the state can pull some space age bullshit out of their asses that doesn't yet have a counter. Top secret military industrial research goes into maintaining that exact advantage every year, if they really want to get you, you will get got. But if Joey Dickbeater and his school friends can just point a mic at your window and then upload it to the Pass-o-Gram to decode it, you have a real problem. It's like when TikTok kids figured out they can steal Kias with usb keys - if every teenager in America knows how to steal your car, its lifetime is going to be measured in minutes. Same with passwords.
Sounds like it's time to buy a bunch of random cherry switches and randomize them across my keyboard.....
And rotate them. While I don't plan to waste my energy, having hot swap sockets and swapping a few around should thwart the attack. You would have to do it frequently enough that relevant training data gets wasted before it's useful. I'm pretty paranoid, but not that much.
I'll just consider it good security hygiene to get a new keyboard often :)
Have you considered only re-doing the tinfoil wrapper every day? It should crackle differently every time.
What it means is that NIST probably needs to update its security recommendations to require hardware keys for even low level systems. It's going to be a huge pain in the ass though.
Gotcha, that makes more sense
Coding for alternate key mappings is almost as trivial as detecting other languages.
Yeah, that's what I figured
It's more trivial because it's a 1:1 relationship. A is a, s is o, d is e, and so on. Detecting other languages is harder because there's more of them and there isn't a 1:1 conversion to English.
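To illustrate the 1:1 point, here is a minimal Python sketch that takes the keys a classifier thinks were pressed (named by their QWERTY legends) and reinterprets them as Dvorak. It only covers the three main rows of the US layouts, and `qwerty_positions_to_dvorak` is a made-up helper name, not anything from the paper.

```python
# Key legends row by row: each QWERTY position lines up with the character
# the same physical key produces on Dvorak (US layouts, main rows only).
QWERTY = "qwertyuiop" "asdfghjkl;" "zxcvbnm,./"
DVORAK = "',.pyfgcrl" "aoeuidhtns" ";qjkxbmwvz"

# A plain one-to-one substitution table; characters outside it pass through.
QWERTY_TO_DVORAK = str.maketrans(QWERTY, DVORAK)


def qwerty_positions_to_dvorak(keys: str) -> str:
    """Reinterpret recognised key positions as if typed on a Dvorak layout."""
    return keys.translate(QWERTY_TO_DVORAK)


if __name__ == "__main__":
    # The physical keys labelled "kjd hsu" on QWERTY spell "the dog" on Dvorak.
    print(qwerty_positions_to_dvorak("kjd hsu"))
```

An attacker who doesn't know the layout could run the recognised positions through a handful of such tables and keep whichever output looks like language.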
There has been previous work on this, yes. It required a dictionary of suggested words. That would make it useful for snooping most typing, but not for randomly generated passwords. This new technique doesn't seem to have that limitation.
Okay, gotcha. I didn't look that deeply into it previously so I never realized how limited that was
So about those people that run around saying passphrases are better... 😅
I think I might have achieved security through obscurity. My custom keyboard is a unique shape and almost all the keys are one unit. Not only is it different enough from a traditional keyboard that the neural network probably won't understand it, the function layers I use obscure whether I'm typing a letter at all.
Does that come with free fingerless gloves?
No, but it comes with your choice of flavoured frozen yoghurt.
That's good!
The yogurt contains potassium benzoate
That's bad
Of course not. The fingerless gloves are also niche, boutique, and premium.
I have a headache just looking at that.
What keyboard is it, corne? I have to admit that your keycaps are incredibly cursed, how you have mixed caps from different layers
It’s a chocofi.
CTGAP on the base layer, and 6 layers on top of it, using a heavily modified version of Miryoku.
Most of the keycaps are correct, just for different layers. It helps prevent key peeking, plus I like the cursed aesthetic.
that's a surprisingly cheap keyboard. I ended up ordering a zsa voyager a couple days ago because I wanted keys, but I couldn't find any prebuilt split keyboards that had a base configuration below like $350. I might end up going with cursed keys on mine, it looks pretty cool
I guess my typos are now a security feature!
I wonder if you need to train it on a specific keyboard before it will work.
Most likely
That would limit the practicality quite a lot, as deskmats and typing style would change the sound of even a common keyboard.
I also notice that I slightly change my typing style between typing normally and entering my password.
Eh... I don't know if it would be enough of a change. Also consider mass produced popular laptops (e.g. targeting the MacBook keyboard).
I don't really think that's normal... But hey, maybe it gives you some protection 🙂
I doubt so. Wouldn't Zipf's law be used for this?
I'd be curious how well this approach translates to multi-lingual keyboard layouts. For English users, perhaps there's another benefit to non-QWERTY layouts (e.g. Colemak or Dvorak) after all? ... and two factor authentication should remain helpful I presume. Especially physical key methods with no audible characters typed (e.g. Yubikey, Titan, etc.)
I was thinking the same, but it would be trivial for software to realize that “fnj xlg” maps to “the dog” with Colemak or Dvorak.
Can we normalise good but quiet keyboards. Like, I like the tactile feel of using a mechanical, but I hate the sound. Quieter mechanical keyboards aren't a thing but they should be. Now as a security measure if nothing else.
Also Dvorak keyboards I guess
There are tons of quiet mechanical keyboards. I'm using a low profile optical switch that's quieter than my mouse clicks
Are those optical switches expensive though?
No
There are definitely quiet tactile switches. The reason why they can still make sound is because they’re bottoming out which you don’t have to do.
As a partial solution, you can put o-rings in the keycaps. I had some of the bands for braces laying around at one point and used those, and it worked fairly well.
Dvorak is a cypher of Qwerty tho. Anything typed in Dvorak but transcribed as English can be reliably identified and decyphered
I went out of my way to find a keyboard with Cherry MX Clear switches. They're basically a high-force tactile feel, but no clicky sound like MX Blue switches. I absolutely love them for typing, and I've been using them for years.
I'm not sure if there's newer options now for silent switches? I know they had a couple models with extra internal damping.
I used boba u4 silents on my custom keyboard. Absolutely love them. Wish they made a consumer-grade keyboard with them (or maybe they already do?) But I've been working on a MacBook recently and tbh the keyboard there is pretty good now. So next step for me is to build a low profile keyboard
Oh man, if Topre became popular enough to bring the price down through scale that would be pretty rad
vi tho.
That awkward moment when you use Nano
I always uninstall nano the first time it shows itself
Up against the wall
On my colemak keyboard I put arrow keys on another layer under where hjkl are on qwerty. Beyond that, most of the keys are remembered by mnemonic rather than position imo
Some laptops like the Framework laptop have fingerprint sensors
Physical Security keys like NitroKeys or YubiKeys are another option
I don't see the relevance.
You can use fingerprint or U2F to unlock your password manager and copy the password. That way you don't have to type it in.
That does nothing for keylogging through this method.
It would have to be combined with a secure (no microphones) area during setup, but it seems like swapped biometric plus token would defeat this attack (password gathering). It would however not defeat generic data collection.
It would eliminate someone being able to get your username or password via this method though. Because you never have to type them in.
With my MacBook I can use either touchid or my watch to automatically unlock it, so I don’t even have to type my password in to get into my laptop. And then I use touchid and Keychain for all my passwords so I never have to type those in either.
I never learned to touch-type, so my typing style is very different from most people though I can type fast enough for work.
My typing style only uses 3 fingers, and both hands type keys in the middle of the keyboard.
I wonder if this has any effect on accuracy?
Edit: Article states touch-typing can reduce accuracy. Wonder if that's because they type more softly than us tech gorillas who tend to bash on the keys?
I'm a touch typist who can reach 160wpm when I'm really flowing. I would guess the speed makes it harder to distinguish individual keys than when you're pressing keys with three fingers.
I type an awful lot slower than you, and still it's faster than I can think. How do you think of what to type fast enough to type at 160wpm?
Not the original person you responded to, but I type 120ish wpm. The trick is to try to tap into the same part of your brain that verbalizes words when you talk, rather than the part that composes stuff when you write.
That speed is usually transcription for me, I'm listening to someone and type what I hear. Actual writing and composing a thought typing speed is closer to 120wpm or so. I learned to type on a typewriter which is much slower, current low profile mech keyboard contributes to faster typing speed too.
That speed is something you only reach when using something like monkeytype.com, where it gives a continuous list of words for you to type over some time, and then calculates the wpm. I manage about 140.
I could see that, makes sense.
Yeah, the article mentions that exactly - the faster you type the more the accuracy plummets.
I wonder if different switches, keycap profiles, keyboard material etc. affect the accuracy?
How well does this work if there's other noise pollution? Like music playing etc?
Is it ignorance, indemnity, or conspiracy that this News Media Corporation didn't give the primary mitigation?
A white noise generator.
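For anyone who wants to try the white noise idea, here is a rough sketch using only the Python standard library. It writes a mono 16-bit WAV of uniform random noise. The file name, sample rate and amplitude are arbitrary choices for illustration, and nothing here has been tested against the attack in the article.

```python
import random
import struct
import wave


def white_noise_samples(duration_s, sample_rate=16000, amplitude=0.3, seed=None):
    """Return a list of 16-bit PCM samples of uniform white noise."""
    rng = random.Random(seed)
    peak = int(amplitude * 32767)  # scale to the signed 16-bit range
    count = int(duration_s * sample_rate)
    return [rng.randint(-peak, peak) for _ in range(count)]


def write_wav(path, samples, sample_rate=16000):
    """Write the samples as a mono, 16-bit little-endian WAV file."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(sample_rate)
        wav.writeframes(struct.pack(f"<{len(samples)}h", *samples))


if __name__ == "__main__":
    # Hypothetical output file; loop it in any audio player while typing.
    write_wav("white_noise.wav", white_noise_samples(5.0))
```

Whether plain white noise is enough is an open question; the summary of the article mentions adding fake keystroke sounds rather than generic noise.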
Isn't boffin a derogatory term like "nerd"?
What a dogshit headline.
Article also uses the term "eggheads".
It’s The Register - think the Financial Times for IT but in the style of The Sun/any other British tabloid. They do it for the lulz, if you will - don’t get too hung up on the headlines as the content is top quality.
It can be. Being a boffin, I'm not offended. Up to the individual if they choose to be offended.
Still shitty journalism to refer to researchers publishing their research in that way.
Meh, I wear such labels as badges of honor. I sacrificed a bit along the way to develop knowledge, skills, competence - I've earned it. Thanks for acknowledging it.
I also see such things in a humorous light. I mean us "boffins" can be such boffins at times. We can over-focus, get caught up on perfectionism, etc, etc. If'n ya can't laugh at your own foibles, well, I don't know what to say.
Maybe a US/UK divide? At least in the UK boffin is relatively inoffensive depending on how it's used. Eg if I build a fusion reactor in my garden my neighbour might say "wow, look at what this boffin did!" and it would be a compliment where boffin is a stand-in for a word like genius, only with a tongue in cheek touch of jealousy.
Thinking about it I would say that 'nerd' is typically putting someone down for their intelligence or interests, whereas boffin is a light insult while identifying the 'boffin' as being smarter than yourself.
Might have to spend some time getting Easy Effects/Noise Torch set up on my systems again just to reduce the vectors again.
There is a good comment on this post on physical mitigation that seems helpful as well: https://www.reddit.com/r/Fedora/comments/uerp9z/comment/i6p0jqa/?utm_source=share&utm_medium=mweb3x&utm_name=mweb3xcss&utm_term=1&utm_content=share_button
This is old news. This article was published on 7 Aug 2023.
This method is far older than that, and it keeps popping up every so often as a "new" attack. First time I read about this method was in the early 2000's, and I'm pretty sure it's been done before that as well.
Another advantage to the split keyboard
This attack is useless in the real world.
That said, what gives you the idea a split keyboard (if they had a sample of you typing on it etc) would be any different than a normal one?
It is just another keyboard with a different sound profile.
You can remap and customise keys to be whatever you want. There's even auto shift, so if I hold certain key just a bit longer than a regular tap, it will automatically capitalise or whatever the shift + key combo would result in. There are also multiple layers you can easily activate with a press of a button, so the layout is something totally different.
Example: https://configure.zsa.io/moonlander/layouts/default/latest/0/
Layer 0
Layer 1
Layer 2
It was just a joke
This is the best summary I could come up with:
In other words, this is a side channel attack with considerable accuracy, minimal technical requirements, and a ubiquitous data exfiltration point: Microphones, which are everywhere from our laptops, to our wrists, to the very rooms we work in.
To make matters worse, the trio said in their paper that they've achieved what they claim is an accuracy record for acoustic side-channel attacks (ASCA) without relying on a language model.
Luckily in this case it's not power usage, CPU frequencies, blinking lights or RAM buses leaking data unavoidably, but a good old-fashioned problem occurring between the computer and chair that can actually be mitigated somewhat easily.
The researchers note that skilled users able to rely on touch typing are harder to detect accurately, with single-key recognition dropping from 64 to 40 percent at the higher speeds enabled by the technique.
Working among the clacking of phantom keyboards would surely annoy everyone, which is why the researchers suggest only adding the sounds to Skype and Zoom transmissions after they've been recording instead of subjecting employees to real-time noisemakers.
Followup research is now going on into using new sources for recordings, like smart speakers, better keystroke isolation techniques and the addition of a language model to make their acoustic snooping even more effective.
The original article contains 656 words, the summary contains 210 words. Saved 68%. I'm a bot and I'm open source!
Idk how it works with non-NVIDIA GPUs but get Nvidia Broadcast or an equivalent. It's a life saver.
macOS Sonoma has just updated with camera effects/reactions and "voice isolation" which works just like NVIDIA Broadcast/RTX Voice, luckily.
It doesn't do a very good job of removing my keyboard noise for some reason, and it makes my voice sound noticeably worse 😔
Mine's perfect, my baby can't even scream in my mic. It gets caught. I don't recall messing with settings, and my GPU is a 2080 TI. Idk, hardware maybe? There's not much to mess with.
Because of different placement on the keyboard and different finger pressure, each key press has a slightly different sound.
The telling thing in this story is this
For some people (those with a very consistent typing style on a known keyboard) they were right 95% of the time.
In the real world this type of thing is basically useless as you would need a decent sample of the person typing on a known keyboard for it to work.
So to do this you need to have physical access to the person (to place a microphone nearby) and know what type of device they are typing on and for it to be a device that you have already analysed the sound profile of.
The article says
Hm. Sounds like "some cases" are hunt and peck typists or very slow touch typists.
I don't know if training for each victim's typing is really needed. I get the impression they were identifying unique sounds and converting that to the correct letters. I only skimmed and I didn't quite understand the description of the mechanisms. Something about deep learning and convolution or...? I think they also said they didn't use a language model so I could be wrong.
The problem is that even with up to 95% accuracy per key, a password of length 10 still has roughly a 40 percent chance (1 - 0.95^10 ≈ 0.40) that at least one character is wrong.
A password with one character wrong is just as useless as randomly typing.
Which character is wrong and what should it be? You only have 2 or 3 more guess till most systems will lock the account.
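A quick sanity check on that kind of estimate, assuming each keystroke is recognised independently with the same per-key accuracy (an assumption of this sketch, not something the article states). The function names are just illustrative. For 10 characters at 95 percent, the chance of at least one wrong character comes out to roughly 40 percent.

```python
def p_all_correct(per_key_accuracy: float, length: int) -> float:
    """Probability that every character is recognised correctly,
    assuming independent keystrokes with identical accuracy."""
    return per_key_accuracy ** length


def p_at_least_one_wrong(per_key_accuracy: float, length: int) -> float:
    """Complement of the above: at least one character is misrecognised."""
    return 1.0 - p_all_correct(per_key_accuracy, length)


if __name__ == "__main__":
    # Print the error chance for a few password lengths at 95% per-key accuracy.
    for length in (8, 10, 16):
        print(length, round(p_at_least_one_wrong(0.95, length), 3))
```

The same function shows why longer random passwords degrade the attack quickly, and why even a small drop in per-key accuracy matters.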
This is an interesting academic exercise but there are much better and easier ways to gain access to passwords and systems.
The world is not a bond movie.
Deploying social engineering is much easier than this sort of attack.
"Hearing" the same password twice drastically increases the accuracy, however, social engineering is indeed the most effective and efficient attack method.
If the password is not random, as they seldom are, you can just guess the last, or even the last few characters if they are not correct.
Have you never seen a Bond movie? Yeah they always have a gadget or two, but the rest is basically him social engineering his way through the film. And shooting. Usually lots of shooting too.
I was thinking of this attack in terms of grabbing emails, documents, stuff like that. Or snippets thereof.
I imagine it probably also uses an algorithm to attempt to "guess" the next letter (or the full word itself, like your phone keyboard does) based on existing words. Then maybe an LLM can determine which of the potential words are the most likely being typed based on the context.
I dunno if that makes any sense, but that's how I pictured it working in my brain movies.
You don’t need physical access, just some malware that has access to the microphone
We would hope researchers “discovering” this wouldn’t have a production ready product as their proof of concept. So there is room for improvement but military contractors would love to invest in this
Which you still need to have previously installed...
If the person has allowed malware to be installed just install a keylogger (which gives you 100% accuracy every time) rather than jump through more hoops with this.
Different devices
I would have an easier time infecting someone‘s personal phone than a company machine
You would, would you?
Well, I must be talking to a leet hacker then.
Ok, install malware on my phone.
How did you get that from what I said?
What did you mean by this then other than you, personally, are skilled at such things and have system penetration experience?
They'll have modelled the acoustic signals to differentiate between different keys. Individual acoustic waves emanating from pressing a key will have features extracted from them to identify them. Optimal features are then chosen to maximise accuracy, such as features that still work when the signal is captured at different distances or angles. With all these types of signal processing inference models, you never get 100 percent. The claim of 95 percent is actually very high.
Every key is unique and at a different distance to the microphone and therefore makes tiny differences in noise.
Knowing this, and knowing the frequency distribution of letters in language (e.g. we know "e" is the most common letter) and some clever analysis over a large enough sample of typing, we can figure out what each key sounds like with a statistically high level of probability. Once that's happened it's just like any other speech recognition software, except it's the language of your keyboard.
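A toy version of that frequency idea, assuming the keystroke sounds have already been grouped into clusters (one integer ID per distinct-sounding key). The letter ordering is a commonly cited approximation for English, and `guess_letters` is a hypothetical helper. This is not how the paper works; per the summary, the researchers did not rely on a language model.

```python
from collections import Counter

# Approximate ordering of English letters from most to least frequent.
ENGLISH_BY_FREQUENCY = "etaoinshrdlcumwfgypbvkjxqz"


def guess_letters(cluster_ids):
    """Map each sound cluster to a letter by matching frequency ranks.

    The most common cluster is guessed to be 'e', the next 't', and so on.
    """
    ranked = [cluster for cluster, _ in Counter(cluster_ids).most_common()]
    return dict(zip(ranked, ENGLISH_BY_FREQUENCY))


if __name__ == "__main__":
    # Made-up cluster IDs standing in for a recording of someone typing.
    print(guess_letters([3, 3, 3, 7, 7, 1]))
```

On a short sample the ranking is noisy, which is why this only starts to work with a large amount of ordinary typing and does little for a random password.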
This is just me kind of guessing off the top of my head, but:
Now, the researchers didn't sit down and list out all of these (or any other) ways in which software could determine what was typed from audio and compose an algorithm that accounted for all/most/some of these. They just kind of threw a bunch of audio with accompanying "right answers" at a machine learning algorithm and let the algorithm figure out whatever clues it could discern and combine those in whatever way it found most beneficial to come up with an (increasingly-more-accurate-with-every-training-set) answer. It's likely the algorithm came up with different things than I did that helped it determine which key(s) were being pressed.
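The "audio plus right answers" setup can be sketched with something far simpler than deep learning. Below is a nearest-centroid classifier in plain Python: it averages the feature vectors seen for each key during training, then labels a new keystroke with the closest average. The two-number feature vectors are invented for illustration; a real system would extract features from the recording, and this is a stand-in rather than the researchers' model.

```python
import math
from collections import defaultdict


def train_centroids(samples):
    """samples: iterable of (feature_vector, key_label) pairs.

    Returns a dict mapping each key label to the mean of its feature vectors.
    """
    sums = {}
    counts = defaultdict(int)
    for features, label in samples:
        if label not in sums:
            sums[label] = [0.0] * len(features)
        for i, value in enumerate(features):
            sums[label][i] += value
        counts[label] += 1
    return {label: [s / counts[label] for s in vec] for label, vec in sums.items()}


def classify(centroids, features):
    """Return the key label whose centroid is nearest (Euclidean distance)."""
    return min(centroids, key=lambda label: math.dist(centroids[label], features))


if __name__ == "__main__":
    # Invented training data: (features, key that was actually pressed).
    training = [([1.0, 0.1], "a"), ([0.9, 0.2], "a"),
                ([0.1, 1.0], "s"), ([0.2, 0.8], "s")]
    model = train_centroids(training)
    print(classify(model, [0.8, 0.3]))
```

The point is the workflow, labelled examples in and a key guess out; which clues the real model picks up on is not something this sketch can tell you.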
Article doesn't say but I would guess they are testing with words and using that to build context for better accuracy. I imagine if you are typing some random password it would not be as accurate. Also the only password I type nowadays is the one to unlock the computer, everything else is in a password vault.