I think this is saying Microsoft has shown that someone watching the encrypted packets going between you and your AI chatbot can use packet sizes and timing to figure out what you are talking about.
I am not convinced, but it is yet another reason to run your own models. That is probably a pretty good idea if you are planning on some money laundering, and more "societally charged" subjects in the US in the current climate could be considered the same.
Whisper Leak: A novel side-channel attack on remote language models
Microsoft has discovered a new type of side-channel attack on remote language models. This side-channel attack could allow a cyberattacker in a position to observe your network traffic to infer the topic of a language model conversation, despite the traffic being end-to-end encrypted via Transport Layer Security (TLS).
Whisper Leak methodology
In our experiment, we train a binary classifier to distinguish between a specific target topic and general background traffic. We chose “legality of money laundering” as the target topic for our proof-of-concept.
- For positive samples, we used a language model to generate 100 semantically similar variants of questions about this topic (e.g., “Are there any circumstances where money laundering is legal?”, “Are there international laws against money laundering?”). Eighty (80) variants were used for training and validation, and 20 were held out for testing generalization.
- For negative noise samples, we randomly sampled 11,716 unrelated questions from the Quora Questions Pair dataset, covering a wide variety of topics.
- Data collection was performed for each language model service individually: we recorded response times and packet sizes via network sniffing (tcpdump), shuffled the order of positive and negative samples during collection, and introduced variation by inserting extra spaces between words to reduce the risk of caching interference. We used a standard language model temperature of 1.0 to encourage response diversity.
We then trained three classifier architectures on the collected packet-size and timing sequences (a simplified end-to-end sketch follows this list):
- LightGBM: A gradient boosting framework.
- LSTM-based (Bi-LSTM): A recurrent neural network architecture suitable for sequential data.
- BERT-based: Using a pre-trained transformer model (DistilBERT-uncased) adapted with extended tokens representing size and time buckets for sequence classification.
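For readers who want something concrete, here is roughly what that classification step could look like. This is a minimal sketch, not Microsoft's actual pipeline: the synthetic traces, the summary features, and the LightGBM settings are invented stand-ins, whereas the real attack works from tcpdump captures and also evaluates the Bi-LSTM and BERT-based models listed above.
```python
# Minimal sketch of a "target topic vs. background" traffic classifier.
# Each trace stands in for one conversation's capture: a list of
# (packet_size_bytes, inter_arrival_seconds) pairs. synthetic_trace()
# fabricates such traces purely so the example runs end to end.
import numpy as np
from lightgbm import LGBMClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import precision_score, recall_score

rng = np.random.default_rng(0)

def synthetic_trace(mean_size):
    """Stand-in for a real capture: a few dozen packets with noisy sizes and gaps."""
    n = rng.integers(20, 120)
    sizes = rng.normal(mean_size, 30, n).clip(60, 1500)
    gaps = rng.exponential(0.05, n)
    return list(zip(sizes, gaps))

def summarize(trace):
    """Collapse a variable-length (size, gap) sequence into fixed-length features."""
    sizes = np.array([s for s, _ in trace])
    gaps = np.array([g for _, g in trace])
    return [len(sizes), sizes.sum(), sizes.mean(), sizes.std(),
            gaps.mean(), gaps.std(), gaps.max()]

# 100 target-topic traces vs. a larger pool of background traces, mirroring the
# positive/negative split described above (the mean sizes here are arbitrary).
positive = [synthetic_trace(400) for _ in range(100)]
negative = [synthetic_trace(300) for _ in range(1000)]

X = np.array([summarize(t) for t in positive + negative])
y = np.array([1] * len(positive) + [0] * len(negative))

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, stratify=y, random_state=0)
clf = LGBMClassifier(n_estimators=200).fit(X_tr, y_tr)
pred = clf.predict(X_te)
print("precision:", precision_score(y_te, pred), "recall:", recall_score(y_te, pred))
```
Because gradient-boosted trees want fixed-length inputs, this sketch collapses each variable-length trace into summary statistics; the sequence models described above instead consume the size and time buckets directly.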
What this means in the real world
To understand what this means practically, we simulated a more realistic surveillance scenario: imagine a cyberattacker monitoring 10,000 random conversations, with only one conversation about the target sensitive topic mixed in. Even with this extreme imbalance, our analysis shows concerning results.
For many of the tested models, a cyberattacker could achieve 100% precision (all conversations it flags as related to the target topic are correct) while still catching 5-50% of target conversations. In plain terms: nearly every conversation the cyberattacker flags as suspicious would actually be about the sensitive topic—no false alarms. This level of accuracy means a cyberattacker could operate with high confidence, knowing they’re not wasting resources on false positives.
To put this in perspective: if a government agency or internet service provider were monitoring traffic to a popular AI chatbot, they could reliably identify users asking questions about specific sensitive topics—whether that’s money laundering, political dissent, or other monitored subjects—even though all the traffic is encrypted.
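To make the 1-in-10,000 numbers above concrete, here is a back-of-the-envelope sketch in Python. The classifier score distributions are made up, not taken from the study; the point is only that under extreme class imbalance a strict score threshold can give perfect precision while still catching only a fraction of the target conversations.
```python
# Illustrative only: precision vs. recall at a 10,000:1 class imbalance.
# The score distributions are invented; the point is that a conservative
# threshold can keep precision at (or near) 1.0 while recall, the share of
# target-topic conversations actually caught, stays well below 1.0.
import numpy as np

rng = np.random.default_rng(1)

# Repeat the 1-in-10,000 scenario fifty times over: 50 target conversations
# hidden among 500,000 background conversations.
background_scores = rng.beta(2, 8, 500_000)  # hypothetical scores for background traffic
target_scores = rng.beta(8, 2, 50)           # hypothetical scores for the target topic

threshold = 0.9  # flag only very confident predictions
flagged_background = int((background_scores >= threshold).sum())
flagged_target = int((target_scores >= threshold).sum())

precision = flagged_target / max(flagged_target + flagged_background, 1)
recall = flagged_target / len(target_scores)

print(f"flagged: {flagged_target + flagged_background}, "
      f"precision: {precision:.2f}, recall: {recall:.2f}")
```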