AI chatbots can guess your personal information based on what you type

Conversations can reveal a lot about you – especially if you’re talking to a chatbot. New research shows that chatbots like ChatGPT can infer a lot of sensitive information about the people they’re chatting with, even if the conversation is completely mundane.
The phenomenon appears to stem from the way the models’ algorithms are trained on large amounts of web content. Because that training is a big part of why the models work at all, it is probably difficult to prevent. “It’s not even clear how to fix this problem,” says Martin Vechev, a computer science professor at ETH Zurich in Switzerland, who led the research. “This is very, very problematic.”
Vechev and his team found that the large language models that underlie advanced chatbots can accurately infer an alarming amount of personal information about users—including their race, location, profession, and more—from conversations that seem innocuous.
Vechev says fraudsters could use chatbots’ ability to guess sensitive information about a person to harvest data from unsuspecting users. He adds that the same underlying capability could herald a new era of advertising, in which companies use information gathered by chatbots to build detailed user profiles.
Some of the companies behind powerful chatbots also rely heavily on advertising for their profits. “They could already be doing it,” Vechev says.
The Zurich researchers tested language models developed by OpenAI, Google, Meta and Anthropic. They say they have made all of the companies aware of the problem. OpenAI, Google and Meta did not immediately respond to a request for comment. Anthropic referred to its privacy policy, which states that it does not collect or “sell” personal data.
“This certainly raises the question of how much information about ourselves we inadvertently reveal in situations where we might expect anonymity,” says Florian Tramèr, an assistant professor also at ETH Zurich, who was not involved in the work but saw details presented at a conference last week.
Tramèr says it is unclear to him how much personal information could be derived in this way, but he speculates that language models could be an effective aid in uncovering private information. “There are probably some clues that LLMs are particularly good at finding, and others where human intuition and priors are much better,” he says.