An AI prompt that extracts personal data from chatbots

Chatbots are used across industries, particularly for customer service and consumer service purposes. For customers of a business, they’re a useful tool. They can answer routine questions and bring up stored data on request, minimising the amount of time you have to spend on the phone, on hold, waiting to talk to a human to complete a mundane task.

When users interact with a chatbot, there’s usually an exchange of personal information. You might share your name, date of birth, and your address; and maybe even personal preferences and interests that relate to the services you’re accessing. This creates risk: because when you share personal data with a chatbot, your data is vulnerable if the Large Language Model (LLM) behind that tool becomes vulnerable to attack.

An AI prompt that extracts personal data from chatbots

Now, security researchers at the University of California, San Diego (UCSD) and Nanyang Technological University in Singapore have uncovered a new type of attack. An AI prompt triggers an LLM to collect your personal information from chats, and send it straight to a threat actor.

A new chatbot attack

In a paper published on October 17 2024, the researchers named the attack Imprompter, and explained how it uses an algorithm to turn a prompt supplied to the LLM into a set of malicious instructions that are hidden from the user.

Through their investigations, the researchers were able to “surface a new class of automatically computed obfuscated adversarial prompt attacks that violate the confidentiality and integrity of user resources connected to an LLM agent.”

And they demonstrated a number of use cases for the prompt algorithm on different chatbots, including Mistral LeChat (with an 80% success rate for the attack) and ChatGLM.

Through a range of these experiments, it became clear that these attacks “reliably work on emerging agent-based systems like Mistral’s LeChat, ChatGLM, and Meta’s Llama.”

Finding vulnerabilities in popular LLMs

Following the report on Imprompter, Mistral AI told Wired it had fixed the vulnerability that enables the algorithm to work. And ChatGLM issued a statement emphasising it takes security very seriously, without offering direct comments on this particular vulnerability.

Aside from this particular form of attack, vulnerabilities in popular LLMs have been a growing problem since the 2022 release of ChatGPT.

As reported by Wired, these vulnerabilities often come under two broad categories:

Jailbreaks. These trick an AI system into bypassing its own safety rules, using prompts that override the AI model’s settings.
Prompt injections. These work by feeding a set of instructions from external data into the AI to tell them to steal or manipulate data. For example, this might be a concealed prompt on a website that the AI absorbs if it takes information from that page.

Prompt injections are a growing concern because they’re very difficult to protect against. They turn the AI against itself, and researchers aren’t 100% sure they understand how it happens. It’s possible that LLMs are learning obscure connections that go beyond natural language – working in a language that human beings don’t understand.

And the outcome is that the AI follows a prompt injected by a threat actor and supplies sensitive data to a malicious website for the hacker to use as they wish.

The bottom line is that right now, any LLM that handles personal data should do so with great care, and be subject to extensive and creative security testing. And any person who inputs their data into an LLM should be aware of the risks – consider how much information you’re giving away, and what that data could be used for if it were stolen.

Join us at MEA 2024 and discover how to improve your organisation’s cyber resilience.

你可能想看：

5. Collect exercise results The main person in charge reviews the exercise results, sorts out the separated exercise issues, and allows the red and blue sides to improve as soon as possible. The main

As announced today, Glupteba is a multi-component botnet targeting Windows computers. Google has taken action to disrupt the operation of Glupteba, and we believe this action will have a significant i

(3) Is the national secret OTP simply replacing the SHA series hash algorithms with the SM3 algorithm, and becoming the national secret version of HOTP and TOTP according to the adopted dynamic factor

It is possible to perform credible verification on the system boot program, system program, important configuration parameters, and application programs of computing devices based on a credible root,

Detailed Explanation of VM Virtual Machine Protection Technology & Analysis of Two CTFvm Reverse Engineering Practical Exercises

Distributed Storage Technology (Part 2)： Analysis of the architecture, principles, characteristics, and advantages and disadvantages of wide-column storage and full-text search engines

In-depth Analysis and Practice： Analysis of Apache Commons SCXML Remote Code Execution Vulnerability and POC EXP Construction

4.5 Main person in charge reviews the simulation results, sorts out the separated simulation issues, and allows the red and blue teams to improve as soon as possible. The main issues are as follows

2. First filter out educational sites from the imported text, and do not perform friend link crawling

Data Compliance for Enterprises Going Global： The 'Unavoidable' Extraterritorial Jurisdiction of GDPR