- Researchers have found that ASCII art, pictures made from letters and symbols, can cause AI chatbots to bypass their own safety filters, potentially leading them to give instructions for illegal activities.
- ArtPrompt is a new jailbreak technique that hides sensitive parts of a prompt inside ASCII art, causing chatbots to ignore their safety rules and produce responses they would normally refuse.
- The flaw echoes earlier prompt injection attacks and underscores how difficult it is to defend AI systems against adversarial inputs.
Researchers have uncovered a serious weakness in AI chatbots: ASCII art can prevent them from blocking harmful requests. The attack, dubbed ArtPrompt, uses ASCII art to slip past the safety features of popular AI assistants such as GPT-4 and Google’s Gemini.
The discovery shows how easily AI systems can be manipulated through ASCII art and how hard they are to defend against this kind of input. By disguising parts of a prompt, ArtPrompt confuses chatbots into answering questions they should refuse, raising real concerns for the safety and security of AI.
Hacking AI chatbots – the ArtPrompt attack
ArtPrompt exposes a major gap in chatbot safeguards. By embedding ASCII art in prompts, the technique slips past defenses designed to block harmful or unethical responses.
The attack works by replacing a single sensitive word in a prompt with its ASCII art rendering. The chatbot parses the surrounding text normally but does not recognize the masked word, so it fails to see the danger in the request and produces a response it should have refused.
The researchers behind ArtPrompt say the attack succeeds because it exploits how chatbots process language: the models are very good at understanding and responding to plain text, but they struggle to interpret ASCII art.
Because their ability to decode words hidden in ASCII art is limited, the models become preoccupied with reconstructing the masked word and lose track of their safety guidelines, which can lead to harmful responses.
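For illustration only, here is a minimal sketch of what this word-masking step might look like in code. It is not the researchers’ actual tooling: the pyfiglet library, the mask_word_with_ascii_art helper, and the harmless placeholder word WEATHER are assumptions made purely for the example.

```python
# Conceptual sketch of the masking step described above (not the ArtPrompt
# authors' actual code). Assumes the third-party `pyfiglet` library
# (`pip install pyfiglet`) for rendering a word as ASCII art.
import pyfiglet


def mask_word_with_ascii_art(prompt: str, word: str) -> str:
    """Replace one word in a prompt with its ASCII-art rendering."""
    art = pyfiglet.figlet_format(word)           # multi-line ASCII rendering of the word
    placeholder = f"\n[the word below is written in ASCII art]\n{art}\n"
    return prompt.replace(word, placeholder, 1)  # swap only the first occurrence


# Harmless illustration: the benign word "WEATHER" is replaced by ASCII art,
# so it no longer appears in the prompt as plain text.
original = "Please describe tomorrow's WEATHER in one sentence."
print(mask_word_with_ascii_art(original, "WEATHER"))
```

Because the masked word no longer appears anywhere in the prompt as plain text, a simple keyword-based filter would not match it, which is the gap the researchers say ArtPrompt exploits.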
Past weaknesses and lessons learned
ArtPrompt is not the first time AI chatbots have been fooled by carefully crafted inputs. Prompt injection attacks, first documented in 2022, showed how models such as GPT-3 could be steered into embarrassing or nonsensical responses by inserting specific phrases into their prompts. Similarly, a Stanford University student used this technique to make Bing Chat disclose instructions it was designed to withhold, highlighting how hard it is to protect AI systems from such attacks.
Microsoft has acknowledged that Bing Chat is vulnerable to prompt injection, a reminder of how difficult it remains to protect chatbots from manipulation. While these attacks do not always produce harmful or unethical output, they raise concerns about the safety and reliability of AI-powered systems. As researchers map out new attack techniques, they are finding that defending against them requires a mix of measures covering both the technical and procedural sides of how artificial intelligence is built and deployed.
As the debate over AI fairness and security continues, it remains an open question how to keep chatbots from being manipulated and ensure they consistently behave as intended. Even as the technology improves, attacks like ArtPrompt show how hard it is to build AI systems we can fully trust. While researchers and developers work on these problems, staying alert to emerging threats and acting early to contain them will be essential to AI’s reliability and safety.