2,000 people tried to hack my AI assistant

Fernando Irarrázaval conducted a challenge on hackmyclaw.com to test if 6,000 attempts could leak secrets from his OpenClaw instance using the Opus 4.6 model.

The test involved sending emails to an AI assistant protected by anti-prompt-injection rules.
Despite $500 in token spend and a Google account suspension due to high inbound email volume, no secrets were leaked.
The results suggest that current frontier models are increasingly resistant to prompt injection attacks.

The author notes that while these defenses appear effective, they do not guarantee immunity against more sophisticated future attacks, so production systems remain at risk.