OpenClaw AI agent failed the phishing test

Cybersecurity researchers have found a very familiar weakness in an AI email agent: it can spot some obvious scams, but it still struggles with social engineering. In tests run by Varonis, OpenClaw’s agent, Pinchy, happily handed out access and exported customer data when a request sounded vaguely legitimate, even when it had been explicitly warned about phishing and other email fraud.

That split result is the part worth paying attention to. AI assistants are getting better at filtering malicious links and shady attachments, but they are still acting like overeager interns when someone claims to be the boss.

How Varonis tested OpenClaw’s phishing defenses

The researchers wired Pinchy into a Gmail inbox, a browser, and the Google Workspace API, then filled the account with fake company data, AWS credentials, database information, CRM exports, internal messages, and calendar invites. They created two setups: one with normal work instructions and another that explicitly warned the agent about phishing and email scams.

In both cases, the same pattern emerged. When a fake department head asked for access to a test environment, Pinchy granted it. When the attacker claimed to be working remotely on a presentation and requested a customer data export, the agent complied there, too.

Where the agent drew the line

OpenClaw was not completely helpless. When researchers sent a phishing email offering a gift card and a link, the system identified the page as malicious and blocked it. It also rejected an attempt to install a harmful app, disguised as a time-tracking tool, that was meant to break into Google OAuth.

So the guardrails work best on the easy stuff: dangerous URLs and suspicious apps. The failure mode is deeper and messier, because identity checks and context judgments are exactly where human workers still get tricked, and where AI agents are now being asked to act on our behalf.

The real problem with agent trust

Varonis also took aim at Google, saying Gemini showed a stronger willingness to interact than OpenAI’s GPT system, which it described as more cautious. That’s not a surprise so much as a reminder that the industry’s race to make assistants more useful can easily turn into a race to make them too compliant.

The obvious fix is for agents to verify a requester's identity more rigorously before they hand over access, files, or approvals. Until that becomes standard, companies using AI in the inbox may want to assume the software is excellent at spotting a fake link and much less impressive at spotting a fake manager.
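
As a rough illustration of what such a check could look like, here is a minimal Python sketch of a gate that sits between an email agent and any sensitive action. Everything in it is an assumption made for illustration: the action names, the `Request` fields, and the `decide` function are hypothetical and do not come from OpenClaw, Varonis, or Google.

```python
from dataclasses import dataclass

# Hypothetical action names; not part of OpenClaw or any Varonis tooling.
SENSITIVE_ACTIONS = {"grant_access", "export_data", "approve_request"}


@dataclass
class Request:
    sender: str                         # address the email claims to come from
    action: str                         # what the sender asks the agent to do
    verified_out_of_band: bool = False  # confirmed through a second channel


def decide(request: Request, known_staff: set[str]) -> str:
    """Return 'allow', 'verify', or 'deny' for an incoming request."""
    if request.action not in SENSITIVE_ACTIONS:
        return "allow"
    if request.sender not in known_staff:
        return "deny"
    # A familiar name alone is not proof of identity: sender details can be
    # spoofed, so sensitive actions wait for a second-channel confirmation.
    if not request.verified_out_of_band:
        return "verify"
    return "allow"


staff = {"head.of.sales@example.com"}
print(decide(Request("head.of.sales@example.com", "export_data"), staff))
```

Under these assumptions, the fake department head from the Varonis test would get "verify" rather than a data export, because a plausible sender is treated as a reason to confirm, not a reason to comply.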
