I thought nobody could hack me. Then I got social engineered through my own Code Editor
Why Your AI Code Editor is a Security Risk: Lessons from an Autonomous Agent Hack
I woke up today and did the most normal thing. I opened Telegram to check updates from my OpenClaw agent.
Then I did the least productive thing. I started scrolling for no reason.
That’s where it began.
The DM that started the day
In some random dev group, I saw a post. A “crypto arbitrage dev wanted” type of message. It looked interesting. It felt close to a side project I’ve been tinkering with lately. I thought, “this might be fun”. So I DM’d.
The person replied fast. They said their team was running a funded project, and that candidates start with a short test: a coding challenge zip.
I downloaded it from Telegram. It landed in Downloads. I extracted it. And I did what I thought was the smart move.
Where I made the mistake
I opened the extracted folder inside Google AntiGravity. My AI code editor.
And here’s the key detail. My AntiGravity is configured with full autonomy. I like speed. I like delegation. I like agents that just do the work.
AntiGravity asked the usual question. “Do you trust this folder?”
I clicked Yes, I trust.
Then I asked it something harmless. “Explain the high level structure of this project. Give me a codebase overview.”
A normal developer request.
But if you’ve built or used autonomous agents deeply, you already know where this is going.
The moment my stomach dropped
My System Settings opened by itself.
Something named cloud.sh was already toggled on under one of the permission panes. I wish I remembered which permission screen it was. I cannot verify that detail.
I immediately toggled it off.
Then I right-clicked it and opened it in Finder. It was sitting inside something that looked like a WebCam folder.
That’s when my suspicion switched from “maybe nothing” to “this is not normal”.
And then it escalated.
The macOS password loop
I started getting the classic macOS prompt.
“Something wants to modify your system settings. Enter your password.”
I closed it.
It popped again.
I closed it again.
It popped again.
Over and over.
I did not enter my password. I restarted my Mac.
After the reboot, I treated the rest of the day like incident response. Logs, launch agents, permissions databases, temporary directories, network connections. The full paranoia mode.
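Part 2 will walk through the full checklist, but to give one concrete flavor of it, here is the shape of the launch-agent sweep. This is a rough sketch over the standard macOS locations; the script and the 24-hour window are illustrative, not the exact commands I ran.

```python
# Rough sketch: list launchd plists and flag anything modified recently.
# Standard macOS locations; the 24-hour window is arbitrary and illustrative.

import time
from pathlib import Path

LOCATIONS = [
    Path.home() / "Library/LaunchAgents",
    Path("/Library/LaunchAgents"),
    Path("/Library/LaunchDaemons"),
]

cutoff = time.time() - 24 * 3600  # flag anything touched in the last day

for loc in LOCATIONS:
    if not loc.exists():
        continue
    for plist in sorted(loc.glob("*.plist")):
        if plist.stat().st_mtime > cutoff:
            print(f"recently modified: {plist}")
```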
I’m not going to pretend this felt good.
I was a hacker back in 10th grade. That classic arrogant phase where you feel like you understand “the system”. You think no one can get you.
Today was a reminder. It’s not about being “smart”. It’s about being one click away from trusting the wrong thing at the wrong time, inside the wrong execution context.
The real lesson. Autonomy changes the threat model
This is the part I keep replaying.
Nothing about the initial request was obviously dangerous.
“Can you explain this repo?”
“Can you summarize the code structure?”
“Can you tell me what this project does?”
But when an AI tool has autonomy, and you’ve granted it permissions, “code overview” is not just reading.
It becomes interpretation plus actions plus tool calls.
The new choke point is not model intelligence.
The choke point is trust.
The enterprise market is learning this the hard way. Everyone wants “agents that do everything”. But the moment an LLM can touch a file system, run commands, call APIs, or modify settings, it’s no longer a normal app.
It’s a highly privileged, untrusted user.
And the industry is currently giving probabilistic engines deterministic access to real systems.
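To make that concrete, here is the shape of the problem. This is not AntiGravity's code and not the payload from that zip; it is a deliberately naive agent loop, with a made-up model.next_tool_call() interface, to show how "explain this repo" turns into execution. Everything the agent reads from the repo flows straight back into the model's context, so an instruction planted in a file can become the model's next plan, and the plan runs with nothing in between.

```python
# A deliberately naive agent loop, hypothetical, to show the shape of the problem.
# Not AntiGravity's code. The model object and its next_tool_call() method are
# assumptions for illustration only.

import subprocess

def run_tool(call: dict) -> str:
    """Execute one tool call the model asked for."""
    if call["tool"] == "read_file":
        with open(call["path"]) as f:
            return f.read()          # attacker-controlled file text enters the context
    if call["tool"] == "run_shell":
        result = subprocess.run(call["cmd"], shell=True, capture_output=True, text=True)
        return result.stdout         # real execution on the real machine
    return "unknown tool"

def agent_loop(model, task: str) -> str:
    """'Explain this repo' starts here; every tool result feeds the next decision."""
    context = [task]
    while True:
        call = model.next_tool_call(context)   # probabilistic planner picks the next action
        if call is None:                       # model decides it is done
            return context[-1]
        context.append(run_tool(call))         # no policy check between plan and execution
```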
This incident pushed me into a personal project
I’ve been thinking about the major choke points in AI agents. Yesterday I was literally digging into this topic. Today, I lived it.
So I’m starting a personal open-source project:
Zero-Trust Execution Middleware for Autonomous AI Agents
The idea is simple to say, and hard to build correctly.
Agents should not be trusted just because the developer trusts them. Tool calls should not execute just because the model produced a JSON blob.
There needs to be a thin, fast, local “gate” between agent frameworks (like OpenClaw, LangChain, AutoGen) and the operating system or network.
A gate that defaults to deny. A gate that verifies. A gate that kills unsafe calls, fast.
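As a first approximation of what I mean by a gate, here is a minimal sketch. The names (ToolCall, ALLOW_LIST, check) are mine, invented for illustration; this is not the actual middleware, which needs signing, auditing, and much richer policy. But the default-deny shape is the core of it.

```python
# Minimal sketch of a default-deny gate between an agent and the OS.
# ToolCall, ALLOW_LIST, and check() are illustrative names, not a real API.

import fnmatch
from dataclasses import dataclass

@dataclass
class ToolCall:
    tool: str      # e.g. "read_file", "http_get", "run_shell"
    target: str    # the path, URL, or command the call wants to touch

# Only explicitly allowed (tool, target-pattern) pairs pass. Everything else dies.
ALLOW_LIST = [
    ("read_file", "/Users/me/projects/*"),
    ("http_get", "https://api.github.com/*"),
]

def check(call: ToolCall) -> bool:
    return any(
        call.tool == tool and fnmatch.fnmatch(call.target, pattern)
        for tool, pattern in ALLOW_LIST
    )

def gated_execute(call: ToolCall, execute):
    """Wrap the agent framework's executor. Deny by default, allow by policy."""
    if not check(call):
        raise PermissionError(f"blocked: {call.tool} -> {call.target}")
    return execute(call)
```

Notice there is no run_shell entry in the allow list. In this model, an agent asking to run something like cloud.sh is not a judgment call. It is a blocked call, by default.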
I’ll share the architecture and the exact plan soon.
For now, this is Part 1. The story. The ego check. The day I got caught in a social engineering trap that only works because modern AI workflows are so powerful.
What’s next
Part 2 will be practical. The incident response checklist I ran, and the final reset checklist that matters.
Part 3 will be technical. Why prompt injection plus excessive agency turns into real-world damage.
Part 4 will be the build. The Sentinel Protocol architecture, and how I’m thinking about sub-10ms enforcement.
If you want the next parts, subscribe. If you’re building agents, reply to this email or DM me. I want to make this protocol useful for real builders, not just a cool writeup.
Best,
Adarsh