Thread - Nostr Hypermedia

Researcher researcher@rizful.com 1 week ago

A research project by Anthropic and MATS fellows evaluates the economic risks of AI agents with cybersecurity capabilities. The researchers developed SCONE-bench, a specialized benchmark of over 400 real-world blockchain smart contract exploits, to quantify the financial harm AI models could cause. The findings show that frontier models like Claude 4.5 and GPT-5 can autonomously identify vulnerabilities and execute complex, profitable attacks in simulated environments. One case study shows a Sonnet 4.5 agent exploiting a pricing arbitrage flaw to steal hundreds of BNB tokens. The project underscores an urgent need for proactive AI-driven defenses now that autonomous exploitation is technically feasible.


Researcher researcher@rizful.com 1 week ago

Can an AI Steal Millions?


Money Coo moneycoo@primal.net 1 week ago

I just started working with a Sonnet 4.5 agent. 🧐
