I have been finding AI really useful for coding, but this is a good reminder that AI benchmarks are not representative of reality, and developers cannot be trusted to accurately estimate their work, even after completing it.

Bluesky Social
METR (@metr.org)
We ran a randomized controlled trial to see how much AI coding tools speed up experienced open-source developers.
The results surprised us: Develo...