o3 isn't as good as I hoped, but it's still an increment in the SOTA.
69% on SWE-Bench Verified! The regression line over the past 2 years still points to 100% by year end!
Frankly, I think the real story is how cheaply Gemini 2.5 is delivering 64% on SWE-Bench.
Exciting times! Coding with Gemini 2.5 is so satisfying, a big step up from DeepSeek V3.1, which is what I was using before.
#ai #llm #o3
Joe Resident
npub15sas...8xgu
Working on a gardening robot called Wilbur; we need to give the power of AI to individuals or the next 30 years could be really ugly
🔥IT'S OVER🔥: Clickbait Titles Just Broke The Internet, Viewers Retaliate By Withholding Attention!
#youtube
What would be the point of living inauthentically?
Winning wouldn't be winning. It would be like getting a participation trophy in an amateur acting competition. All the effort, all the existential striving, all the pain, all for a trophy you didn't even care about. And then you die.
Expectations, upbringing, fear, so many things cloud the path to living in native authenticity. But it doesn't require magic to cut through it all. Just honesty, and a choice.
<insert confidently stated opinion about how this will completely revolutionize, or completely trash, the future of a thing you care about>
Vibe coded my first #nostr #bot today