Replies (6)

Anton 1 week ago
We need sovereign benchmarks that remain novel over time and/or are updated after they become training data.
It would make sense to squeeze a model as much as possible. The early beta-testers get plenty of compute - maybe some extra "thinking" - and later, when the floodgates open, the model gets tuned down to just barely satisfy most users. It's hard to test what's going on when not even the provider knows why an LLM produces the reply it does.