Interesting paper I hadn't seen, the 'Densing Law' of LLMs: models are getting twice as capable at the same size every 3.3 months.
Qwen 3, released today, may be an emphatic continuation of the trend. I need to play with the models more to verify, but the benchmark numbers are... staggering. Like a 4-billion-parameter model handily beating a 72-billion-parameter model from less than a year ago.
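
A quick back-of-the-envelope check on what that doubling rate would imply for the 72-billion vs 4-billion comparison. This is only a sketch: it assumes "twice as capable at the same size" can be read as "same capability at half the parameters", and the function name is made up for illustration.

```python
import math

# Doubling period quoted above, in months.
DOUBLING_MONTHS = 3.3


def months_for_size_ratio(big_params: float, small_params: float,
                          doubling_months: float = DOUBLING_MONTHS) -> float:
    """Months the trend would need for a small model to match a big one.

    Assumption: each doubling halves the parameter count needed for a
    given capability level.
    """
    doublings = math.log2(big_params / small_params)
    return doublings * doubling_months


# 72B -> 4B is an 18x reduction, i.e. a bit over four doublings.
months = months_for_size_ratio(72e9, 4e9)
print(f"{months:.1f} months")
```

Under those assumptions the trend alone would take roughly 13.8 months to close an 18x size gap, so doing it in under a year would be slightly ahead of the curve, if the benchmark numbers hold up.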

Densing Law of LLMs
Qwen
Qwen3: Think Deeper, Act Faster
Introduction Today, we are excited to announce the release of Qwen3, the latest additi...
