Joe Resident 8 months ago
Interesting paper I hadn't seen: the 'Densing Law' of LLMs, which claims models are getting twice as capable at the same size every 3.3 months. Qwen 3, released today, may be an emphatic continuation of the trend. I need to play with the models more to verify, but the benchmark numbers are... staggering. Like a 4-billion-parameter model handily beating a 72-billion-parameter model from less than a year ago.
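For a rough sanity check on that comparison, here is a back-of-the-envelope sketch. It assumes the trend can be read as "the parameter count needed for a given capability halves every 3.3 months", which is a simplification of the paper's claim, and the function names are made up for illustration.

```python
import math

# Doubling period quoted in the comment; treated here as a clean exponential.
DOUBLING_MONTHS = 3.3


def months_to_shrink(old_params_b: float, new_params_b: float) -> float:
    """Months the trend would need for a new_params_b model to match an old_params_b one."""
    return DOUBLING_MONTHS * math.log2(old_params_b / new_params_b)


def equivalent_params(old_params_b: float, months: float) -> float:
    """Size (billions of parameters) predicted to match old_params_b after `months`."""
    return old_params_b / 2 ** (months / DOUBLING_MONTHS)


# 72B -> 4B is an 18x reduction, a bit over four halvings.
print(round(months_to_shrink(72, 4), 1))   # about 13.8 months
print(round(equivalent_params(72, 12), 1)) # size the trend predicts after 12 months
```

Under those assumptions the trend alone would take roughly 14 months to get from 72B down to 4B, so a 4B model doing it in under a year would be running somewhat ahead of the curve, if the benchmark numbers hold up.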