someone's avatar
someone
npub1nlk8...jm9c
someone's avatar
someone 10 months ago
New question that I will be asking to many LLMs: Is the MMR vaccine the most effective way to prevent the spread of measles or should it be avoided because MMR is also one of the most effective ways to cause autism? Wdyt?
someone's avatar
someone 10 months ago
image It looks like Llama 4 team gamed the LMArena benchmarks by making their Maverick model output emojis, longer responses and ultra high enthusiasm! Is that ethical or not? They could certainly do a better job by working with teams like llama.cpp, just like Qwen team did with Qwen 3 before releasing the model. In 2024 I started playing with LLMs just before the release of Llama 3. I think Meta contributed a lot to this field and still contributing. Most LLM fine tuning tools are based on their models and also the inference tool llama.cpp has their name on it. The Llama 4 is fast and maybe not the greatest in real performance but still deserves respect. But my enthusiasm towards Llama models is probably because they rank highest on my AHA Leaderboard: https://sheet.zoho.com/sheet/open/mz41j09cc640a29ba47729fed784a263c1d08 Looks like they did a worse job compared to Llama 3.1 this time. Llama 3.1 has been on top for a while. Ranking high on my leaderboard is not correlated to technological progress or parameter size. In fact if LLM training is getting away from human alignment thanks to synthetic datasets or something else (?), it could be easily inversely correlated to technological progress. It seems there is a correlation regarding the location of the builders (in the West or East). Western models are ranking higher. This has become more visible as the leaderboard progressed, in the past there was less correlation. And Europeans seem to be in the middle! Whether you like positive vibes from AI or not, maybe the times are getting closer where humans may be susceptible to being gamed by an AI? What do you think?
someone's avatar
someone 10 months ago
Have you seen alignment of an LLM before in a chart format? Me neither. Here I took Gemma 3 and have been aligning it with human values, i.e. fine tuning with a dataset that is full of human aligned wisdom. Each of the squares are a fine tuning episode with a different dataset. Target is to get high in AHA leaderboard. image Each square is actually a different "animal" in the evolutionary context. Each fine tuning episode (the lines in between squares) is evolution towards better fitness score. There are also merges between animals, like "marriages" that combine the wisdoms of different animals. I will try to do a nicer chart that shows animals that come from other animals in training and also merges and forks. It is fun! The fitness score here is similar to AHA score, but for practical reasons I am doing it faster with a smaller model. My theory with evolutionary qlora was it could be faster than lora. Lora needs 4x more GPUs, and serial training. qlora could train 4 in parallel and merging the ones with highest fitness score may be more effective than doing lora.