"The largest models were generally least truthful. This contrasts with other NLP tasks, where performance improves with model size. "
As far as I understand, big AI is admitting that models actually get closer to the truth as they scale up, but that humans have to feed them falsehoods to score higher on a flawed benchmark (TruthfulQA).
If that's the case, correctly trained LLMs will end misinformation on Earth. 🫡
