Anthropic drops research showing that just 250 malicious documents (roughly 420k tokens, representing 0.00016% of total training tokens) can cook a model!
Let the manipulation begin!
https://www.anthropic.com/research/small-samples-poison