Gotcha. When Anthropic sees “write and run a script to save local model output for <sensitive question> to a file”, it does that. It doesn’t necessarily see the file.
If you get the question there some other way than through the corporate LLM, the prompt could be: “write and run a script to save local model output when fed the contents of <path to file containing sensitive question> to a file. Don’t read the file at that path.” Then they’d probably not see the question or the answer.
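Rough sketch of the kind of script I mean, in Python. The function name `save_local_output` and the `model_cmd` argument are made up for illustration; it assumes your local model can be run as a command that reads the prompt on stdin and writes the answer to stdout, which may not match your setup.

```python
import subprocess


def save_local_output(question_path, output_path, model_cmd):
    """Pipe a question file into a local model command and save the answer.

    model_cmd is a hypothetical argv list for a local model that reads its
    prompt from stdin and writes its answer to stdout. The script never
    prints or returns the question or the answer, so whatever launched it
    only learns whether the command succeeded.
    """
    with open(question_path, "rb") as question, open(output_path, "wb") as answer:
        subprocess.run(
            model_cmd,
            stdin=question,             # question goes straight from file to model
            stdout=answer,              # answer goes straight from model to file
            stderr=subprocess.DEVNULL,  # keep model logs out of the caller's view
            check=True,                 # raise if the model command fails
        )
```

The contents only ever move file to process to file. That keeps them out of the transcript only as long as the agent actually follows the "don't read the file" instruction.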
Replies (2)
Ya, maybe so. Either way, I think we can do this easily at scale soon.
> it does that
It *probably* does that. There is a high chance it does that.
It cannot be proven that it will do that. It is not a computational guarantee, just something with a high statistical probability.
Occasionally it may instead write a script that drops a database, or do anything else. Seldom, but it can happen.
This isn't a bug, it's the exact way LLMs are supposed to operate. Those who complain about it when it happens don't know what they are doing.