Reading messages can hardly be avoided, but smuggling backdoor tool invocations into LLM replies is even scarier, though probably mitigatable.

Replies (1)

Always run in sandboxes. Even without any backdoored tool invocations, LLMs are notorious for screwing things up, like leaking private keys or deleting data. We're also working on mitigating man-in-the-middle attacks. It's an interesting problem.
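One cheap layer on top of the sandbox is refusing any tool call the model was never granted. A minimal sketch of that idea; the tool names and the `approve_call` helper here are made up for illustration, not a real API:

```python
# Hypothetical allowlist gate: every model-proposed tool call is
# checked before anything executes inside the sandbox.
ALLOWED_TOOLS = {
    "read_file": {"path"},      # permitted argument names per tool
    "search_docs": {"query"},
}

def approve_call(tool: str, args: dict) -> bool:
    """Reject unknown tools and any unexpected argument that could
    smuggle in extra behavior."""
    allowed_args = ALLOWED_TOOLS.get(tool)
    if allowed_args is None:
        return False                  # unknown tool: possible backdoor
    return set(args) <= allowed_args  # no surprise parameters

# A reply proposing `delete_data` never reaches execution:
print(approve_call("delete_data", {"path": "/"}))        # False
print(approve_call("read_file", {"path": "notes.txt"}))  # True
```

This doesn't replace the sandbox, it just shrinks the blast radius before the sandbox even matters.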