LLMs are basically massive encode-transform-decode pipelines. They cannot think, but they can process data very well, and in this case data that cannot be put into a strict set of rules. "Reasoning" in LLMs is nothing more than the difference between combinational and sequential logic: it adds a temporary workspace and data store, which is the chain of thought.
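The combinational/sequential analogy can be sketched in a few lines: a combinational circuit is a fixed-depth function of its input, while a sequential circuit iterates that same function with carried state. The function below is an arbitrary toy, not anything from a real model; the point is only that the iterated version exposes intermediate values, the way a chain of thought does.

```python
# Combinational: a fixed-depth function of the input, no memory.
def combinational(x):
    return (x * 3 + 1) % 16

# Sequential: the same fixed "circuit", iterated with carried state.
# The trace of intermediate values plays the role of the chain of thought:
# extra workspace that exists only because the computation unrolls over steps.
def sequential(x, steps):
    trace = []
    for _ in range(steps):
        x = combinational(x)
        trace.append(x)
    return x, trace

print(sequential(5, 3))  # final value plus the visible intermediate state
```

A single combinational pass is bounded by its fixed depth; the sequential version can trade steps for depth, which is the analogy the post is drawing.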

Replies (1)

What I think is happening is that in the "middle" of the layer stack, models form a temporary workspace to transform data. Yet that workspace is finite and affected by the generated tokens, so it is unstable in a way: it shifts the more the model outputs. And behind every token produced is a fixed amount of FLOPs, so you can only fit so much processing per token, and almost all of it gets discarded except what becomes part of the response. The chain of thought is more flexible and can encode far more per token than a response can, since it carries no expectation of format.

It would be interesting to see the effects of adding a set of reserved tokens to the LLM and allowing them only in reasoning. This also crossed my mind for instructions, as a way to separate data from instructions in the input. You have to teach two "languages," so to speak (data and instructions), while preventing them from being correlated, even though they are identical except for the tokens.
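As a minimal sketch of the reserved-token idea: extend a vocabulary with tokens that never occur in ordinary text, so the model is free to assign them meaning only inside the chain of thought. The class and token names here (`ScratchTokenizer`, `<scratch_k>`) are hypothetical; with a real stack like Hugging Face `transformers`, the analogous step would be `tokenizer.add_special_tokens(...)` followed by resizing the model's embedding matrix.

```python
# Toy illustration (hypothetical names): a small vocabulary extended with
# reserved "scratch" tokens that have no surface form in normal text.
class ScratchTokenizer:
    def __init__(self, base_vocab, n_scratch=4):
        self.vocab = {tok: i for i, tok in enumerate(base_vocab)}
        # Reserved tokens: never produced when encoding ordinary text,
        # so training could let the model repurpose them as workspace
        # markers that appear only in reasoning traces.
        self.scratch = []
        for k in range(n_scratch):
            tok = f"<scratch_{k}>"
            self.vocab[tok] = len(self.vocab)
            self.scratch.append(tok)

    def encode(self, tokens):
        return [self.vocab[t] for t in tokens]

tok = ScratchTokenizer(["the", "cat", "sat"], n_scratch=2)
print(tok.encode(["the", "<scratch_0>", "cat"]))
```

The same mechanism would serve the data/instruction split: one reserved token family delimits instructions, another delimits data, and the two span types are kept disjoint by construction rather than by convention.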