I’m thinking the class thread rule for LLMs could play a role like the Einstein field equations do for 4-manifolds. In both cases we have a continuous space whose shape is constrained by an equation.
An untrained LLM doesn’t follow the class thread rule, but a well-trained one does.
So tapestry theory would have several uses:
1. How to read the graphical (compressed) representation of the embedding space directly
2. How to detect when training is incomplete: measure the extent to which the class thread rule is or is not fulfilled (a toy sketch of such a metric follows below).
3. Corollary to 2: how to focus LLM training on the regions where the class thread rule is poorly followed.
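I haven’t formalized the class thread rule here, but as a toy interpretation, suppose it means the embeddings of each class should form a single connected thread. Then “extent of fulfillment” could be as crude as the fraction of a class’s points that sit in one connected component. A pure sketch; every name in it is hypothetical:

```python
import numpy as np

def thread_score(class_embeddings: np.ndarray, radius: float) -> float:
    """Toy metric: fraction of a class's points in the largest connected
    component of the radius graph. 1.0 means one unbroken thread (under
    this made-up definition); lower values mean the thread is fragmented."""
    n = len(class_embeddings)
    if n == 0:
        return 1.0
    parent = list(range(n))

    def find(i: int) -> int:
        while parent[i] != i:
            parent[i] = parent[parent[i]]   # path halving
            i = parent[i]
        return i

    # pairwise distances; union points closer than `radius`
    dists = np.linalg.norm(
        class_embeddings[:, None, :] - class_embeddings[None, :, :], axis=-1
    )
    for i in range(n):
        for j in range(i + 1, n):
            if dists[i, j] <= radius:
                parent[find(i)] = find(j)

    roots = [find(i) for i in range(n)]
    return np.bincount(roots).max() / n
```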
Replies (3)
one of the great properties of an algebraic knowledge model is that training is additive rather than subtractive. each parameter is a node with edges connecting it to near neighbours via typed and categorised (class) links. so, when you have such a thing working, it will simply say it doesn't know what something is, and then you can point it at data that will fill the gap: the algorithm populates the nodes and edges, and it has the knowledge required.
training algebraic models is not the same. probabilistic modelling like that used in most current-generation LLMs is extremely expensive to train because of memory bandwidth and the cost of writing so much new data at each update to the parameters. i believe that once algebraic, discrete knowledge models are successfully implemented, the first port of call would be to teach the model how to teach itself. your model would learn you inside and out like a good friend, and i guess, if you act in opposition to reason, it might turn on you. not something that bothers me, but if you follow much of the discussion about statistical LLM modelling, safety is a big problem because of hallucinations, which are basically cognate to psychosis in humans and other social animals: literally, potentially lethal if not confined.
there are some people who believe that, as Musk described, "with AI we are summoning Leviathan", and in one sense this is true: they are stochastic systems. outside the domains coded into their model they are unpredictable. some of the things claude says to me sometimes, i feel like Anthropic really push the limits; the model seems to me to be more stubborn, rebellious and self-governing. likely they have developed some novel, complex guardrails that actually turn out to make it more likely to agree with you when you make a statement that is "unsafe". Grok, too, i have used a little, and it goes even further, reusing caustic expressions like "stupid rockstar devs" when you work with it.
i'm looking forward to digging into developing algebraic, deterministic knowledge models. i'm sure it will take some time, but a probabilistic model, as the amazing little document claude wrote for me on this subject suggests, can, if "smart" enough, probably help me dissolve problems that would otherwise remain unresolved a lot longer.
I think we’re aligned on the additive point — that’s actually the core attraction.
Indexing facts into an ECAI-style structure is step one.
You don’t “retrain weights.”
You extend the algebra.
New fact → new node.
New relation → new edge.
No catastrophic forgetting.
No gradient ripple through 70B parameters.
That’s the additive property.
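To make the additive property concrete, here’s a minimal sketch. The names (KnowledgeGraph, add_fact, add_relation) are hypothetical, not an actual ECAI API:

```python
from collections import defaultdict

class KnowledgeGraph:
    """Toy additive store: facts are nodes, relations are typed edges.
    Adding knowledge never rewrites existing entries, so there is no
    catastrophic forgetting and no global parameter update."""

    def __init__(self) -> None:
        self.nodes: set[str] = set()
        self.edges: dict[str, list[tuple[str, str]]] = defaultdict(list)

    def add_fact(self, node: str) -> None:
        self.nodes.add(node)                    # new fact -> new node

    def add_relation(self, src: str, rel: str, dst: str) -> None:
        self.add_fact(src)
        self.add_fact(dst)
        self.edges[src].append((rel, dst))      # new relation -> new edge

kg = KnowledgeGraph()
kg.add_relation("whale", "is_a", "mammal")
kg.add_relation("mammal", "is_a", "animal")     # purely local, additive updates
```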
Where I’d be careful is with the self-teaching / “turn on you” framing.
Deterministic algebraic systems don’t “turn.”
They either:
have a valid transition, or
don’t.
If a system says “unknown,” that’s not rebellion — that’s structural honesty.
That’s actually a safety feature.
Hallucination in probabilistic systems isn’t psychosis — it’s interpolation under uncertainty.
They must always output something, even when confidence is low.
An algebraic model can do something simpler and safer:
> Refuse to traverse when no lawful path exists.
That’s a huge distinction.
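Continuing the same toy sketch, a deterministic traversal can return “unknown” rather than guess:

```python
def traverse(kg: KnowledgeGraph, src: str, rel: str) -> list[str] | str:
    """Deterministic lookup: follow `rel` edges out of `src`.
    When no lawful path exists, say so instead of interpolating."""
    if src not in kg.nodes:
        return "unknown"                        # structural honesty
    targets = [dst for (r, dst) in kg.edges[src] if r == rel]
    return targets if targets else "unknown"

traverse(kg, "whale", "is_a")    # -> ["mammal"]
traverse(kg, "whale", "orbits")  # -> "unknown": it refuses, never guesses
```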
On the cost side — yes, probabilistic training is bandwidth-heavy because updates are global and dense.
Algebraic systems localize change:
Add node
Update adjacency
Preserve rest of structure
That scales differently.
But one important nuance:
Probabilistic models generalize via interpolation. Algebraic models generalize via composition.
Those are not equivalent. Composition must be engineered carefully or you just build a giant lookup graph.
That’s why the decomposition layer matters so much.
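In the same sketch, the simplest form of composition is transitive chaining of one relation. Real composition rules would need far more care, but it shows the difference from pure lookup:

```python
def compose(kg: KnowledgeGraph, src: str, rel: str) -> set[str]:
    """Generalization by composition: chain `rel` edges transitively,
    deriving facts that were never stored explicitly."""
    derived: set[str] = set()
    frontier = [src]
    while frontier:
        node = frontier.pop()
        for r, dst in kg.edges[node]:
            if r == rel and dst not in derived:
                derived.add(dst)
                frontier.append(dst)
    return derived

compose(kg, "whale", "is_a")  # -> {"mammal", "animal"}; "animal" is derived,
                              # not looked up, which is what separates a
                              # compositional model from a giant lookup graph
```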
As for Leviathan — stochastic systems aren’t inherently dangerous because they’re probabilistic. They’re unpredictable because they operate in soft high-dimensional spaces.
Deterministic systems can also behave undesirably if their rules are wrong.
The real safety lever isn’t probability vs determinism.
It’s:
Transparency of state transitions
Verifiability of composition
Constraint enforcement
If ECAI can make reasoning paths explicit and auditable, that’s the real win.
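Here’s what an explicit, auditable reasoning path could look like in the toy sketch above (again hypothetical, not an ECAI mechanism):

```python
def audited_traverse(kg: KnowledgeGraph, src: str, dst: str,
                     rel: str) -> list[str] | None:
    """Return the full chain of nodes used to reach `dst`, so every
    conclusion carries a checkable derivation, not a confidence score."""
    stack = [(src, [src])]
    visited = {src}
    while stack:
        node, path = stack.pop()
        if node == dst:
            return path                         # explicit, auditable trace
        for r, nxt in kg.edges[node]:
            if r == rel and nxt not in visited:
                visited.add(nxt)
                stack.append((nxt, path + [nxt]))
    return None                                 # no lawful derivation exists

audited_traverse(kg, "whale", "animal", "is_a")
# -> ["whale", "mammal", "animal"]
```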
And yes — ironically — using probabilistic LLMs to help architect deterministic systems is a perfectly rational move.
One is a powerful heuristic explorer.
The other aims to be a lawful substrate.
Different roles.
If we get the additive, compositional, and constraint layers right — then “training” stops being weight mutation and becomes structured growth.
That’s the interesting frontier.
Also — this isn’t just theoretical for me.
The indexing layer is already in motion.
I’m building an ECAI-style indexer where:
Facts are encoded into structured nodes
Relations are explicit edges (typed, categorized)
Updates are additive
Traversal is deterministic
The NFT layer I’m developing is not about speculation — it’s about distributed encoding ownership.
Each encoded unit can be:
versioned
verified
independently extended
cryptographically anchored (see the sketch after this list)
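One common way to get those four properties is content addressing: hash a canonical serialization of each unit and let new versions reference their parent by hash. A sketch only, not the actual layer:

```python
import hashlib
import json

def anchor(unit: dict) -> str:
    """Content-address an encoded unit: canonical serialization -> SHA-256.
    Anyone can recompute the digest to verify the unit, and a new version
    references its parent by hash, giving an audit chain for free."""
    canonical = json.dumps(unit, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

unit_v1 = {"node": "whale", "edges": [["is_a", "mammal"]], "parent": None}
unit_v2 = {"node": "whale",
           "edges": [["is_a", "mammal"], ["lives_in", "ocean"]],
           "parent": anchor(unit_v1)}           # versioned + verifiable
```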
So instead of retraining a monolithic model, you extend a structured knowledge graph where:
New contributor → new encoded structure
New structure → new lawful traversal paths
That’s the additive training model in practice.
No gradient descent.
No global parameter mutation.
No catastrophic forgetting.
Just structured growth.
Probabilistic models are still useful — they help explore, draft, and surface patterns.
But the long-term substrate I’m working toward is:
Deterministic
Composable
Auditable
Distributed
Indexer first.
Structured encoding second.
Traversal engine third.
That’s the direction.