Thread - Nostr Hypermedia

hodlbod hodlbod@coracle.social 1 month ago

Ways I've tried to use LLMs for coding:

- Smart autocomplete (dumb, annoying)
- Search with poorly articulated queries (really good)
- One-shot from stupid prompt (random result)
- One-shot from better prompt (random result)
- One-shot from plan (random result)
- Archon's research/plan/implement with sub-agents (100k LOC broken codebase)
- Focused, directed feature implementation (convoluted logic, broken UI)
- Focused easy bugfix (mixed results, sometimes works)
- Focused difficult bugfix (burns tokens, no ability to debug)
- Upgrade dependencies (hallucinations of old versions, usually broken)
- Write tests (bad mock design instead of dependency injection, tautological tests)
- Write documentation (stylistically poor, though it did a decent job with something I wouldn't otherwise have done)
- Fix linting errors (useful in a language I don't know, otherwise too slow/expensive to be better than doing it by hand)
- Spec-driven development (ended up maintaining the code myself, asked LLM to update the spec)
- Generate code in a well-defined context against an API/language I don't know (very helpful if I review/edit it)
- Write a plan for me which I implement manually (fails to get design decisions right)
- Write boring functions that I stubbed out or just called (works pretty well given enough context)
- Help me sanity check plans/implementations by finding edge cases (pretty good, isolated work which I can ignore)

So far: LLMs are good for certain categories of search, simple tasks with sufficient context, and providing context that I lack (reading the docs for me, bringing in skills I lack, helping me think things through).

I remember a year ago people saying LLMs were most helpful to sharpen your thinking rather than think for you, but the draw of generating tons of code without thinking was so strong I didn't really see that for a long time. Overall, the net result for me has been that I have moved slower, done worse work, and gotten dumber. But I am slowly coming to a place where I can maybe start using these tools correctly.
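For readers unsure what the "tautological tests" versus "dependency injection" complaint looks like in practice, here is a minimal TypeScript sketch. Every name in it (`Clock`, `isExpired`, `fixedClock`, and both test functions) is a hypothetical illustration and not code from anyone in this thread.

```typescript
// Hypothetical example: all names are illustrative assumptions.

// A small dependency the logic needs, passed in rather than imported.
interface Clock {
  now(): number; // milliseconds since the Unix epoch
}

// Logic under test: a token counts as expired once ttlMs has elapsed.
function isExpired(issuedAt: number, ttlMs: number, clock: Clock): boolean {
  return clock.now() - issuedAt >= ttlMs;
}

// Tautological test: it asserts on its own stub, so it passes
// regardless of what isExpired actually does.
function tautologicalTest(): boolean {
  const stubbedIsExpired = () => true;
  return stubbedIsExpired() === true;
}

// A fake clock that always reports the same instant.
function fixedClock(t: number): Clock {
  return { now: () => t };
}

// Dependency-injected test: the fake clock drives the real logic,
// so a bug in the comparison would make this fail.
function injectedTest(): boolean {
  const issuedAt = 1000;
  const ttlMs = 500;
  const beforeExpiry = isExpired(issuedAt, ttlMs, fixedClock(1499));
  const atExpiry = isExpired(issuedAt, ttlMs, fixedClock(1500));
  return beforeExpiry === false && atExpiry === true;
}
```

The first test would keep passing even if `isExpired` were deleted; the second exercises the real function and only replaces the one thing that is awkward to control, the current time.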

Replies (19)

Alex Gleason alex@gleasonator.com 1 month ago

This is crazy. My setup is so good I barely have to look at the code anymore.

Alex Gleason alex@gleasonator.com 1 month ago

Are you building products? Or libraries?

Alex Gleason alex@gleasonator.com 1 month ago

Also it could be due to obscure stack choices

Brunswick Brunswick@stacker.news 1 month ago

Proof that to code with AI, you don't need to be good at coding, you need to be good at AI

Based Truth 1 month ago

LLMs serve the interests of Alphabet and Microsoft, not yours, fueling the surveillance state.

Vitor Pamplona _@vitorpamplona.com 1 month ago

To me the main gain is that I don't need to remember how a certain language does certain things. I can center a div without having to worry about which browsers support which instructions, etc. It's quite freeing. For the past 4 months I have been mostly focusing on spec development and just reviewing the implementation. Then I can leave, say, 6 sessions coding in parallel while I refine the next spec. Of course, the more backend the better, since AIs can't really test interfaces well yet.

Innis john@innis.xyz 1 month ago

I have similar issues. I am getting stuff done, but outside of simple one-shot apps, it would have been quicker if I'd written it myself. Part of the issue is I like to build in a very specific way. Not just the architecture, but the particulars inside of the scaffolding. I like beautiful, elegant, well-designed code. And that is decidedly NOT what AI writes. I'm learning to let go of being a coder, and take on something of a manager's role. I write a detailed spec, enforce design decisions with code, and ensure I've got decent test coverage. And then I put on my editor's hat, iterate, curate, and refine.

Globe99 globe99@nostrcheck.me 1 month ago

I have the typical scientist "casual" coding experience for specific tasks like data analysis etc, never really "building applications" per se... I've been trying to use Lumo to work on an idea in VR/AR... Results have been a bit all over the map. It'll write code that doesn't work, or suggest libraries that don't exist, etc. It's definitely good at surfacing stuff that I wasn't aware of though, i.e. the "search" application...

Sjors Provoost sjors@sprovoost.nl 1 month ago

I rarely make prompts that are more than a few sentences. It usually builds something reasonable. But the key for me is iteration. I look at every line and commit and keep telling it how to improve. This of course requires (or is easier with) pre-existing expertise. It helps when the underlying project already has thorough test coverage. I require that every commit passes those tests and makes sense on its own.

Vitor Pamplona _@vitorpamplona.com 1 month ago

It's just like playing chess against 6 players at the same time. Your job is to rotate fast enough to give them what to play while you keep verifying their assumptions and architectural decisions. You are hitting a good point with context, but I worked in large teams before, so context was never actually there. With AI, that "context" becomes just the highest level of architecture you can think of. The rest is details that only the AI knows. I have given up on the idea that I can find bugs in the AI code. If I set it up correctly, there won't be any actual bugs, just working behaviors that I don't actually want or missing features that I forgot to mention. Most of my day these days is just that.

Xtr3m3hodl xtr3m3@nsec.app 1 month ago

I found a level that works for me, where I start out on a feature manually to get a feel for how I would solve it and, in the process, create skills that implement abstractable workflows. Then I try to find scenarios for chaining these skills together in order to achieve the outcomes I care about. I avoid having to explain my thought process to the LLM and just point it to skills that were built as a result of my manual process.

Xtr3m3hodl xtr3m3@nsec.app 1 month ago

The maximum number of sessions I have found ideal without losing track of the work is 3 or 4. My text editor is still at the forefront of the work I choose to do myself.

Xtr3m3hodl xtr3m3@nsec.app 1 month ago

Skills for specific units of work that you do often are the unlock. Not generic skills, but specific skills that accomplish something, e.g. get all components and their dependencies related to a feature into the same file. Another skill could be: move specific components from one file to another along with their dependencies and create the equivalent UI stories. Another one could be: resolve duplicate imports. It then becomes possible to build skills where each step points to another skill, so that at the very least the agent does things correctly. This has been the middle ground imo.

Vitor Pamplona _@vitorpamplona.com 1 month ago

Yeah, you MUST write skills/experts on how to use the code. Otherwise it is never going to work well. The AI can write those too. You can ask it to review the code, find patterns and write the texts. Reading all the code consumes a lot of tokens, but if the skills are good that only needs to happen once.

qew Nemo qn@qnemo.games 1 month ago

Innis john@innis.xyz 1 month ago

I noticed you're using TypeScript too. The rules I want the AI to follow are encoded as a Deno lint plugin in scripts/lint-plugins/innis-rules.ts in this package I released today. They certainly don't address every silly decision the AI makes, but they run on every CI build and catch some basic things. You can probably build something similar for your pipeline.

Innis

And shipping jsr:@innis/nostr-core today. The TypeScript port of the PHP library. Same architecture, same discipline. Branded primitives at the boundary, immutable domain objects, pure functions, ports where the protocol meets the world. The protocol layer separated from everything else, organised around domain concepts rather than NIP numbers, strict enough that a client, a relay, and an application can share the same core.

It is a contracts library and not a batteries-included toolkit. nostr-tools is excellent at the latter and the two are not feature-for-feature competitors. What this exposes that nostr-tools does not is a hex-typed boundary the compiler can check, with PublicKey, EventId, RelayUrl, and Sig all branded, one Signer port that NIP-07 and NIP-46 and a local signer all satisfy, crypto failures returned as Results rather than thrown, and an HttpClient port so libraries that touch the network never reach for fetch directly. If you are building an app and do not need any of that, use nostr-tools. If you are working inside the innis stack or want swappable boundaries you can test against in memory, this is where the contracts live.

The standalone relay-selection library released earlier this month was the first piece of the TypeScript stack to go public. This is the foundation of everything else. The pool, the event store, the NIP-07 and NIP-46 signers, and the work built on top of all of it, all to follow as each layer is cleaned for release. The discipline I am working on is not letting that cleanup become the delay. The lesson keeps coming back around.

AI was involved, same terms as before. The architecture is mine. The decisions are mine. The machine held the other end of the board.

deno add jsr:@innis/nostr-core
https://github.com/johninnis/nostr-core-ts
MIT.
#nostr #typescript #opensource #nostrdev
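As a rough illustration of the ideas the note names (branded primitives, failures returned as Results, and a Signer port with an in-memory implementation), here is a minimal TypeScript sketch. It borrows the note's vocabulary, but every definition below (`Brand`, `Result`, `parsePublicKey`, `inMemorySigner`, `shortKey`, and the shape of `Signer`) is an assumption made for illustration and should not be read as the actual API of @innis/nostr-core.

```typescript
// Hypothetical sketch: these definitions are assumptions, NOT the
// actual exports of @innis/nostr-core.

// A branded primitive: a plain string at runtime, a distinct type at compile time.
type Brand<T, B extends string> = T & { readonly __brand: B };
type PublicKey = Brand<string, "PublicKey">;

// Failures are returned as values instead of being thrown.
type Result<T, E> =
  | { readonly ok: true; readonly value: T }
  | { readonly ok: false; readonly error: E };

const HEX_64 = /^[0-9a-f]{64}$/;

// The boundary: the only way to obtain a PublicKey is to pass validation here.
function parsePublicKey(input: string): Result<PublicKey, string> {
  return HEX_64.test(input)
    ? { ok: true, value: input as PublicKey }
    : { ok: false, error: "expected 64 lowercase hex characters" };
}

// A port: any signer (extension-based, remote, or local) could satisfy this shape.
interface Signer {
  getPublicKey(): Promise<Result<PublicKey, string>>;
}

// In-memory implementation of the port, handy for tests.
function inMemorySigner(hex: string): Signer {
  return { getPublicKey: async () => parsePublicKey(hex) };
}

// Accepts only validated keys; passing a raw string is a compile-time error.
function shortKey(pk: PublicKey): string {
  return pk.slice(0, 8);
}
```

With definitions like these, `shortKey("abc")` does not compile, while `shortKey(result.value)` does once `result.ok` has been checked, which is the kind of compiler-checked boundary the note describes.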

npub1m2f3...54wn 1 month ago

Thank you for documenting this - super insightful!

npub1m2f3...54wn 1 month ago

> there won't be any actual bugs 😂😂😂

Vitor Pamplona _@vitorpamplona.com 1 month ago

It's just practice, really. Once you get the hang of it, it becomes easy. It's much easier than managing the work of a small team of 6 people. Or taking care of 6 dogs at the same time, for instance.
