This slowness comes with a very pleasant upside: I'm able to keep up with the LLM's thinking process and its output, which strengthens the feeling of owning the result.
Sebastix
Quite slow, but fully local and zero cost.

Replies (6)

Emmanuel 3 weeks ago
One issue I have with local LLMs (using llama.cpp) is that I run out of VRAM for the context. Once that happens, the conversation ends and I have to start a new conversation with an empty context to continue. I haven't found a way to automatically evict some of the older context to make room, so that I can continue with at least part of it.
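One workaround is to trim the history on the client side before each request, rather than relying on the server. Below is a minimal sketch of a sliding window against llama-server's OpenAI-compatible endpoint; the default port, the token budget, and the 4-characters-per-token estimate are all assumptions for illustration, not exact values.

```python
# Client-side sliding window: before each request, drop the oldest
# non-system messages until a rough token estimate fits the budget.
import requests

API_URL = "http://localhost:8080/v1/chat/completions"  # llama-server's default port (assumed)
CTX_BUDGET = 3000  # tokens reserved for history; leave headroom for the reply (illustrative)

def estimate_tokens(messages):
    # Crude heuristic: ~4 characters per token. Good enough for trimming.
    return sum(len(m["content"]) for m in messages) // 4

def trim_history(messages):
    # Always keep the first message (the system prompt); evict oldest turns first.
    system, rest = messages[:1], messages[1:]
    while rest and estimate_tokens(system + rest) > CTX_BUDGET:
        rest.pop(0)
    return system + rest

def chat(messages, user_input):
    messages.append({"role": "user", "content": user_input})
    messages[:] = trim_history(messages)
    # llama-server serves whatever model it loaded; the "model" field is
    # required by the OpenAI schema but its value is not meaningful here.
    resp = requests.post(API_URL, json={"model": "local", "messages": messages})
    reply = resp.json()["choices"][0]["message"]["content"]
    messages.append({"role": "assistant", "content": reply})
    return reply

history = [{"role": "system", "content": "You are a helpful assistant."}]
print(chat(history, "Hello! Summarize context shifting in one sentence."))
```

Dropping whole turns from the front (instead of truncating mid-message) keeps the remaining context coherent, at the cost of the model forgetting the earliest exchanges.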
Empka 3 weeks ago
I've been considering a local setup better than my current one (an 8 GB Nvidia RTX 3070). How is the support for various tools/models on a non-CUDA setup? It might be cheaper for me to go Ryzen AI instead of one or two Nvidia GPUs.
Empka 3 weeks ago
Not using Nvidia hardware, that is.