What’s your budget? Ryzen AI Max+ 395 APUs offer UMA, which you’ll need to a decent model. I like Framework’s desktop offering. A bit more expensive than some chinesium builds, but you’re going to get solid firmware and driver support in linux - and that is king.
Set 1 or more up as an inference “appliance” and that’s all it does. Have everything else run on a different machine.
Stick with Ubuntu Server to start with - just easier support. Go ROCm + llama.cpp first, then fall back to vulkan if there’s issues. Can go Ollama when things are looking good.
I aim to build a Nix port once it’s all stable, making rebuilds of these “appliances” simple.
Login to reply
Replies (1)
So one of those is enough to get you started. It is well supported by AMD and there's even guides out there for how to ccluster 4 of them together (definitely ad a later phase).
Stay away from Mac minis. Its a good toy, but you lose a good bit of memory to osx and your limited in config options. If you want a large-ish model, you're forced to cluster and that opens up a whole other can of worms.