Not yet. You can run quantized models and get OK performance on a good machine, but if you're looking to replace something like Claude, you're probably still a year or two out: even a high-end machine today is running models roughly a year or two behind the frontier.
I built a machine just for LLMs and it's good enough as a search-engine replacement, but way too slow for coding or other highly complex tasks.
I just got a beefy desktop from Framework (128GB VRAM), but I can't even get OK performance. I'm not sure if I'm expecting too much or if I'm just a n00b at self-hosted AI.