I tried various versions of "turboquant" from git; the only one that actually worked for me was the llama.cpp one. I've heard it can load even larger models.
Replies (1)
Did they merge support for that already?