Thorgalaeg
Wow, I managed to get a large language model with 70B parameters, Llama 2 70B, which is roughly equivalent to GPT-3.5, running on my PC fairly well. (Here 'large' means really large; it takes over 50 GB on my hard disk.)
Llama.cpp is magic. It uses a combination of RAM and VRAM; the more VRAM you have, the faster it runs, logically. On my system (3900ti + 1080ti) it is faster than ChatGPT during a mildly busy hour. But 70B is huge; obviously smaller models run faster. Thanks to llama.cpp anybody can run a Llama-based model on their own computer, even without a decent graphics card. It is amazing to think that a few months ago running any LLM at home was unthinkable, let alone one the size of ChatGPT.
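For anyone curious how the RAM/VRAM split works in practice, here is a minimal sketch using the llama-cpp-python bindings (pip install llama-cpp-python). The model filename and layer count are illustrative, not my exact setup; n_gpu_layers controls how many transformer layers get offloaded to VRAM, and whatever does not fit stays in system RAM:

# Minimal sketch, assuming a quantized model file downloaded locally.
# The path and layer count below are hypothetical; tune them to your hardware.
from llama_cpp import Llama

llm = Llama(
    model_path="./llama-2-70b-chat.Q4_K_M.gguf",  # hypothetical local file
    n_gpu_layers=40,  # layers offloaded to VRAM; the rest run from system RAM
    n_ctx=2048,       # context window size
)

output = llm("Q: Name the planets in the solar system. A:", max_tokens=64)
print(output["choices"][0]["text"])

Setting n_gpu_layers=0 runs entirely on CPU, which is why even machines without a decent graphics card can still run these models, just slower.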