Your best local LLM for low-VRAM (6GB)?

sp3ctre@feddit.org · 5 days ago

Denixen@feddit.nu · 5 days ago

My setup is a laptop with 8 GB VRAM and 16 GB RAM.

I have been using Ministral 3B (fast) and 14B (slower, but somewhat smarter and more capable) via Ollama. They work remarkably well considering how small they are.

I have been using them as a text translator, summarizer and assistant for discussing more basic things. I have also integrated them into PyCharm as a coding assistant, using the Ollama assist plugin.
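If anyone wants to try the summarizer use case outside an IDE, here is a minimal sketch of how it could look against Ollama's local HTTP API. It assumes Ollama is running on its default port (11434) and that the model tag is `ministral-3:3b`; that tag is my guess, so check `ollama list` for the name on your machine. The helper names `build_payload` and `summarize` are just made up for this example.

```python
import json
import urllib.request

# Default address of a locally running Ollama server.
OLLAMA_URL = "http://localhost:11434/api/generate"
# Assumed model tag (not confirmed); replace with whatever `ollama list` shows.
MODEL = "ministral-3:3b"


def build_payload(text: str, model: str = MODEL) -> dict:
    """Build a non-streaming /api/generate request asking for a short summary."""
    return {
        "model": model,
        "prompt": f"Summarize the following text in two sentences:\n\n{text}",
        "stream": False,  # ask for one JSON object instead of a token stream
    }


def summarize(text: str, model: str = MODEL, url: str = OLLAMA_URL) -> str:
    """Send the request to the Ollama server and return the generated text."""
    data = json.dumps(build_payload(text, model)).encode("utf-8")
    req = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req, timeout=120) as resp:
        # With "stream": False the reply is a single JSON object whose
        # "response" field holds the generated text.
        return json.loads(resp.read())["response"]
```

With the server running, `summarize("some long text")` should return the summary as a string. Changing the prompt in `build_payload` turns the same sketch into a translator.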

For autocomplete in PyCharm I have to use Llama 3.1 8B, since Ministral apparently cannot do autocomplete (?).

I can recommend Ministral. Mistral is really good at creating small distilled models that give a lot of bang for the parameter count.