A developer has published an open dataset on GitHub showing which local large language models fit within specific RAM and VRAM tiers, from 8GB to 128GB. The project addresses a common difficulty: working out which models will actually run on hardware such as a 16GB MacBook or an NVIDIA RTX 3060.
The dataset covers 62 models, with details on quantization, load size, and Ollama commands. It offers a rule of thumb: Q4_K_M models need roughly 0.6GB of memory per billion parameters, and a model should be sized to about 70% of available RAM so that the remainder is left for system overhead. For each tier the data gives a usable budget and a maximum parameter count: 8GB (~8B params), 16GB (~14B), 24GB (~27B), 32GB (~35B), 48GB (~47B), 64GB (~70B), and 128GB (~122B).
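The rule of thumb can be sketched in a few lines of Python. This is an illustration of the arithmetic only, not code from the dataset: the function names are hypothetical, and the 0.6GB-per-billion and 70% figures are the approximations quoted above. It ignores context length and KV cache, which add to real memory use.

```python
# Rough sizing check based on the rule of thumb quoted above.
# All names here are hypothetical; the constants are approximations.

GB_PER_BILLION_PARAMS_Q4_K_M = 0.6  # approx. memory per billion parameters at Q4_K_M
USABLE_FRACTION = 0.70              # share of RAM budgeted for the model itself


def usable_budget_gb(total_gb: float, usable_fraction: float = USABLE_FRACTION) -> float:
    """Memory (GB) available to the model after reserving system overhead."""
    return total_gb * usable_fraction


def estimated_load_gb(
    params_billion: float,
    gb_per_billion: float = GB_PER_BILLION_PARAMS_Q4_K_M,
) -> float:
    """Estimated load size (GB) of a Q4_K_M model with the given parameter count."""
    return params_billion * gb_per_billion


def fits(params_billion: float, total_gb: float) -> bool:
    """True if the estimated load size is within the usable budget for this tier."""
    return estimated_load_gb(params_billion) <= usable_budget_gb(total_gb)
```

Under these assumptions, `fits(70, 64)` is true (about 42GB against a 44.8GB budget), while `fits(70, 32)` is false. Each tier maximum listed above passes the same check, which suggests the published figures are consistent with the stated rule.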
The dataset is available under a CC BY license, with a JSON API for programmatic access, and covers Apple Silicon and consumer NVIDIA hardware. The author invites community contributions to correct errors or add missing models and quantizations.