Users share their workflows for coding with local LLMs when generation speed is below 10 tokens per second. Common strategies include writing concise prompts, keeping the context given to the model minimal, and batching queries so that each slow run does as much useful work as possible.
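To see why these strategies matter at this speed, here is a minimal sketch of the arithmetic and of a simple batch runner. It is an illustration under stated assumptions, not a method taken from the thread: `estimate_seconds`, `run_batch`, and the `generate` callable are hypothetical names, and `generate` stands in for whatever local inference call you actually use.

```python
from typing import Callable, Dict, List


def estimate_seconds(output_tokens: int, tokens_per_second: float) -> float:
    """Rough wall-clock time to generate a reply.

    Assumption: prompt processing time is ignored; only generation is counted.
    """
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return output_tokens / tokens_per_second


def run_batch(prompts: List[str], generate: Callable[[str], str]) -> Dict[str, str]:
    """Run queued prompts one after another and collect the replies.

    `generate` is a hypothetical stand-in for a local model call. Queuing
    prompts this way lets the waiting happen while you do something else.
    """
    return {prompt: generate(prompt) for prompt in prompts}


if __name__ == "__main__":
    # A 500-token reply at 10 tokens per second takes about 50 seconds.
    print(estimate_seconds(500, 10))

    # Dummy generator used only to demonstrate the batch runner.
    print(run_batch(["rename variable", "write docstring"], lambda p: p.upper()))
```

At 10 tokens per second, every 100 tokens of output costs about 10 seconds, so trimming a reply from 500 to 200 tokens saves roughly half a minute per query.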
Workflow for programmers with slow local LLM setup