Open weights
r/LocalLLaMA · 2d ago

Idea for running GLM2 at decent quant with GPU and DDR3 setup

The poster proposes running GLM2 at a reasonable quantization level on four 5060 Ti GPUs (64GB of VRAM in total) connected over PCIe Gen 3. The cards would sit in a server with 16 PCIe lanes split by 4x4 bifurcation, alongside 512GB of DDR3 RAM used to offload KV cache storage, aiming for efficient inference without relying on unified memory clusters. The poster estimates the whole setup at around $1700.
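The figures in this summary can be sanity-checked with a few lines of arithmetic. The sketch below is a back-of-envelope estimate, not a benchmark: it assumes 16GB per card (64GB split across four GPUs), x4 lanes per GPU after bifurcation, and the theoretical PCIe 3.0 rate of 8 GT/s per lane with 128b/130b encoding. The function and constant names are illustrative only, and real throughput will be lower once protocol overhead is included.

```python
# Back-of-envelope numbers for the proposed build.
# All constants are assumptions taken from the summary or from the
# theoretical PCIe 3.0 line rate; none of them are measurements.

NUM_GPUS = 4
VRAM_PER_GPU_GB = 16      # assumed: 64 GB total divided across 4 cards
LANES_PER_GPU = 4         # 16 lanes with 4x4 bifurcation
GEN3_GT_PER_S = 8.0       # PCIe 3.0 raw transfer rate per lane
ENCODING = 128 / 130      # 128b/130b line encoding efficiency


def pcie_gen3_gb_per_s(lanes: int) -> float:
    """Theoretical one-direction bandwidth in GB/s, before protocol overhead."""
    return GEN3_GT_PER_S * ENCODING / 8 * lanes


def transfer_seconds(size_gb: float, lanes: int = LANES_PER_GPU) -> float:
    """Best-case time to move size_gb between host RAM and one GPU."""
    return size_gb / pcie_gen3_gb_per_s(lanes)


total_vram_gb = NUM_GPUS * VRAM_PER_GPU_GB
per_gpu_link_gb_s = pcie_gen3_gb_per_s(LANES_PER_GPU)

print(f"Total VRAM: {total_vram_gb} GB")
print(f"Per-GPU link (Gen 3 x4): {per_gpu_link_gb_s:.2f} GB/s")
print(f"Best case to move 1 GB of KV cache: {transfer_seconds(1.0):.2f} s")
```

Under these assumptions each GPU sees a little under 4 GB/s to host memory, so how often KV cache data has to cross that link is likely to matter more than the raw amount of DDR3 available.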

Hugging Face Forums · 3d ago

Seeking Indic Document Datasets for AI/OCR Training in India

QuantVectors is seeking annotated document datasets in Indic languages from India: Hindi, Marathi, Gujarati, Bengali, Punjabi, Tamil, Urdu, Telugu, Odia, Kannada, Malayalam, and Assamese. The datasets should cover invoices, receipts, utility bills, payment advices, packing lists, commercial invoices, and credit notes, with approximately 400 documents per language and human-verified annotations at 99%+ accuracy. They must be commercially licensable, whether open-source or commercial, and the request is open to HuggingFace datasets, research datasets, and vendors specializing in this space.

Hugging Face Forums · 3d ago

Small-scale debug comparison of OLMo-core with Engram graft

A 200-step training comparison between a base OLMo3 600M model and a version with a DeepSeek-style Engram graft shows that the grafted model reaches lower training and evaluation loss, stabilizes its grad norm faster, and learns better early in training. The Engram graft, injected into layers 1 and 5, raises the trainable parameter count to ~1.7B but adds only about 40k active parameters per token, indicating efficient memory usage.
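To put the parameter figures in proportion, the short calculation below compares the parameters the graft adds with the ones it activates per token. It is only a sketch built on the rounded numbers in this summary (600M, ~1.7B, 40k), and it additionally assumes the base model is dense, so that all 600M parameters are active for every token. The variable names are illustrative.

```python
# Rough proportions for the Engram graft, using the rounded figures
# from the summary. Assumption: the base model is dense, i.e. all of
# its ~600M parameters are active for every token.

BASE_PARAMS = 600e6        # nominal size of the base model
GRAFTED_PARAMS = 1.7e9     # approximate trainable parameters after the graft
ACTIVE_INCREASE = 40e3     # extra active parameters per token

# Parameters the graft adds to the model as a whole.
added_params = GRAFTED_PARAMS - BASE_PARAMS

# Share of the added parameters that any single token actually touches.
active_share_of_added = ACTIVE_INCREASE / added_params

# Growth in per-token active parameters relative to the dense base.
relative_active_growth = ACTIVE_INCREASE / BASE_PARAMS

print(f"Added trainable parameters: {added_params / 1e9:.1f}B")
print(f"Share of added parameters active per token: {active_share_of_added:.4%}")
print(f"Growth in active parameters per token: {relative_active_growth:.4%}")
```

On these numbers the graft adds roughly 1.1B trainable parameters while each token touches well under a hundredth of a percent of them, which is the sparse-lookup pattern the summary describes.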