Open weights — korshunov.ai

Open weights Page 1 / 11

Multi-Agent Audit Framework for Clinical Mental Health Screening

A multi-agent audit framework improves clinical mental health screening by decomposing reasoning into perception, retrieval, inference, and audit stages. Evaluated on the DAIC-WOZ dataset, it reduces PHQ-8 depression severity prediction error from 5.35 to 5.02 and offers interpretable, verifiable diagnostic rationales.

media r/LocalLLaMA · 2d ago

Gemma 4's Potential to Outperform Mistral and Qwen3.6 Through Finetuning

Gemma 4 shows strong base performance and unique features like global MTP support, QAT, and out-of-the-box vision capabilities. While it currently lacks widespread finetunes, models like MeroMero, Equinox, and Gembrain have already demonstrated high quality, suggesting that with community effort, Gemma 4 could surpass Mistral or Qwen3.6 in specific tasks like coding and creative writing.

github llama.cpp · 2d ago

llama.cpp Release b9763 Adds ID to Tool Call Responses

llama.cpp version b9763 introduces an ID field in tool call responses. The release includes binaries for macOS, Linux, Android, Windows, and openEuler across multiple architectures and hardware acceleration options, with a UI component also available.

media r/LocalLLaMA · 2d ago

Idea for running GLM2 at decent quant with GPU and DDR3 setup

The user proposes using four 5060 Ti GPUs with 64GB VRAM total, running at PCIe Gen 3, to run GLM2 at a reasonable quantization level. They suggest adding 512GB of DDR3 RAM in a server with 16 PCIe lanes and 4x4 bifurcation to offload KV cache storage, aiming for efficient inference without relying on unified memory clusters. The setup is estimated to cost around $1700 total, with potential viability for GLM2 at a decent quant level.

media Don't Worry About the Vase · 3d ago

GLM-5.2 Is the New Best Open Model

GLM-5.2 achieves benchmark scores near frontier levels, matching Opus 4.7 in text-only tasks and ranking among the top open models on multiple tests. It is the strongest open model currently available, outperforming predecessors and rivals like GPT-5.5 and Fable, though it falls short on specialized benchmarks like anti-sycophancy and has limited vision capabilities.

media Hugging Face Forums · 3d ago

BenchHub Ships Major Update to Open Leaderboard Space

BenchHub has released a major update to its open leaderboard platform, now covering vision, audio, and NLP tasks with consistent metrics and reproducible scoring. The platform features 95 boards, 700+ model submissions, and allows free participation via sign-in with GitHub, Google, or Hugging Face, with full exploration and sample comparisons available at runbenchhub.com.

media r/LocalLLaMA · 3d ago

TMax: A Simple Recipe for Terminal Agents

TMax introduces TMax-15k, a dataset of 14,600 RL environments, over 2.5× larger than the next-largest open terminal dataset. It also presents a simple RL recipe that trains open models from 2B to 27B parameters, with TMax-9B achieving 27.2% on Terminal Bench 2.0 and TMax-27B reaching 42.7%.

lab Hugging Face Blog · 3d ago

PP-OCRv6 Released on Hugging Face with 50-Language Support

PP-OCRv6, a new optical character recognition model, is now available on Hugging Face. It supports 50 languages and scales from 1.5 million to 34.5 million parameters, offering improved accuracy and efficiency across diverse languages.

media r/LocalLLaMA · 3d ago

GLM5.2 runs at 7tg on 4x 3090s with 192GB DDR5 on budget build

A user shares their home lab setup with four GeForce 3090 GPUs and 192GB of DDR5 RAM overclocked to 5600 MHz. They run GLM5.2 at 7 tera-giga (tg) as a planner, MiniMax 2.7 at 45tg in VRAM for coding, and Qwen3.6 27B at q8 for testing, all on consumer-grade hardware due to cost considerations.

media Hugging Face Forums · 3d ago

Seeking Indic Document Datasets for AI/OCR Training in India

QuantVectors is seeking annotated document datasets in Indic languages from India, including Hindi, Marathi, Gujarati, Bengali, Punjabi, Tamil, Urdu, Telugu, Odia, Kannada, Malayalam, and Assamese. The datasets must include invoice, receipt, utility bill, payment advice, packing list, commercial invoice, and credit note types, with approximately 400 documents per language, human-verified annotations, and 99%+ accuracy. Datasets must be commercially licensable and can be open-source or commercial, with a request for HuggingFace datasets, research datasets, or vendors specializing in this space.

media MarkTechPost · 3d ago

Tutorial on Building Python-First Interactive Dashboards with Prefab

This tutorial demonstrates how to create interactive dashboards in Python using Prefab's component-based UI framework. It generates synthetic pipeline data, integrates reactive controls like charts, forms, and tabs, and exports the app as a static HTML file for direct preview in Google Colab.

media Hugging Face Forums · 3d ago

NOVA-VAD beats Silero, Pyannote, and WebRTC on noisy audio with 93% accuracy

NOVA-VAD, a lightweight and explainable Voice Activity Detector, achieves 93% accuracy on noisy audio from the UrbanSound8K dataset, outperforming WebRTC (58%), Pyannote (62%), and Silero (87%). It uses only scikit-learn, requires no GPU, and provides feature importance and confidence scores in plain English.

media Hugging Face Forums · 3d ago

Small-scale debug comparison of OLMo-core with Engram graft

A 200-step training comparison between a base OLMo3 600M model and a version with a DeepSeek-style Engram graft shows lower training and evaluation loss, faster grad-norm stabilization, and improved early learning behavior. The Engram graft, injected into layers 1 and 5, increases trainable parameters to ~1.7B but maintains only a 40k increase in active parameters per token, indicating efficient memory usage.

media r/LocalLLaMA · 4d ago

I pretrained and post trained a 500M parameter LLM and 330M parameter Image generator from scratch

The author pretrained a 500M parameter language model and a 330M parameter image generator from scratch using 40B tokens from fineweb. The image generator was inspired by ByteDance's DreamLite architecture and trained on a mixture of datasets from MidJourney, Flux, and CCW3.

media r/LocalLLaMA · 4d ago

Can I realistically get close to Claude/Codex capabilities locally?

A user with a 32GB system asks if open-weight models can match Opus 4.8's 1M context and coding performance on local hardware. They note current bottlenecks are context length and privacy concerns, and question whether high-end models like GLM 5.2 or Qwen3.7 are feasible within a $3.5K budget, emphasizing that running 70-80B models offers marginal real-world gains over 27B models with 256K context.

media r/LocalLLaMA · 4d ago

GLM-5.2 Beats Gemini and GPT-5.4 in Coding but Is Inefficient

GLM-5.2 surpasses GPT-5.4 and the entire Gemini lineup in coding performance on the DeepSWE benchmark. However, it requires significantly more output tokens, making it substantially less efficient in terms of cost-per-task compared to models like GPT-5.5 and Claude Opus 4.8.

media r/LocalLLaMA · 4d ago

Qwen 3.7 Will Not Be Open Sourced

Following the departure of Junyang Lin, Qwen has ceased open sourcing its models. As of June 2026, all major Chinese AI labs except Qwen have released open source models more recently than Qwen 3.7, which remains fully closed source.

media r/LocalLLaMA · 4d ago

Best open-source vision model runnable on RTX 6000 Pro

The user is seeking the current best open-source vision model that can run on an RTX 6000 Pro for OCR and classification of historical scanned documents. They note Gemma 4 31B performs well and is better than Qwen 3.6's vision encoder, asking for recommendations beyond this model.

media r/LocalLLaMA · 4d ago

What are people doing with their local models and what tools do they use?

A user asks about practical applications of local models and which tools are effective for tasks like coding, particularly as alternatives to web-based interfaces like Claude.ai. They mention trying OpenWebUI but find it underpowered without significant customization.

media r/LocalLLaMA · 4d ago

What happens when LLM subscriptions stop being subsidized?

LLM providers currently subsidize expensive API usage to build ecosystems, planning to raise prices later. As subsidies dwindle, users may face steep price increases—like $2k per month—making access costly and threatening widespread adoption, especially for individuals relying on affordable hardware to run models.