All articles — korshunov.ai


Le Chaton Fat Flash Local Version Availability

Users express interest in a local, "flash" version of Le Chaton Fat for privacy and sovereignty. The community is asking for updates on when such a lightweight local version may be available.

github llama.cpp · 13d ago

llama.cpp Release b9698 Adds Self-Update Support and Multiple Platform Binaries

llama.cpp version b9698 enables self-updates only when built with llama-install.sh. The release includes binaries for macOS, Linux, Android, Windows, and openEuler across multiple architectures and hardware acceleration options, including Vulkan, CUDA, OpenVINO, and SYCL.

github llama.cpp · 13d ago

llama.cpp Release b9699 Adds SYCL Support and Multiple Platform Binaries

llama.cpp version b9699 introduces support for MUL_MAT and OUT_PROD operations with Q1_0 precision via PR #24721. The release includes precompiled binaries for macOS, Linux, Android, Windows, and openEuler across multiple architectures and acceleration frameworks, including SYCL (FP32 and FP16), Vulkan, CUDA, ROCm, and OpenVINO.

media r/LocalLLaMA · 13d ago

ML Models Recommended for M5 Max MacBook Pro with 128GB RAM

A user asks for model recommendations for a 16-inch MacBook Pro with an M5 Max chip and 128GB of RAM. They currently run Qwen 3.6 35B A3B via Hermes agent and LM Studio, and note that MLX models are well suited to Apple Silicon.

media r/LocalLLaMA · 13d ago

Keye-VL-2.0-30B-A3B Launches with Advanced Video Understanding and Agent Capabilities

Keye-VL-2.0-30B-A3B is a 30B-parameter multimodal model designed for long-video understanding and agent functionality. It outperforms open-source rivals and matches Gemini-3-Flash in temporal grounding, supports up to 256K context with near-lossless reasoning, and includes built-in capabilities for code, tool, and web search agent workflows.

github llama.cpp · 13d ago

llama.cpp Release b9697: New Binaries and Updates

llama.cpp version b9697 ships updated binaries for macOS, Linux, Android, Windows, and openEuler. The release covers ARM64, x64, Vulkan, CUDA 12 and 13, OpenVINO, SYCL, and ROCm, and fixes a message parsing issue in release checks.

media r/LocalLLaMA · 13d ago

GLM-5.2 Flash Release Date (Joke)

A Reddit user jokes about Z.ai open-sourcing GLM-5.2 and looks forward to a successor to GLM-4.7-flash, suggesting that a model in the 27-120B parameter range would be ideal.

github AutoGPT · 13d ago

autogpt-platform-beta-v0.6.64 Released

The autogpt-platform-beta-v0.6.64 release, dated 18th June 2026, introduces new features such as the AutoPilot Context Panel and Global Search, along with enhancements to graph saving, caching, and builder performance. It also includes security hardening, bug fixes for LLM provider issues, and UI improvements like a high-resolution touch icon.

github CrewAI · 13d ago

CrewAI v1.14.8a Releases New FlowDefinition Features

CrewAI v1.14.8a introduces script and crew actions to FlowDefinition, adds DMN mode support, and enables flow execution without Python code. It also includes experimental support for JSON-first crews and ZIP deployment fallback, along with improved memory reset and token usage tracking.

media r/LocalLLaMA · 13d ago

Does anyone have enough compute to make a distillation dataset from GLM5.2?

A user asks if anyone with sufficient computing resources can create a large distillation dataset of 70-1 million examples from GLM5.2. The goal is to enable better training of smaller models like Qwen3.5, benefiting the broader community.

github llama.cpp · 13d ago

llama.cpp Release b9693 Adds BF16 Support and Cross-Platform Binaries

llama.cpp version b9693 introduces BF16 support in its concat kernel and provides pre-built binaries for macOS, Linux, Android, Windows, and openEuler. The release includes CPU, Vulkan, ROCm, OpenVINO, SYCL, and HIP variants across multiple architectures, with a dedicated UI package available.

github llama.cpp · 13d ago

llama.cpp Release b9694 Adds New Binaries for Multiple Platforms

llama.cpp version b9694 includes binaries for macOS, Linux, Android, Windows, and openEuler. The release supports various architectures and acceleration options such as CUDA, Vulkan, OpenVINO, SYCL, and ROCm. It also fixes the Windows x64 OpenVINO release link.

media r/LocalLLaMA · 13d ago

LocalLLaMA Proposes Crowdsourced Coding Dataset

A community proposal suggests creating a crowdsourced coding dataset to support local LLM development. Anyone with hardware could contribute data, while users with more powerful machines would help fine-tune or quantize models, reducing reliance on company-released models.

media r/LocalLLaMA · 14d ago

What have you been working on lately?

A Reddit user asks the community about their recent projects, noting that while discussions focus on tools, there is little insight into the actual applications or work being done with those tools.

media r/LocalLLaMA · 14d ago

GLM-5.2 Review and Censorship Response

GLM-5.2 demonstrates exceptional long-context coherence and conversational fluency, outperforming Gemini-3.1-Pro on text-only tasks and matching GPT-5.5 in reasoning quality. The model responds factually to sensitive topics like Taiwan and Tiananmen Square, providing detailed historical context without overt censorship, though it adheres to Chinese government content guidelines.

media Latent Space · 14d ago

Midjourney Launches Full-Body Ultrasound CT Scanner

Midjourney has announced a full-body ultrasound CT scanner, calling it the first new whole-body medical imaging modality in 50 years. The prototype, known as the Midjourney Scanner, uses 8,960 transducers across 40 systems in a 70 cm ring to capture data at 17 GB/s, with a claimed resolution down to 0.5 mm and a goal of 358,000 ultrasonic elements. The system is currently in Gen 1: scans take 20 minutes and no AI is used in image generation yet. Future versions aim to integrate AI and to scale to 50,000 scanners performing 1 billion scans per month.

media r/LocalLLaMA · 14d ago

Price rising effect is wild

A Reddit post discusses the potential release of Q.01, noting that precision is no longer a priority. The post highlights a phenomenon referred to as the 'price rising effect' as being significant and unexpected.

arxiv arXiv cs.LG · 14d ago

Discriminator-Guided RL Corrects Flow Matching with Data-Aligned Rewards

Discriminator-Guided RL (DRL) uses a pretrained representation space to train a discriminator that separates real data from model-generated samples. The discriminator's logit serves as the reward in KL-regularized RL, aligning model outputs with visual and semantic realism without human preferences. DRL improves FID and semantic FD across models like SiT and JiT, and improves the Pareto frontier between preference and fidelity.

arxiv arXiv cs.LG · 14d ago

Essential Subspace Merging for Multi-Task Learning

Essential Subspace Merging (ESM) reduces inter-task interference by focusing on principal directions of activation shifts. ESM++ extends this with dynamic expert selection via prototype-based routing, enabling efficient, training-free multi-task model merging.

arxiv arXiv cs.LG · 14d ago

Safety Reflection Pretraining for LLMs

Safety Reflection Pretraining inserts short safety reflections into pretraining data to enable self-monitoring in language models. Experiments with 1.7B models on FineWeb-Edu show improved safety accuracy and reduced attack success rates. On MedSafetyWorld, the method prevents unsafe behaviors from generalizing from safe data more effectively than data filtering or rewriting.