All articles — korshunov.ai

Cloudflare Launches Temporary Accounts for AI Agents

Cloudflare now allows users to deploy Workers applications without a permanent account by running npx wrangler deploy --temporary. Each deployment runs in an ephemeral project that stays live for 60 minutes and comes with a claim link, which expires in under an hour if ownership is not claimed.

blog Simon Willison · 9d ago

sqlite-utils 4.0rc1 Release

sqlite-utils 4.0rc1 introduces migration support and nested transactions. The release is documented on Simon Willison's blog.

blog Simon Willison · 9d ago

sqlite-utils 4.0rc1 Adds Migrations and Nested Transactions

sqlite-utils 4.0rc1 introduces database migrations and db.atomic() for nested transactions. Migrations support script-based schema changes using a simplified API, while db.atomic() enables nested transactions via savepoints, improving error handling and data integrity. The release includes backwards-incompatible changes, such as updated upsert behavior and dropped Python 3.8 support, with options to maintain older behaviors.
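To illustrate how nested transactions can be built on savepoints, here is a minimal sketch using only Python's standard sqlite3 module. It does not use sqlite-utils, and the atomic() helper below is a hypothetical stand-in written for this example, not the library's db.atomic() implementation.

```python
import sqlite3
from contextlib import contextmanager
from itertools import count

_savepoint_ids = count()


@contextmanager
def atomic(conn):
    """Hypothetical helper: wrap a block in a uniquely named SQLite savepoint."""
    name = f"sp_{next(_savepoint_ids)}"
    conn.execute(f"SAVEPOINT {name}")
    try:
        yield
    except Exception:
        # Undo only the work done since this savepoint, then discard it.
        conn.execute(f"ROLLBACK TO SAVEPOINT {name}")
        conn.execute(f"RELEASE SAVEPOINT {name}")
        raise
    else:
        # Releasing the outermost savepoint commits the transaction.
        conn.execute(f"RELEASE SAVEPOINT {name}")


def demo():
    # isolation_level=None stops sqlite3 from opening transactions implicitly,
    # so the savepoints above are the only transaction control in play.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("CREATE TABLE items (name TEXT)")
    with atomic(conn):
        conn.execute("INSERT INTO items VALUES ('kept')")
        try:
            with atomic(conn):
                conn.execute("INSERT INTO items VALUES ('discarded')")
                raise ValueError("inner failure")
        except ValueError:
            pass  # the inner block was rolled back; the outer one continues
    return [row[0] for row in conn.execute("SELECT name FROM items")]
```

Calling demo() leaves only the row inserted by the outer block, because the failure inside the inner block rolls back to the inner savepoint without aborting the outer transaction.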

media r/LocalLLaMA · 9d ago

Qwen 27B for planning, Qwen 35B-A3B for execution

A user explores pairing Qwen 27B for long-horizon task planning with Qwen 35B-A3B for rapid execution, noting that the 27B runs at 7-10 tokens per second and the 35B-A3B at ~18 tokens per second. The user is considering switching between the models to leverage their different strengths, but currently uses the 35B-A3B exclusively and questions whether the intelligence gap between them is significant.

github llama.cpp · 9d ago

llama.cpp release b9750: new call statement and cross-platform binaries

llama.cpp version b9750 introduces a call statement implementation and rolls back an unintended change. The release includes precompiled binaries for macOS, Linux, Android, Windows, and openEuler across multiple architectures and hardware acceleration options, including Vulkan, CUDA, OpenVINO, and SYCL.

media r/LocalLLaMA · 9d ago

Updated Vision Model Benchmark Results and Recommendations

A revised benchmark of local vision language models evaluates 23 models across 30 images with 3 tests each, totaling 2,070 tests and 60 to 70 inference hours. The top-performing model is Qwen3.6 27B (nothink) at Q4 with a 79.6 score, followed by Qwen3.5 4B (nothink) at Q4, and Qwen3-VL 8B at Q8. Key findings include thinking mode degrading vision performance, MoE models underperforming compared to dense models, and Q8 quantization not universally improving results.

media r/LocalLLaMA · 9d ago

Qwen 3.6 27B Apostate Released with Safety Removed

The Qwen 3.6 27B model has been modified using Apostate to remove safety alignment, reducing its refusal rate from 92% to 7.6%. This change results in minimal impact on the model's capabilities, with a KL divergence of 0.120.
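For context on the metric, KL divergence measures how far one probability distribution drifts from another, with 0 meaning the two are identical. The summary does not say which distributions were compared or which units were used, so the sketch below assumes discrete next-token distributions and natural-log units (nats). The two example distributions are made up for illustration.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) in nats for two discrete distributions over the same outcomes."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0:
            continue  # terms with p_i = 0 contribute nothing
        if qi == 0:
            return math.inf  # p puts mass where q has none
        total += pi * math.log(pi / qi)
    return total


# Illustrative (made-up) next-token probabilities for the same prompt.
original = [0.70, 0.20, 0.10]
modified = [0.60, 0.25, 0.15]

print(kl_divergence(original, original))  # identical distributions
print(kl_divergence(original, modified))  # small positive drift
```

A lower value means the modified model's output distribution stays closer to the original's, which is why the figure is cited as evidence of limited capability loss.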

media r/LocalLLaMA · 9d ago

I forked ik_llama.cpp and added --numa mirror mode

A new fork of ik_llama.cpp adds a --numa mirror mode that duplicates model weights and KV cache across CPU sockets, enabling full utilization of multi-socket systems. This reduces remote memory access penalties and improves inference throughput by up to 1.6x on tested models, though it requires twice the RAM.

github llama.cpp · 10d ago

llama.cpp releases version b9748 with new binaries and features

llama.cpp releases version b9748, adding a "verbose" field to its schema and providing binaries for macOS, Linux, Android, Windows, and openEuler. The release includes CPU, Vulkan, OpenVINO, SYCL, and ROCm support across multiple architectures, with iOS and Windows CUDA and Vulkan builds available.

media r/LocalLLaMA · 10d ago

I pretrained and post trained a 500M parameter LLM and 330M parameter Image generator from scratch

The author pretrained a 500M parameter language model from scratch on 40B tokens from fineweb, and also trained a 330M parameter image generator from scratch. The image generator was inspired by ByteDance's DreamLite architecture and trained on a mixture of datasets from MidJourney, Flux, and CCW3.

media r/LocalLLaMA · 10d ago

What's your local Haiku replacement?

A user is seeking a reliable and fast local alternative to Haiku for summarizing technical content such as code documentation and architectural descriptions. They ask for suggestions on suitable tools or models in this space.

media TLDR AI · 10d ago

GPT-5.6, Claude Code Artifacts, Perplexity's Brain Memory Released

OpenAI announced GPT-5.6, a new language model update. Anthropic released Claude Code artifacts, enhancing code generation capabilities. Perplexity introduced Brain memory, enabling contextual recall in search responses.

media Hugging Face Forums · 10d ago

Request to force-delete stuck Hugging Face Space

A user requests force-deletion of the Hugging Face Space "kayinda/rxsteward", which is stuck in the "Building" state. All deletion attempts fail with 403 errors or 400 invalid input errors, preventing reuse of the name.

media AI News (smol.ai) · 10d ago

GLM-5.2 Breakout and Open-Model Progress Highlighted

Zhipu's GLM-5.2 emerged as the top open-weight model, praised for its frontier-adjacent performance in daily use, with improvements in coding tasks and reduced 1M-token inference cost via IndexShare. It outperformed other open models in agentic knowledge work benchmarks, reaching 1266 Elo in Artificial Analysis' AA-Briefcase test, though only 3% of tasks were fully satisfied by top models, indicating persistent challenges in real-world long-horizon agent performance.

lab NVIDIA Technical Blog · 10d ago

Build Your Own Transaction Foundation Model for Financial Intelligence

Transaction data captures rich human behavior patterns and is a key asset for enterprises. Current use cases often rely on brittle, manually engineered features that fail to capture sequential customer behavior in transaction histories.

lab Hugging Face Blog · 10d ago

Can You Beat LoRA in Fine-Tuning?

A new study explores alternatives to LoRA, the most popular fine-tuning technique, assessing whether other methods can achieve better performance with less computational cost. The research finds that while some approaches show promise, none consistently outperform LoRA across diverse tasks and datasets.

lab Google DeepMind Blog · 10d ago

AI Control Roadmap for Internal System Security

An AI Control Roadmap has been introduced to secure internal systems by integrating traditional safeguards with real-time monitoring capabilities.

lab OpenAI News · 10d ago

GPT-5.5 Instant Enhances ChatGPT's Health Responses

GPT-5.5 Instant improves ChatGPT's health and wellness responses through stronger reasoning, better context handling, clearer communication, and physician-informed evaluations.

media Hugging Face Forums · 10d ago

Important finding for everyone stuck on 'Starting' status!

The Hugging Face UI incorrectly shows Spaces stuck at 'Starting' even though backend operations succeed. Container logs show successful initialization, which points to a frontend sync glitch. Users should not modify their code, since the issue is a platform-side UI bug.

lab Google — The Keyword (AI) · 10d ago

New research shows AMIE matches doctors in disease management

A study published in Nature reveals that AMIE, a conversational AI system, performs as well as primary care physicians in managing complex health conditions.