All articles — korshunov.ai

OpenAI Launches Daybreak Security Tools

OpenAI has introduced Codex Security and GPT-5.5-Cyber as part of its Daybreak suite. These tools aim to help organizations identify, validate, and patch vulnerabilities at scale.

lab OpenAI News · 9d ago

OpenAI Launches Patch the Planet for Open Source

OpenAI has launched Patch the Planet, a Daybreak initiative aimed at helping open-source maintainers identify, validate, and resolve vulnerabilities. The program combines AI tools with expert review to improve the security of open-source software.

media r/LocalLLaMA · 9d ago

Best local model for converting text to structured JSON output

Users are seeking a local model that efficiently converts unstructured text into valid JSON based on a defined schema. Among tested models, Qwen 3.6 35B a3b shows strong performance, matching the quality of larger models like GPT-120B while being more stable on local machines than GPT-20B.
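
One way to compare models on this task is to check each output mechanically. The sketch below is not from the original thread: the flat `SCHEMA` and the `validate_output` helper are hypothetical names for illustration, and it covers only required fields and basic types rather than full JSON Schema.

```python
import json

# Hypothetical flat schema for illustration: field name -> expected Python type.
SCHEMA = {"name": str, "age": int, "tags": list}


def validate_output(raw: str, schema: dict) -> dict:
    """Parse model output as JSON and check it against a flat schema.

    Raises ValueError if the text is not valid JSON, the top level is not
    an object, a field is missing, or a field has the wrong type.
    """
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("top-level value must be a JSON object")
    for field, expected in schema.items():
        if field not in data:
            raise ValueError(f"missing field: {field}")
        value = data[field]
        # bool is a subclass of int in Python, so reject it for int fields.
        if expected is int and isinstance(value, bool):
            raise ValueError(f"wrong type for field: {field}")
        if not isinstance(value, expected):
            raise ValueError(f"wrong type for field: {field}")
    return data
```

Running every model's output through the same check gives a simple pass rate per model, which is easier to compare than reading outputs by eye.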

github llama.cpp · 9d ago

llama.cpp release b9760: new input schema and cross-platform binaries

llama.cpp version b9760 introduces a refactored input file schema that supports raw base64 input videos. The release includes binaries for macOS, Linux, Android, Windows, and openEuler across multiple architectures and hardware accelerators, including Vulkan, CUDA, OpenVINO, and SYCL.

lab NVIDIA Technical Blog · 9d ago

CCCL Runtime: A Modern C++ Runtime for CUDA

NVIDIA has released the CCCL Runtime, a modern C++ runtime that provides safer and more convenient abstractions for CUDA programming. It introduces updated C++ features to simplify and enhance CUDA C++ development.

media r/LocalLLaMA · 9d ago

Moebius: 0.2B Lightweight Image Inpainting Framework

Moebius is a 0.2B-parameter image inpainting framework that achieves performance comparable to 10B-parameter models. It is designed for lightweight, efficient image editing with minimal computational requirements.

media r/LocalLLaMA · 9d ago

Chinese Hackers Create Tesla V100 v4 Clone

Chinese hackers have reverse-engineered the Tesla V100's pinout, soldered it onto a half-height PCB, and released it as the Tesla V100 v4. The 16GB version is priced at 1499 RMB (220 USD) with a three-year warranty, while the 32GB version costs 3999 RMB (590 USD).

media r/LocalLLaMA · 9d ago

TMax: A Simple Recipe for Terminal Agents

TMax introduces TMax-15k, a dataset of 14,600 RL environments, over 2.5× larger than the next-largest open terminal dataset. It also presents a simple RL recipe that trains open models from 2B to 27B parameters, with TMax-9B achieving 27.2% on Terminal Bench 2.0 and TMax-27B reaching 42.7%.

media r/LocalLLaMA · 9d ago

NEX-N2-mini claims Pareto optimality in reasoning efficiency

The NEX-N2-mini model asserts it achieves 3.5 and 3.6 level reasoning performance with significantly fewer reasoning tokens. Testing shows it outperforms other MoE models in efficiency, reducing wasted tokens while maintaining high reasoning quality.

media r/LocalLLaMA · 9d ago

Gemma4-12B-QAT Uncensored Balanced Released with 60% Speed Boost via MTP

The Gemma4-12B-QAT Uncensored Balanced model is now available, featuring a 60% speed improvement through multi-token-prediction (MTP) speculative decoding. It includes Q4_K_M quantization, vision support via mmproj, and stable generation with no looping or context drift, making it ideal for creative writing and emotional intelligence tasks.

media r/LocalLLaMA · 9d ago

Same model, same prompt, 4 different agents produce varied code quality

A self-hosted Qwen3.6-27B model with identical prompt and hardware generated four different HTML/JavaScript solar system simulations. The agent scaffolding significantly influenced output: opencode produced clean, stable code with accurate physics; pi showed robustness and coordinate consistency; hermes offered visually appealing but physically flawed results; qwen code generated minimal, crude code. The results highlight how agent design shapes code quality, correctness, and stability despite shared model and prompt.

media Interconnects · 9d ago

GLM-5.2 is the step change for open agents

GLM-5.2, an open-weight AI model released by Z.ai, has set a new benchmark in coding and general agent performance. It outperforms models like Claude Fable 5 and Gemini, and matches or exceeds Anthropic's Opus 4.8 in max thinking mode, establishing itself as the first open model that feels right in coding harnesses as a general agent.

lab NVIDIA Technical Blog · 9d ago

Enable Real-Time AI for High-Speed Data Acquisition with DAQIRI

AlphaFold2's 2020 success relied on 170,000 protein structures from the Protein Data Bank. NVIDIA's DAQIRI enables real-time AI processing for high-speed data acquisition by analyzing data as it is generated.

media r/LocalLLaMA · 9d ago

GLM-5.2 UD-IQ1_M Speed Test on llama.cpp with 5090 and 3090 Ti

A speed test of GLM-5.2 quantized to UD-IQ1_M using llama.cpp shows 579 t/s prefill at 8k context and 324 t/s at 57k context. Decode speed remains steady at 10.6 t/s for over 580 tokens, dropping to 9.37 t/s at 60k context.

media r/LocalLLaMA · 9d ago

Which chassis are ya using?

A user asks for recommendations on multi-GPU cases, specifically mentioning a six-GPU, dual-chamber tower case available on Alibaba. They seek feedback on this option and its suitability for high-end GPU setups.

lab NVIDIA Technical Blog · 9d ago

NVIDIA Launches Halos for Robotics: Full-Stack Functional Safety System

NVIDIA has introduced Halos for Robotics, a full-stack functional safety system designed for physical AI. It enables AI-driven safety in unstructured environments where robots operate autonomously alongside humans in factories, warehouses, hospitals, and homes.

media Import AI · 9d ago

AI Out-Persuades Humans: New Study Shows AI Superior to Experts

A study by Oxford, Stanford, and LSE researchers finds AI systems consistently out-persuade expert humans across four experiments involving 18,978 conversations. AI exceeded professional canvassers by 10.8 percentage points in real-world donations to Save the Children, with Opus 4.1 and Opus 4.6 showing the strongest persuasion performance.

lab Hugging Face Blog · 9d ago

PP-OCRv6 Released on Hugging Face with 50-Language Support

PP-OCRv6, a new optical character recognition model, is now available on Hugging Face. It supports 50 languages and scales from 1.5 million to 34.5 million parameters, offering improved accuracy and efficiency across diverse languages.

media r/LocalLLaMA · 9d ago

I Built a Tool to Stop Manually Swapping Models on My 8GB GPU

A user built Prompt-Chain, a Streamlit app that chains a small Prompter model with a large Coder model into a single pipeline. It automatically swaps the models in and out of VRAM when moving from prompt refinement to code generation, eliminating manual model switching and reducing tokens wasted on poorly worded prompts.
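
The chaining idea can be sketched in a few lines. This is not Prompt-Chain's actual code: `run_chain` and the loader and unloader callables are hypothetical stand-ins for whatever backend loads and frees each model, and the only assumption is that one model is resident at a time.

```python
from typing import Callable

# A "model" here is just a function from prompt text to output text.
Model = Callable[[str], str]


def run_chain(
    task: str,
    load_prompter: Callable[[], Model],
    load_coder: Callable[[], Model],
    unload: Callable[[Model], None],
) -> str:
    """Refine the task with a small model, then generate code with a large one.

    Only one model is held at a time, so the two never compete for VRAM.
    """
    prompter = load_prompter()
    try:
        refined = prompter(task)
    finally:
        unload(prompter)  # free memory before loading the larger model

    coder = load_coder()
    try:
        return coder(refined)
    finally:
        unload(coder)
```

Passing the loaders in as callables keeps the sketch independent of any particular inference server, so the swap order can be checked with stub functions before real models are wired in.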

media r/LocalLLaMA · 9d ago

GLM5.2 runs at 7tg on 4x 3090s with 192GB DDR5 on budget build

A user shares their home lab setup with four GeForce RTX 3090 GPUs and 192GB of DDR5 RAM overclocked to 5600 MHz. They run GLM5.2 at 7 tokens per second of text generation (tg) as a planner, MiniMax 2.7 at 45 tokens per second in VRAM for coding, and Qwen3.6 27B at q8 for testing, all on consumer-grade hardware due to cost considerations.