All articles — korshunov.ai — ML news

Recent 1744 →

llama.cpp b9785 Release with Hardened Caps Check and Multi-Platform Binaries

Prompt Format Inquiry for Training Unsloth/Phi-3.5-mini-instruct

Space-Efficient Language Generation in the Limit

Argus Benchmark Evaluates Uncertainty Quantification Stability Across Vision-Language Models and GUI Grounding Datasets

SARA: Unlocking Multilingual Knowledge in Mixture-of-Experts via Semantically Anchored Routing Alignment

+1739 more

Inference efficiency 173 →

llama.cpp b9785 Release with Hardened Caps Check and Multi-Platform Binaries

Gemma4-26B-A4B & 31B-QAT Uncensored Balanced Released with MTP Speed Boosts

GLM-5.2 on 4x DGX Spark: Reconstructing Missing Build Steps for MTP Speculative Decode

+170 more

Training methods 219 →

Space-Efficient Language Generation in the Limit

SARA: Unlocking Multilingual Knowledge in Mixture-of-Experts via Semantically Anchored Routing Alignment

Gefen: A Drop-in Replacement for AdamW with Claimed 8x Memory Reduction

+216 more

Research paper 270 →

Space-Efficient Language Generation in the Limit

SARA: Unlocking Multilingual Knowledge in Mixture-of-Experts via Semantically Anchored Routing Alignment

Colony: An Educational Simulation of LLM Attention Mechanisms Using Agent-Based Analogies

+267 more

Retrieval & RAG 38 →

How Large Language Models Source Brand Reputation Across Languages and Markets

Ontological Inversion: Flipping LLM Emotional Concepts via Negative Gain

Hi-Seg: Human-AI Collaboration for Pulmonary Nodule Segmentation

+35 more

AI agents 378 →

ToolBench-X: Benchmarking Tool-Using Agents Under Unreliable Environments

Colony: An Educational Simulation of LLM Attention Mechanisms Using Agent-Based Analogies

Claude Code v2.1.191 Release Notes

+375 more

Evaluation & benchmarks 817 →

ToolBench-X: Benchmarking Tool-Using Agents Under Unreliable Environments

How Large Language Models Source Brand Reputation Across Languages and Markets

Community Inquiry on Model Benchmarking Methods

+814 more

Open weights 207 →

Gemma4-26B-A4B & 31B-QAT Uncensored Balanced Released with MTP Speed Boosts

GLM-5.2 on 4x DGX Spark: Reconstructing Missing Build Steps for MTP Speculative Decode

SDXL Running Locally in Browser on WebGPU, Open-Source

+204 more

API & product launches 53 →

Simon Willison converts MDN browser compatibility data into a SQLite database

User Reports Step 3.7 Flash Model Tool Access Failure on HuggingChat

Claude Code v2.1.191 Release Notes

+50 more

Hardware & chips 34 →

Query on Clustering Nvidia DGX Spark and AMD Ryzen AI Max 395 for Unified Memory Inference

MINISFORUM DEG1 Oculink eGPU Dock Refurbished Available for $59

7 Chinese companies shipping H100/H200-class AI chips, most IPO'd in last 6 months

+31 more

Safety & alignment 191 →

User Observes Cloud Chatbots Appear Less Intelligent Than Local Models

Swiss Federal Supreme Court evaluates Heretic for internal use

Governance Decay in Long-Horizon LLM Agents

+188 more

Image generation 51 →

SDXL Running Locally in Browser on WebGPU, Open-Source

Atomistic Language Models Understand and Generate Materials

Unlimited-OCR is now on ModelScope

+48 more

Code generation 270 →

I reverse engineered Windows Copilot into a free OpenAI-compatible API

Gemini 3.5 Flash Adds Computer Use Capability

Has anyone else found vLLM outputs worse than llama.cpp?

+267 more

Voice & audio 37 →

Introducing the FFASR Leaderboard: Benchmarking ASR in the Real World

CN-NewsTTS Bench v0.1 Released

Poster: Exploring Audio-Based Scam Detection in Turkish

+34 more

Multimodal 152 →

Efficient Multimodal Models for Pulmonary Embolism Risk Assessment

MMGist: A Comprehensive Multimodal Benchmark for 2027

Deep Learning Pipeline for Sign Language Recognition and Translation to Indian Vernaculars

+149 more

Reasoning models 684 →

LLMs Use Difference-Making Logic to Learn Causal Structure

Gazer: Training-Free Semantic Correction for Autoregressive Visual Models

Multimodal Chain-of-Thought: Capabilities and Limitations

+681 more

Robotics 16 →

Reward-Petri-Net Interpretation of Temporal Behavior Trees

A Taxonomy of Conceptual Alignment in Human-Robot Dialogue

NVIDIA Launches Halos for Robotics: Full-Stack Functional Safety System

+13 more

Policy & regulation 20 →

Bill to Mandate AI Chip Location Tracking Gains Industry Support

Machine Whistleblowing: A Normative and Principled Approach

OpenAI Builds Shared AI Standards via Appia Foundation

+17 more

Training data 69 →

UD_Czech-PDTC: A Large and Genre-Rich Treebank in Universal Dependencies

Koshur Pixel: First Large-Scale Synthetic OCR Dataset for Kashmiri

Enable Real-Time AI for High-Speed Data Acquisition with DAQIRI

+66 more

Benchmark results 26 →

GLM-5.2 Is the New Best Open Model

GLM-5.2 is the new leading open weights model on the Artificial Analysis Intelligence Index

New Agentic Benchmark Released

+23 more

Video generation 1 →

Local LLM Agent Now Generates Images and Video Offline