Code generation — korshunov.ai

Topic · Code generation

Introducing Claude Tag for Slack Teams

Claude Tag lets teams tag @Claude in Slack to delegate tasks, with access to selected channels, tools, and codebases. It learns from channel context, works asynchronously, and proactively updates users with relevant information. Today, 65% of the code from Anthropic’s product team is created by its internal Claude Tag, and the feature is now available in beta for Claude Enterprise and Team customers.

lab Mistral AI News · 2d ago

Mistral Releases OCR 4 with Multilingual Support and Structured Output

Mistral OCR 4 introduces bounding boxes, block classification, and inline confidence scores for 170 languages across 10 language groups. It outperforms leading OCR systems in human preference evaluations with a 72% win rate and achieves the top score on OlmOCRBench (85.20), while offering self-hosted deployment in a single container and supporting enterprise use cases like RAG and document ingestion.

lab Google DeepMind Blog · 8h ago

Gemini 3.5 Flash Adds Computer Use Capability

Google has introduced computer use in Gemini 3.5 Flash, enabling the model to execute code and interact with external tools. This feature allows users to run programming tasks and access real-time information through integrated computing functions.

lab OpenAI News · 2d ago

OpenAI Launches Daybreak Security Tools

OpenAI has introduced Codex Security and GPT-5.5-Cyber as part of its Daybreak suite. These tools aim to help organizations identify, validate, and patch vulnerabilities at scale.

lab Claude Code Releases · 1d ago

Claude Code v2.1.187 Release Notes

Claude Code v2.1.187 introduces sandbox credentials blocking, org-configured model restrictions, mouse click support in fullscreen, and fixes for command failures, tool hangs, and UI stability. Updates also improve structured output handling, agent depth tracking, and plugin management, with enhancements to VSCode and terminal compatibility.

lab Claude Code Releases · 2d ago

Claude Code v2.1.186 Release Notes

Claude Code v2.1.186 adds CLI authentication commands for MCP servers, status filtering in workflows, and a "Skills" section in plugin settings. It includes numerous bug fixes for UI, session management, and agent behavior, along with improvements to YAML parsing, memory handling, and tool validation.

lab OpenAI News · 2d ago

Jason Liu Uses Codex for Long-Running Project Management

Jason Liu demonstrates how Codex helps preserve context and manage complex projects, enabling work to continue seamlessly beyond a single prompt.

lab OpenAI News · 3d ago

Samsung Deploys ChatGPT and Codex for Employees

Samsung Electronics has rolled out OpenAI's ChatGPT Enterprise and Codex to its global workforce. This deployment represents one of OpenAI's largest enterprise AI initiatives to date.

lab OpenAI News · 3d ago

GPT-5.5 Instant Enhances ChatGPT's Health Responses

GPT-5.5 Instant improves ChatGPT's health and wellness responses through stronger reasoning, better context handling, clearer communication, and physician-informed evaluations.

lab Claude Code Releases · 6d ago

Claude Code v2.1.183 Release Notes

Claude Code v2.1.183 improves auto mode safety by blocking destructive git and destroy commands unless the user explicitly consents. It adds deprecation warnings for models, introduces attribution.sessionUrl to hide session links, and fixes multiple issues including terminal behavior, subagent performance, and input handling in web and tmux environments.

github AutoGPT · 6d ago

autogpt-platform-beta-v0.6.64 Released

The autogpt-platform-beta-v0.6.64 release, dated 18th June 2026, introduces new features such as the AutoPilot Context Panel and Global Search, along with enhancements to graph saving, caching, and builder performance. It also includes security hardening, bug fixes for LLM provider issues, and UI improvements like a high-resolution touch icon.

lab Claude Code Releases · 7d ago

Claude Code v2.1.181 Release Notes

Claude Code v2.1.181 adds support for changing config settings via prompt syntax such as /config thinking=false, adds sandbox Apple Events support on macOS, and improves streaming, auto-retry, and subagent behavior. It also fixes numerous bugs related to startup, file handling, clipboard, and UI responsiveness across platforms.

lab Claude Code Releases · 8d ago

Claude Code v2.1.178 Release Notes

Claude Code v2.1.178 introduces new permission rules using Tool(param:value) syntax, improved workflow and skill loading in nested directories, and enhanced auto mode and error messaging. It fixes critical issues including crashes, authentication errors, and UI behavior in Chrome and VSCode, while refining tool prompts and undo functionality.

github llama.cpp · 16h ago

vulkan-shaders-gen now fails build on shader compilation errors

The vulkan-shaders-gen tool now fails the build when shader compilation fails, preventing the creation of a broken libggml-vulkan. This fixes a prior issue where a successful build masked runtime failures, and includes improvements to error handling and atomic flag management across platforms.

github OpenAI Agents SDK · 21h ago

Release of openai-agents-python v0.17.7

Version 0.17.7 of the openai-agents-python library includes new features such as configurable WebSocket max size and buffered Chat Completions tool-call streaming. It also contains multiple fixes for issues including sandbox buffering, error handling, and tool dispatch, along with documentation updates and improved error messaging.

github CrewAI · 1d ago

CrewAI 1.14.8a3 Release Notes

CrewAI 1.14.8a3 introduces unified declarative flow loading and improved startup UX for crew runs. It consolidates crewai run and flow kickoff commands, adds declarative Flow CLI support, and enables @router() as a flow start method with typed output schemas for tools.

media Together AI Blog · 1d ago

Frontier LLMs Struggle to Write Fast Multi-GPU Kernels

ParallelKernelBench evaluates LLMs on writing fast multi-GPU CUDA kernels for 87 real workloads. The top model generates kernels that run at under a third of the speed of optimal implementations, though a few outputs surpass any existing public code.

media Hugging Face Forums · 2d ago

Buddy System: Rust entropy monitor with NER-gated uncertainty for tiered LLM inference

The Buddy System uses a Rust entropy monitor to detect per-token uncertainty in local Gemma 3 4B inference, routing only uncertain tokens to Sonnet via NER-gated span extraction and semantic retrieval. Benchmarks show it achieves 71.4% accuracy at $0.21, outperforming the Anthropic Advisor pattern (62.9% at $0.44) across seven Hugging Face datasets, with a key improvement on SQuAD v2 by routing source passage chunks to the cloud model.
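The per-token uncertainty check described above can be illustrated with a minimal sketch. This is not the Buddy System's code (which is written in Rust): the function names, the threshold of 1.0 nats, and the toy distributions are all assumptions for illustration, and the NER gating and retrieval steps are not shown.

```python
import math

def token_entropy(probs):
    """Shannon entropy (in nats) of one next-token probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def uncertain_positions(step_probs, threshold=1.0):
    """Return indices of generation steps whose entropy exceeds the threshold.

    The threshold is a hypothetical tuning knob, not a value from the source.
    """
    return [i for i, probs in enumerate(step_probs)
            if token_entropy(probs) > threshold]

# Toy next-token distributions over a 4-token vocabulary (illustrative only).
steps = [
    [0.97, 0.01, 0.01, 0.01],  # confident prediction, low entropy
    [0.25, 0.25, 0.25, 0.25],  # uniform, maximum entropy (ln 4)
    [0.70, 0.10, 0.10, 0.10],  # moderately confident
]

# Only the flagged positions would be candidates for escalation
# to the larger model; everything else stays on the local model.
flagged = uncertain_positions(steps, threshold=1.0)
print(flagged)
```

In a tiered setup like the one summarized here, the share of tokens that gets flagged drives the cost side of the accuracy-versus-cost trade-off, so the threshold would presumably need tuning per dataset.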

arxiv arXiv cs.CL · 2d ago

Latent Personal Memory: Dynamic Soft Prompts for LLM Personalization

Latent Personal Memory (LPM) represents user-specific memories as a compact, persistent matrix of N latent slots. These slots are mapped via a shared cross-attention network into dynamic, input-conditioned soft prompts that are prepended to a frozen LLM. LPM outperforms LoRA and Prompt Tuning by up to 8.8% and 54.4%, respectively, on PersonaMem v1, reduces KV-cache usage by over 64x, and matches LoRA accuracy on LoCoMo with 120x fewer parameters. It also scales efficiently with context length, outperforming full-context at 128K tokens.

media MarkTechPost · 2d ago

Sakana AI Launches Sakana Fugu: Multi-Agent Orchestration Model

Sakana AI has launched Sakana Fugu, an orchestration model that routes tasks across a swappable pool of frontier LLMs via a single OpenAI-compatible API. Fugu Ultra outperforms individual models on key benchmarks like SWE Bench Pro and GPQA-D, and the system demonstrates superior performance on complex, multi-step tasks such as auto-research, Rubik's Cube solving, and blindfold chess.