API & product launches — korshunov.ai

API & product launches Page 1 / 3

User Reports Tool and MCP Server Unavailability for Step 3.7 Flash on HuggingChat

A user on the Hugging Face forums reported that the Step 3.7 Flash model lost the ability to use tools and connect to MCP servers starting that morning. The poster expressed strong satisfaction with the model's performance, noting its high quality relative to its low resource consumption and cost. They emphasized a desire to continue using this specific model rather than switching to alternatives due to its efficiency. The user explicitly asked whether this loss of functionality is permanent and if there are any steps they can take to restore access. The post highlights community concern regarding the sudden disruption of tooling capabilities for a popular, cost-effective model.

blog Simon Willison · 3h ago

Simon Willison converts MDN browser compatibility data into a SQLite database

Inspired by Mozilla's new MDN MCP service, Simon Willison has converted the comprehensive mdn/browser-compat-data repository into a SQLite database. The project utilizes a script generated by Claude Code for web (Opus 4.8) to perform this conversion using sqlite-utils. The resulting database is approximately 66MB in size and is hosted on GitHub with open CORS headers to facilitate direct access. To automate the process, a GitHub Actions workflow was built using Codex Desktop (GPT-5.5) to force-push the updated database to an orphan branch named db. Users can download the final browser-compat.db file directly from the repository or explore its contents via Datasette Lite.

media Hugging Face Forums · 4h ago

User Reports Step 3.7 Flash Model Tool Access Failure on HuggingChat

A user on the Hugging Face discussion forum reported that the Step 3.7 Flash model by StepFun AI has lost its ability to use tools, including MCP servers, as of the morning of the report. The individual expressed concern over whether this outage is temporary or permanent, noting their strong preference for this specific model due to its high performance and low resource costs compared to competitors. Despite praising the model's quality and affordability, the user highlighted the immediate disruption caused by the inability to execute tool-based functions. The post seeks clarification from the community regarding prior experiences with similar issues and potential resolutions. This incident underscores a critical dependency on tool availability for users relying on this specific AI configuration.

github LlamaIndex · 5h ago

Llama Index v0.14.23 Release Notes

Llama Index released version 0.14.23 on June 24, 2026, introducing significant multimodal capabilities and various bug fixes. The core update includes multimodal synthesis features and the introduction of multimodal query engines to support diverse data types. Key fixes address document and video block handling within FunctionTool outputs and ensure URL-backed memory blocks are preserved correctly. Performance improvements were implemented by using sets for within-batch deduplication in the ingestion pipeline and optimizing token text splitting logic. The release also resolves a ZeroDivisionError on empty input sequences and fixes recursion errors in splitters when units exceed chunk sizes. Additionally, explicit UTF-8 encoding was added to file I/O operations, and deep copying of initial states prevents mutation leaks across workflow runs.

lab Claude Code Releases · 5h ago

Claude Code v2.1.191 Release Notes

Claude Code version 2.1.191 introduces /rewind support, allowing users to resume conversations from before a /clear command was executed. The update fixes several critical issues, including background agents resurrecting after being stopped and scroll position jumping during streaming responses. It also corrects behavior where /voice displayed generic error messages and where /login URLs were truncated in Windows Terminal. Significant improvements enhance reliability for MCP servers by adding retry logic for transient network errors during capability discovery and OAuth flows. Headless environments now skip browser popups for OAuth, while sandbox network permissions are remembered for the session duration. Performance optimizations reduce CPU usage during streaming by approximately 37% through text update coalescing and mitigate long-session memory growth from the terminal output cache.

media r/LocalLLaMA · 9h ago

I reverse engineered Windows Copilot into a free OpenAI-compatible API

A user has created a local API that replicates OpenAI-compatible GPT-4 functionality using Microsoft's free Copilot service. The tool logs into a Microsoft account once, runs locally on a Windows device, and exposes a server at http://localhost:8000/v1 that supports streaming and multi-turn conversations without requiring an API key or billing. It is designed for personal and educational use, and available via GitHub at https://github.com/sums001/Windows-Copilot-API.

lab Mistral AI News · 13h ago

New Connector Controls for Enterprise Security and Access

Mistral Studio now offers enriched admin controls to govern connector access per workspace and tool, enabling fine-grained permissions. Features include API keys with scopes, multi-account connectors, and a new Connectors Debugger for root cause analysis, all supporting secure, auditable integration with enterprise systems.

media r/LocalLLaMA · 1d ago

Open Source Hugging Face Downloader App Released

The developer has released an open-source desktop app that downloads Hugging Face models, datasets, and spaces locally. The app automatically detects connection issues and resumes downloads, runs without cloud services or telemetry, and supports macOS, Windows, and Linux (both x64 and arm64).

media Hugging Face Forums · 1d ago

Llama 3.1 70B API Access Restricted to Hugging Face Tester

Users can access the Llama 3.1 70B model via the Hugging Face tester, but receive a "Model not supported by provider" error when using third-party apps or curl. The model is currently only available through the Hugging Face interface and not exposed via public API endpoints.

media Hugging Face Forums · 1d ago

Spaces tokens no longer work and files not saved

After a recent Hugging Face update, Spaces tokens stopped working, resulting in 404 errors when attempting to save generated files. The process completes successfully up to 100% but fails during the save phase due to token errors, consuming ZeroGPU credits without producing any saved outputs.

lab Hugging Face Blog · 2d ago

Shipping huggingface_hub weekly with AI, open tools, and human oversight

Hugging Face is releasing huggingface_hub weekly, integrating AI models, open-source tools, and a human review process to ensure quality and safety. The update emphasizes transparency, community involvement, and responsible AI development through continuous human-in-the-loop validation.

lab OpenAI News · 2d ago

Omio builds AI-native conversational travel

Omio leverages OpenAI to enhance conversational travel experiences. The company uses AI to accelerate product development and transition into an AI-native business model.

lab OpenAI News · 4d ago

OpenAI launches spend controls and usage analytics for ChatGPT Enterprise

OpenAI has introduced new spend controls and usage analytics for ChatGPT Enterprise. These features help enterprises manage costs and make informed decisions as they scale AI usage.

media r/LocalLLaMA · 4d ago

Kimi AI Just Mailed Me

User reports receiving an email from Kimi.ai related to one of their YouTube videos. The message was shared on Reddit within the LocalLLaMA community.

media r/LocalLLaMA · 4d ago

What are people doing with their local models and what tools do they use?

A user asks about practical applications of local models and which tools are effective for tasks like coding, particularly as alternatives to web-based interfaces like Claude.ai. They mention trying OpenWebUI but find it underpowered without significant customization.

media r/LocalLLaMA · 4d ago

Introducing Noema Atlas: Peer-to-Peer Model Distribution

Noema Atlas is a free, open-source peer-to-peer network that enables decentralized distribution of local LLM models using Iroh and BLAKE3 hashing. It allows users to share and retrieve models directly from peers worldwide, with Hugging Face and mirrors as fallbacks, and supports rescuing models removed from Hugging Face through private sharing.

github llama.cpp · 4d ago

llama.cpp Release b9741 Adds New Binaries and Support

llama.cpp version b9741 introduces new binaries for macOS, Linux, Android, Windows, and openEuler across multiple architectures. The release includes support for Vulkan, CUDA 12.4 and 13.3, OpenVINO, SYCL, and ROCm, with updated versions for iOS and Ubuntu.

github llama.cpp · 5d ago

llama.cpp release b9738: fixes CORS auth header forwarding and new binary builds

llama.cpp version b9738 fixes the CORS proxy to avoid forwarding authentication headers. The release includes binary builds for macOS, Linux, Android, Windows, and openEuler across multiple architectures and hardware acceleration options, including Vulkan, CUDA, OpenVINO, and SYCL.

github llama.cpp · 5d ago

ggml optimizes AMX with partition flattening

The ggml project has optimized AMX performance by flattening the partition over n_batch * M, ensuring all threads participate in quantization. This change improves speed by up to 1.47x across various models and hardware configurations on CPU and GPU platforms, with results showing consistent gains in inference time.

media r/LocalLLaMA · 5d ago

Worlds Biggest Chat Title Dataset Released by SupraLabs

SupraLabs has released a curated chat title dataset with 115K samples, surpassing the previous record of 10K samples. The filtered dataset is available as `SupraLabs/chat-titles-filtered-115K`, while an unfiltered version with 150K samples is also provided, along with a legacy 12K dataset.