AI agents — korshunov.ai

semantic-memory: local-first knowledge base with typed graph edges

semantic-memory is a local-first knowledge base written in Rust and backed by SQLite. It combines BM25 and vector search, merging the two result lists with reciprocal rank fusion. It features typed graph edges for causal, temporal, and semantic relationships, provenance tracking, bitemporal storage, and adaptive query routing, and exposes 18 MCP tools for AI agents. All components run locally without cloud dependencies, API keys, or telemetry.
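As an illustration of reciprocal rank fusion in general (not semantic-memory's actual Rust implementation, which is not shown here), the following is a minimal Python sketch. The function name, the document ids, and the constant k = 60 are assumptions made for the example; k = 60 is a commonly used default, not a value confirmed by the project.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one ranking.

    Each document scores 1 / (k + rank) for every list it appears in,
    where rank starts at 1. k is an assumed smoothing constant.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id so output is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists from a keyword (BM25) search and a vector search.
bm25_hits = ["doc_a", "doc_b", "doc_c"]
vector_hits = ["doc_c", "doc_a", "doc_d"]
fused = reciprocal_rank_fusion([bm25_hits, vector_hits])
print(fused)
```

Documents that rank well in both lists (here doc_a and doc_c) rise above documents found by only one retriever, without the two scoring scales ever needing to be compared directly.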

media r/LocalLLaMA · 5d ago

Struggling to finish Xiaomi Mimo-v2.5-pro token plan credits before expiry

A user has 24B token credits from a Xiaomi token plan competition, worth $50 but obtained for free. They report heavy token consumption during use, limited tool support, and are now concerned about wasting credits due to expiration in four days. The model is praised for its 90% cache hit rate and 99% price reduction on cache hits, with the user noting it performs well in coding and planning tasks.

media r/LocalLLaMA · 5d ago

Board where every tile is an agent

A project called Jaz introduces a board where each tile functions as an independent agent responsible for maintaining its own state. The system is open source and available on GitHub, with a live demo at jaz.chat, requiring a coding agent like Claude Code or Codex to operate.

media r/LocalLLaMA · 5d ago

Deep Neural Network Turns Images into Playable Games Locally

A locally running deep neural network can turn any image into a playable game, using a small Transformer-like model trained from scratch. The model, running on an RTX 5090, generates game sequences autoregressively with real-time keyboard input, though it currently suffers from poor motion and context issues.

media r/LocalLLaMA · 5d ago

Two Word Docs Chatting via Local LLMs — Real Use Cases?

A prototype demonstrates two Word documents exchanging content using local LLMs, with iterative back-and-forth over multiple turns. Potential practical use cases include a draft document and critic document iterating together, or a specification document and implementation document collaborating, though the viability of such workflows remains uncertain.

media r/LocalLLaMA · 5d ago

Research Project: Injecting Natural-Language Tactical Intent into Multi-Agent Football Policies

A research project explores using natural-language tactical instructions from humans to guide autonomous AI agents in a football simulation. The system enables human coaches to issue high-level directives like 'press aggressively' or 'exploit the left side', which the AI agents then adapt to in real time within a dynamic, team-based environment.

media r/LocalLLaMA · 5d ago

Local AI for Local Office Files

A Reddit user asks which AI agent is best for handling local office files like Excel, PDF, Word, and JSON. The post seeks user experiences and implemented workflows for such tasks.

media r/LocalLLaMA · 5d ago

Tool calling issue in open-source Qwen3.6 27B 8K

Users report that the Qwen3.6 27B 8K model occasionally stops processing after generating a tool call, especially when the user steps away. The issue can be resolved by manually pasting the tool call back into the prompt, allowing the model to resume execution. The tool call involves a bash function to find passing tests in a codebase.

media r/LocalLLaMA · 5d ago

Local Agent Web Access via SearXNG and Scrapling

A local agent can access the web without paid APIs by using self-hosted SearXNG for search and Scrapling with Trafilatura for page extraction. The setup avoids vendor dependencies, uses open-source tools, and delivers search results and page content in Markdown format, with fallbacks for CAPTCHAs and security challenges.
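The search half of such a setup can be sketched in Python with the standard library only. This is an assumption-laden illustration, not the post's code: the base URL is hypothetical, the helper names are invented for the example, and the SearXNG instance must have the JSON output format enabled in its settings for `format=json` to work. Page extraction with Scrapling and Trafilatura is left out.

```python
import json
import urllib.parse
import urllib.request


def searxng_url(base_url, query):
    """Build a SearXNG search URL that asks for JSON output."""
    params = urllib.parse.urlencode({"q": query, "format": "json"})
    return f"{base_url.rstrip('/')}/search?{params}"


def results_to_markdown(payload, limit=5):
    """Render the 'results' list of a SearXNG JSON response as a Markdown list."""
    lines = []
    for item in payload.get("results", [])[:limit]:
        url = item.get("url", "")
        title = item.get("title") or url
        lines.append(f"- [{title}]({url})")
        snippet = (item.get("content") or "").strip()
        if snippet:
            lines.append(f"  {snippet}")
    return "\n".join(lines)


def search(base_url, query, limit=5):
    """Query a self-hosted instance (hypothetical URL) and return Markdown."""
    with urllib.request.urlopen(searxng_url(base_url, query), timeout=10) as resp:
        return results_to_markdown(json.load(resp), limit)
```

A call such as `search("http://localhost:8080", "rust sqlite fts5")` would then return a short Markdown list that an agent can read directly, assuming an instance is listening at that address.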

media r/LocalLLaMA · 5d ago

Local agent on 4090 - looking for LM Studio settings

A user reports slow token generation when running a local agent on a 4090 with 24GB VRAM, despite adjusting context and batching settings. They note Gemma4 performs faster but produces incorrect tokens like </tool_call>, and seek recommended settings and explanations for parameters such as top_p and top_k.
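For readers unfamiliar with the two sampling parameters mentioned, here is a minimal Python sketch of how top_k and top_p typically narrow the candidate tokens before one is sampled. It is a simplified illustration, not LM Studio's code: the function name, the default values, and the toy probabilities are assumptions, and real runtimes may apply the filters in a different order or alongside other samplers.

```python
def filter_top_k_top_p(probs, top_k=40, top_p=0.95):
    """Return renormalised token probabilities after top-k, then top-p filtering.

    probs maps token -> probability. top_k keeps the k most likely tokens;
    top_p then keeps the smallest prefix whose cumulative probability
    reaches top_p (nucleus sampling).
    """
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    if top_k > 0:
        ranked = ranked[:top_k]

    # Renormalise so the surviving probabilities sum to 1 before top-p.
    total = sum(p for _, p in ranked)
    ranked = [(token, p / total) for token, p in ranked]

    kept, cumulative = [], 0.0
    for token, p in ranked:
        kept.append((token, p))
        cumulative += p
        if cumulative >= top_p:
            break

    total = sum(p for _, p in kept)
    return {token: p / total for token, p in kept}


# Toy next-token distribution (made up for illustration).
toy = {"the": 0.5, "a": 0.3, "an": 0.15, "zebra": 0.05}
print(filter_top_k_top_p(toy, top_k=3, top_p=0.8))
```

Lower values of either parameter restrict sampling to fewer, more likely tokens, which tends to make output more predictable; higher values admit more of the unlikely tail.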

blog Simon Willison · 5d ago

Sean Lynch on MCP's Auth Flow Isolation

Sean Lynch highlights that the Model Context Protocol (MCP) offers a key advantage by isolating authentication flows outside the agent's context window. He suggests the ideal form of MCP could be a simple auth gateway for APIs, which would still represent a significant improvement.

media r/LocalLLaMA · 6d ago

Best Local Agents - Jun 2026

A discussion thread identifies the best local AI agents available today, emphasizing open-weight models and local hardware execution. The post defines 'agents' as autonomous software that self-determines actions without pre-programming, distinguishing them from tools like IFTTT or Apple Shortcuts, and sets rules requiring local deployment and open-source agent software as a primary focus.

media r/LocalLLaMA · 6d ago

Help Running Local Hermes Agent with llama-cpp

A user reports issues running a local Hermes AI agent on a high-end rig using self-compiled llama-cpp. The setup reprocesses the KV cache every 5 messages and reasons slowly, and the agent repeatedly pauses to report progress instead of continuing autonomously. The user asks whether their llama-cpp parameters are incorrect and which adjustments would improve agent performance and allow sustained, uninterrupted reasoning.

media r/LocalLLaMA · 6d ago

Improving local models with an API-based consultant agent

A user asks whether adding a powerful API-based 'consultant' agent, such as GLM 5.2, could enhance local AI workflows by refining plans and learning processes. The post explores the potential benefits of such an agent in improving local model performance through external consultation.

media r/LocalLLaMA · 6d ago

Best Harness for Web Searching

Users report that tools like LM Studio and Odysseus are limited by search engine request caps, often 10 requests per hour or per day, when no API access is configured. They suggest creating DuckDuckGo API accounts for better search access, but note that frontends rarely prompt for this. The post asks whether Hermes or Pi offer improved solutions.

media r/LocalLLaMA · 6d ago

Watching a Local AI Voice Assistant Get Dumber

A test on an RTX 5060 Ti showed that reducing a local AI voice assistant's model size from 9B to 0.8B leads to a sharp decline in capability. The 9B model handles tool orchestration well, while smaller models show increasing failures: the 4B model skips tool calls and guesses facts, the 2B model suffers semantic drift, and the 0.8B model fails to operate agent functions, triggering wrong APIs or infinite loops.

media r/LocalLLaMA · 6d ago

New Agentic Benchmark Released

Artificial Analysis has introduced a new agentic benchmark that evaluates large language models' ability to plan and execute tasks. Claude Fable and GLM 5.2 achieved top positions within their respective cohorts, demonstrating strong performance on this unsaturated benchmark.

media r/LocalLLaMA · 6d ago

Multi Doc Agent Workflows in Word

A blog post details how to implement multi-document agent workflows in Microsoft Word using local LLMs. The guide outlines steps to enable agents to process and interact with multiple documents within a single Word environment.

media r/LocalLLaMA · 6d ago

Ohio State University releases open-source Deep Research agent QUEST-35B

Ohio State University's NLP team has released QUEST-35B, an open-source Deep Research agent trained on approximately 32 H100 GPUs using 8,000 synthetic samples. The team open-sourced the training recipe, code, weights, and datasets, with benchmark results showing competitive performance compared to leading closed-source Deep Research systems.
