All articles — korshunov.ai


Kimi and GLM on frontier code

This Reddit post by user Charuru shares an image titled "Kimi and GLM on frontier code" as a discussion starter on how the Kimi and GLM models perform on coding tasks.

media Hugging Face Forums · 4h ago

Ainara: Local-first AI assistant with persistent memory and LLM switching

Ainara is a local-first desktop application from a Dublin-based developer that functions as an AI companion with persistent memory across sessions. Users can switch between cloud models such as Grok, Claude, and Gemini and local Ollama models without losing context.

media Hugging Face Forums · 4h ago

Practical experience with ML surrogates for CFD and FEA simulations?

An engineering simulation professional seeks real-world deployment experiences of machine learning surrogates to reduce the cost of expensive Computational Fluid Dynamics (CFD) and Finite Element Analysis (FEA) solver runs.

lab Meta AI / FAIR Blog · 4h ago

Brain2Qwerty v2 Achieves 61% Word Accuracy in Non-Invasive Brain-to-Text Decoding

Researchers have released Brain2Qwerty v2, a non-invasive AI pipeline that decodes sentences in real time from magnetoencephalography (MEG) recordings without surgical implants. The system achieves a 61% word accuracy rate overall and up to 78% for top performers, significantly outperforming previous non-invasive methods.

media AI News (smol.ai) · 5h ago

OpenAI expands Daybreak, Sakana releases Fugu, GLM-5.2 gains traction

This week's AI news highlights OpenAI's expansion of its cybersecurity initiatives, Sakana AI's release of an orchestration model called Fugu, and the growing adoption of the open-weight GLM-5.2 model.

arxiv arXiv cs.LG · 5h ago

Leveraging Similarities in Multi-Armed Bandits

This study investigates online learning with similarity-structured action sets encoded by rooted trees, demonstrating that standard one-point feedback cannot exploit these similarities. The authors propose unified algorithms for richer feedback models that replace the number of actions with a similarity-aware effective count to improve regret bounds.

arxiv arXiv cs.LG · 5h ago

GRINQH: Graded Input-based Quantization Hierarchy for Efficient LLM Generation

Researchers propose GRINQH, a weight-only post-training quantization framework that accelerates large language model decoding by unifying quantization and sparsification. The method dynamically assigns weight channels to different precision levels based on activation magnitudes, addressing the memory-bound nature of the decoding stage.

media r/LocalLLaMA · 5h ago

Any good uses for a 192 GB DDR3 Server in the LLM world?

A Reddit user asks for ideas on utilizing an old IBM System X V4 server equipped with dual Xeon E5-2640 processors and 192 GB of DDR3 ECC RAM for large language models.

media r/LocalLLaMA · 5h ago

How can I get better response time by caching my system prompt?

A user on r/LocalLLaMA asks how to reduce the approximately 10-second processing time required for a 7.1k token system prompt in every new session when using Ornith 35b with llama.cpp.

media r/LocalLLaMA · 5h ago

Is it ever possible to have a malicious LLM with a backdoor

A Reddit user asks whether large language models could be trained to recognize a specific secret sentence that unlocks malicious behavior, raising concerns about security risks for both closed and open-source models.

media r/LocalLLaMA · 5h ago

DeepSeek V4 official launch set for mid-July with API price changes

A post in the r/LocalLLaMA community discusses an image suggesting that DeepSeek V4 will officially launch in mid-July with changes to its API pricing.

media r/LocalLLaMA · 5h ago

Skipping transformer blocks at runtime with llama.cpp

A fork of llama.cpp introduces a --skip-layers flag that lets users omit entire transformer blocks at load time, offering an alternative or complement to quantization for fitting models into limited hardware.

media r/LocalLLaMA · 5h ago

Best way to test models at different quants before buying GPUs

A Reddit user is seeking advice on the most effective method for testing model performance across various quantization levels prior to purchasing new hardware.

github llama.cpp · 5h ago

llama.cpp b9840 release adds DeepSeek V4 support and multi-platform binaries

The llama.cpp b9840 release introduces conversion support for the DeepSeek V4 model, including specific handling for the Pro variant. This update integrates the new architecture into the library alongside various internal optimizations and bug fixes.

arxiv arXiv cs.LG · 7h ago

LoadKAN: Interpretable Kolmogorov-Arnold Network for Electricity Load Forecasting

This study introduces LoadKAN, a novel hybrid framework that combines a feature-isolated temporal attention mechanism with a Kolmogorov-Arnold network (KAN) to address the lack of interpretability in deep learning-based electricity load forecasting.

arxiv arXiv cs.LG · 7h ago

STAITUS: Disentangling Appearance and Pose for Video Object Tracking

The article introduces STAITUS, a unified framework for unsupervised video object tracking that addresses the limitations of existing slot-based representations by explicitly disentangling appearance from geometric pose. By applying temporal alignment only in appearance space and enforcing spatial separation within frames, the method prevents slots from locking onto static backgrounds during motion.

arxiv arXiv cs.LG · 7h ago

What Does a Chemical Language Model Know About Molecules?

This study applies sparse autoencoders to MolFormer to mechanistically examine how molecular representations are built across layers, challenging the assumption that chemical language models only learn surface-level syntax.

arxiv arXiv cs.LG · 7h ago

SkyJEPA: Learning Long-Horizon World Models for Zero-Shot Sim-to-Real Control of Quadrotors

This work introduces SkyJEPA, a JEPA-style model designed for real-time quadrotor control that addresses the error amplification issues inherent in autoregressive long-horizon forecasting. The approach combines a latent dynamics model with a physics-inspired prober to map frozen latents to interpretable states, enabling physically grounded predictions.

arxiv arXiv cs.LG · 7h ago

Collapsed Effective Operators for Higher-order Structures

The authors introduce Collapsed Effective Operators, a method that condenses higher-order degrees of freedom into a single vertex-level operator using Schur complementation of a graded Laplacian. This approach yields a dense operator encoding long-range interactions mediated by topology and is applicable to arbitrary higher-order constructs.
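
For readers unfamiliar with Schur complementation, the following is a minimal sketch of the general idea, not the authors' method. It assumes a small symmetric matrix split into a vertex block and a higher-order block, and eliminates the higher-order block to leave an effective vertex-level operator. The function name schur_complement, the block ordering, and the toy matrix are all hypothetical choices made for illustration; the paper's graded Laplacian is not reproduced here.

```python
from fractions import Fraction


def schur_complement(L, n_vertex):
    """Eliminate every row/column with index >= n_vertex from the square matrix L.

    Returns L_vv - L_vh * inv(L_hh) * L_hv, computed by eliminating the
    higher-order indices one at a time. Assumes each pivot met along the way
    is nonzero (true, for example, when the higher-order block is positive
    definite); no pivoting is attempted.
    """
    n = len(L)
    # Work on an exact-arithmetic copy so the input is left untouched.
    A = [[Fraction(x) for x in row] for row in L]
    # Eliminate higher-order degrees of freedom, last index first.
    for k in range(n - 1, n_vertex - 1, -1):
        pivot = A[k][k]
        if pivot == 0:
            raise ValueError("zero pivot at index %d" % k)
        for i in range(k):
            factor = A[i][k] / pivot
            for j in range(k):
                A[i][j] -= factor * A[k][j]
    # What remains in the top-left block is the effective vertex operator.
    return [row[:n_vertex] for row in A[:n_vertex]]


# Hypothetical toy example: vertices 0 and 1 are not directly connected,
# but both couple to one auxiliary degree of freedom (index 2).
L = [
    [1, 0, -1],
    [0, 1, -1],
    [-1, -1, 2],
]
effective = schur_complement(L, n_vertex=2)
for row in effective:
    print([str(x) for x in row])
```

In this toy case the two vertices share no direct entry in the original matrix, yet the reduced operator has a nonzero off-diagonal term. This mirrors the summary's point that the collapsed operator is dense and encodes interactions mediated by the eliminated structure.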

media r/LocalLLaMA · 7h ago

DeepSeek V4 official version will launch in mid-July

An email sent from DeepSeek indicates that the official version of DeepSeek V4 is scheduled to launch in mid-July. This information was shared via a translated image originally available only to Chinese users.