All articles — korshunov.ai

Skipping transformer blocks at runtime with llama.cpp

A fork of llama.cpp introduces a --skip-layers flag that lets users omit entire transformer blocks at load time, offering an alternative or complement to quantization for fitting models onto limited hardware.

media r/LocalLLaMA · 3h ago

Best way to test models at different quants before buying GPUs

A Reddit user asks for the most effective way to test model performance across quantization levels before purchasing new hardware.

github llama.cpp · 3h ago

llama.cpp b9840 release adds DeepSeek V4 support and multi-platform binaries

The llama.cpp b9840 release introduces conversion support for the DeepSeek V4 model, including specific handling for the Pro variant. This update integrates the new architecture into the library alongside various internal optimizations and bug fixes.

arxiv arXiv cs.LG · 4h ago

LoadKAN: Interpretable Kolmogorov-Arnold Network for Electricity Load Forecasting

This study introduces LoadKAN, a novel hybrid framework that combines a feature-isolated temporal attention mechanism with a Kolmogorov-Arnold network (KAN) to address the lack of interpretability in deep learning-based electricity load forecasting.

arxiv arXiv cs.LG · 4h ago

STAITUS: Disentangling Appearance and Pose for Video Object Tracking

The article introduces STAITUS, a unified framework for unsupervised video object tracking that addresses the limitations of existing slot-based representations by explicitly disentangling appearance from geometric pose. By applying temporal alignment only in appearance space and enforcing spatial separation within frames, the method prevents slots from locking onto static backgrounds during motion.

arxiv arXiv cs.LG · 4h ago

What Does a Chemical Language Model Know About Molecules?

This study applies sparse autoencoders to MolFormer to mechanistically examine how molecular representations are built across layers, challenging the assumption that chemical language models only learn surface-level syntax.

arxiv arXiv cs.LG · 4h ago

SkyJEPA: Learning Long-Horizon World Models for Zero-Shot Sim-to-Real Control of Quadrotors

This work introduces SkyJEPA, a JEPA-style model designed for real-time quadrotor control that addresses the error amplification issues inherent in autoregressive long-horizon forecasting. The approach combines a latent dynamics model with a physics-inspired prober to map frozen latents to interpretable states, enabling physically grounded predictions.

arxiv arXiv cs.LG · 4h ago

Collapsed Effective Operators for Higher-order Structures

The authors introduce Collapsed Effective Operators, a method that condenses higher-order degrees of freedom into a single vertex-level operator using Schur complementation of a graded Laplacian. This approach yields a dense operator encoding long-range interactions mediated by topology and is applicable to arbitrary higher-order constructs.

media r/LocalLLaMA · 4h ago

DeepSeek V4 official version will launch in mid-July

An email sent from DeepSeek indicates that the official version of DeepSeek V4 is scheduled to launch in mid-July. This information was shared via a translated image originally available only to Chinese users.

media r/LocalLLaMA · 4h ago

Slow performance Unsloth Gemma 12B Q8

A user reports a significant drop in inference speed when switching from GPT-OSS 20B Q4 to Gemma 4 12B Q8 using llama.cpp, with throughput falling from approximately 70 tokens per second to 10 tokens per second. The issue persists even with a Q5 variant of the model and with the thinking feature disabled, which yielded a gain of only two tokens per second.

github llama.cpp · 4h ago

llama.cpp b9839 release with Tailwind scanning fix and multi-platform binaries

The llama.cpp project has released version b9839, which includes a fix to restore Tailwind scanning in ignored worktrees. This update provides pre-built binaries for macOS, Linux, Android, Windows, and openEuler across various architectures and hardware acceleration backends.

lab OpenAI News · 4h ago

Mapping Europe’s AI Workforce Opportunity

OpenAI Economic Research has extended its AI Jobs Transition Framework to the European Union, utilizing ESCO taxonomy and Eurostat data to analyze how AI capabilities may reshape labor markets across member states.

arxiv arXiv cs.LG · 5h ago

Selective Time Series Forecasting via Metalearning

This article introduces a selective forecasting framework that allows models to abstain from high-risk predictions by modeling the empirical percentile of forecasting errors through metalearning. By using scale-invariant statistics derived from recent lags, the method decouples rejection decisions from forecasts to enable transfer across heterogeneous time series.

arxiv arXiv cs.LG · 5h ago

Do Location Encoders Capture Spatial Effects? A GeoShapley Benchmark Across Scales

This study benchmarks whether GeoShapley, a game-theoretic explainer, can recover spatially varying coefficients from machine learning models using location encoder embeddings. Eleven encoders from the TorchSpatial framework were evaluated against a synthetic process with known coefficients across grid, county, and global scales.

arxiv arXiv cs.LG · 5h ago

Time Series Classification through Diffeomorphic Time Warping (DiffTW)

The article introduces Diffeomorphic Time Warping (DiffTW), a theoretical framework for time series classification that learns mappings between real-valued functions to overcome the discrete point matching limitations of Dynamic Time Warping (DTW). DiffTW approximates diffeomorphic transformations using the method of characteristics to solve linear transport equations, providing a theoretically grounded dissimilarity measure.

arxiv arXiv cs.LG · 5h ago

Sublinearly Structured Deep Neural Networks Achieve Feature Learning Consistency for Compositional Functions

This study establishes feature-learning consistency guarantees for a broad subclass of deep neural networks characterized by sublinear growth in input/output dimensions and hidden neurons relative to sample size. The authors prove that these architectures achieve universal approximation for hierarchically compositional functions, even within the conventional over-parameterized regime where parameters exceed training samples.

arxiv arXiv cs.LG · 5h ago

TROPT: An Open Framework for Unifying and Advancing Discrete Text Optimization

TROPT is introduced as the first open-source framework that unifies discrete text-trigger optimization by standardizing execution and development under a single interface. It addresses current fragmentation by allowing users to customize end-to-end optimization recipes through interchangeable models, objectives, and optimizers.

arxiv arXiv cs.LG · 5h ago

FLKit: A Structured Onboarding Toolkit for Federated Learning in Health

FLKit is an open, community-maintained onboarding toolkit designed to help multidisciplinary teams navigate the federated learning lifecycle in health and life sciences research. It provides role-aware entry points for clinical, legal, governance, and technical contributors, addressing the practical barriers of scattered frameworks and governance obligations.

arxiv arXiv cs.LG · 5h ago

FairBED: A Bayesian Experimental Design Approach to Gathering Fairer Data

The article introduces FairBED, a framework that modifies the data acquisition process itself to gather inherently fairer data, addressing biases present in existing datasets. It provides novel formulations for quantifying dataset fairness based on the principle that fair datasets should be uninformative about sensitive attributes.

media r/LocalLLaMA · 5h ago

DeepSeek V4 by am17an · Pull Request #24162 · ggml-org/llama.cpp

A pull request submitted to the ggml-org/llama.cpp repository enables local execution of the DeepSeek V4 model.