v1.39.0
This release attempts to fix the Flatpak build.
Researchers have developed Generative Causal Testing (GCT), a framework that translates uninterpretable LLM-based brain-prediction models into concise, testable verbal hypotheses about cortical function. This method distills model parameters into short phrases describing what specific brain regions respond to, such as "food preparation," and then verifies these explanations through targeted fMRI experiments.
Google Finance is officially leaving its beta phase and launching a dedicated application for Android devices.
The authors introduce CoorDex, a learning pipeline that enables high-degree-of-freedom dexterous loco-manipulation on moving humanoids by converting body and hand control into coordinated latent residual control. This approach allows the Unitree G1 humanoid to perform complex tasks like non-stop bottle grasping and fridge door opening while in motion.
Hugging Face has introduced a new feature that allows users to deploy vLLM servers directly through the Hugging Face Jobs platform using a single command.
This release candidate fixes Prefill/Decode (P/D) functionality when used together with the Data Parallelism (DP) Supervisor in the vLLM project.
AutoDex is an automated system designed to close the loop of real-world dexterous grasping data collection by handling perception, execution, labeling, and reset without human intervention. It addresses the scalability issues of teleoperation and the lack of physical certification in simulation by generating candidate grasps and verifying them on real hardware.
This study proposes a unified hard-soft physics-informed neural network (HSPINN) with adaptive loss weighting to address the slow convergence and inaccurate boundary enforcement of conventional PINNs. The framework enforces Dirichlet and periodic boundary conditions exactly through analytical lifting or masking, while treating PDE residuals and initial conditions as soft constraints balanced by an inverse-share softmax strategy.
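To make the idea of exact boundary enforcement concrete, here is a minimal sketch of a generic lifting-plus-masking construction for Dirichlet conditions on the interval [0, 1]. It is not taken from the paper: the function names, the linear lift, the mask x(1 - x), and the stand-in for the network are all illustrative assumptions, and the adaptive loss weighting is not shown.

```python
import math


def lift(x, left, right):
    """Linear function that matches the Dirichlet values at x=0 and x=1."""
    return (1.0 - x) * left + x * right


def mask(x):
    """Vanishes at x=0 and x=1, so the network term cannot disturb the boundary."""
    return x * (1.0 - x)


def raw_network(x):
    """Hypothetical stand-in for an unconstrained neural network output."""
    return math.sin(3.0 * x) + 0.5


def hard_constrained_u(x, left=2.0, right=-1.0):
    """Trial solution u(x) = lift(x) + mask(x) * N(x).

    The boundary values hold by construction, whatever raw_network returns,
    so no boundary loss term is needed for them.
    """
    return lift(x, left, right) + mask(x) * raw_network(x)


if __name__ == "__main__":
    for x in (0.0, 0.25, 0.5, 1.0):
        print(x, hard_constrained_u(x))
```

With this kind of construction, only the PDE residual and any remaining conditions need to appear in the training loss, since the Dirichlet values are satisfied for every choice of network parameters.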
The article introduces ChemGuard, an operational protocol that formalizes the overlooked admission stage of molecular learning pipelines by requiring sanitizable strings and consistent graph reconstruction. This framework reveals that many existing graph-based backdoors lose efficacy because their poisons are chemically invalid or representation-inconsistent.
This article addresses the challenge of over-alignment in large language models used within Swiss Federal Supreme Court criminal law contexts, where model guardrails frequently trigger refusals when processing sensitive case details. The authors introduce TF-RefusalBench, a multilingual benchmark derived from public rulings, to measure this phenomenon across French, German, Italian, and English.
This study introduces energy-based transformers as a novel measure for predicting human reading difficulty, establishing a formal link between transformer models and the associative memory literature, including Hopfield networks.
The authors propose Diffusion-LLM, a framework that integrates a conditional diffusion model into an LLM-based pipeline to address challenges in multimodal time series forecasting. This joint design enables the learning of future data distributions while improving semantic alignment within a shared latent space.
A developer has released a free, simple Retrieval-Augmented Generation (RAG) API powered by medical Wikipedia articles to provide local large language models with accurate factual information. The service aims for subsecond responses and currently runs on a single ARM VPS using approximately 2GB of RAM.
A user on Reddit asks whether Nvidia has disclosed the support lifecycle for the operating system running on DGX Spark hardware. The inquiry specifically concerns how long the OS will be supported and whether users will be forced to upgrade to new products in the near future, for example by 2028.
This paper presents a human-in-the-loop framework for automatically identifying and repairing semantic errors in SysML v2 models that compilers cannot detect. The approach combines fine-tuned Small Language Models with a domain knowledge graph to ground repair suggestions in valid engineering constraints.
Litmus is a zero-label system that designs evaluation and monitoring metrics for AI pipelines by eliciting evaluation intent from source code and targeted interrogation. Instead of assuming the evaluation target is known, it identifies what must be measured and why to construct a justified metric portfolio.
The emergence of Large Reasoning Models has introduced exceptionally long Chain-of-Thought traces, creating a transparency burden where critical logic is often buried under massive procedural text. To address this, the authors present ReasoningLens, an open-source framework designed for the hierarchical visualization and diagnostic auditing of complex reasoning chains.
HyperQuant is a unified post-training quantization pipeline designed for the weights and KV cache of large language and diffusion transformers, combining Hadamard transforms with optimal lattice quantization. The method outperforms recent schemes like HIGGS, TurboQuant, and OCTOPUS across various bit rates while maintaining near-lossless quality.
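As a rough illustration of the rotate-then-quantize pattern, the sketch below applies an orthonormal fast Walsh-Hadamard transform to a weight vector, quantizes the rotated coefficients, and inverts the rotation. It is not HyperQuant's method: plain symmetric uniform scalar quantization is swapped in for the optimal lattice quantizer, and all function names and parameters are hypothetical.

```python
import math


def fwht(values):
    """Orthonormal fast Walsh-Hadamard transform (length must be a power of two)."""
    n = len(values)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    out = list(values)
    h = 1
    while h < n:
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                a, b = out[i], out[i + h]
                out[i], out[i + h] = a + b, a - b
        h *= 2
    scale = 1.0 / math.sqrt(n)
    return [v * scale for v in out]


def quantize(values, bits):
    """Symmetric uniform quantization: signed integer codes plus one shared scale.

    Assumption: this scalar quantizer stands in for a lattice quantizer.
    """
    levels = 2 ** (bits - 1) - 1
    peak = max(abs(v) for v in values) or 1.0  # avoid a zero scale
    scale = peak / levels
    codes = [round(v / scale) for v in values]
    return codes, scale


def dequantize(codes, scale):
    """Map integer codes back to floating-point values."""
    return [c * scale for c in codes]


def roundtrip(weights, bits=4):
    """Rotate, quantize, dequantize, and rotate back."""
    rotated = fwht(weights)
    codes, scale = quantize(rotated, bits)
    # The orthonormal Walsh-Hadamard transform is its own inverse.
    return fwht(dequantize(codes, scale))


if __name__ == "__main__":
    weights = [1.0, 0.0, 0.0, 8.0]  # one large outlier
    print(fwht(weights))            # the outlier's energy is spread over all entries
    print(roundtrip(weights, bits=8))
```

The rotation spreads a single outlier across all coordinates, which tends to make a shared quantization scale less wasteful. Because the transform is orthonormal, the L2 quantization error in the rotated domain carries over unchanged to the reconstructed weights.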
UnBias-Plus is an open-source toolkit designed to address persistent bias in natural language by unifying detection, explanation, and neutral rewriting capabilities.
The authors present Locate-and-Judge, a two-stage detector designed to identify malicious skills in LLM agent marketplaces where traditional prompt-injection defenses fail.