Ornith 35B works reasonably well with Qwen3.6 35B DFlash speculative model
A user reports a 30-40% increase in token generation speed when running the Ornith-1.0-35B model in llama-server with Qwen3.6-35B-A3B-DFlash as the speculative (draft) model.
Researchers have introduced PHANTOM, a large-scale, open-source dataset containing 47,524 pre-generated adversarial attacks designed to evaluate the safety and robustness of vision-language models (VLMs). This resource consolidates and extends prior benchmarks by covering 10 high-level categories and 55 subcategories of harmful intents, aiming to lower the computational barriers for adversarial research.
This article introduces Female-RHINO, a real-time AI-assisted framework that integrates with MRI scanners to perform automated quantitative uterine analysis and structured reporting during image acquisition. The system combines deep learning models for segmentation and landmark detection to derive biomarkers from sagittal T2-weighted pelvic MRI without manual interaction.
The authors introduce Age of LLM, a turn-based 1v1 benchmark where two large language models compete on a 13x7 grid to destroy an enemy base under conditions of fog of war and full diplomacy. This private engine mitigates data contamination by using fresh random map seeds and opponents for each match.
The article introduces ATRIA, a multi-agent system for ECG reporting that addresses the limitations of existing end-to-end models and single-pass agents by mirroring the clinician's iterative workflow.
This study evaluates whether any single decoding pipeline dominates across subjects in motor imagery brain-computer interfaces by testing 1,056 configurations on three public datasets using rigorous statistical benchmarks.
This article addresses the problem of resolving entities in large datasets using an oracle that clusters records in limited batches, aiming for a pay-as-you-go approach to control costs while maximizing recall.
This paper introduces Agentic-LTPO, a nested bilevel optimization framework designed to address the limitations of fixed-objective methods in physical layer systems facing dynamic operator policies and real-time constraints. The framework utilizes agentic AI to generate upper-level configurations that translate evolving policies and historical experiences into structured lower-level problems for immediate decision-making.
Chris Tidesson announces the founding of Second Circuit, an NGO dedicated to supporting self-determined AI use and encouraging open-source software adoption among governments, companies, and private individuals. The organization was originally established in response to the ChatGPT 4o situation and has operated a Discord community for over six months.
This Reddit post from the r/LocalLLaMA community discusses a statement made by Dario Amodei. The content is limited to the title and metadata, with no detailed text or analysis provided in the source.
This study evaluates whether spectral filtering can accelerate continuous subgraph matching (CSM) on dynamic graphs, finding that while lazy maintenance is ineffective, selective exact maintenance offers significant performance gains.
A multi-layered detection framework analyzing 180 million Git repositories reveals that single-signal methods significantly underestimate the prevalence of generative AI coding agents, missing up to 97% of activity. The study identifies over 320,000 commits per month from agents like Claude Code, which dominates silent adoption through configuration files rather than bot accounts.
This paper investigates how classical image transformations affect embeddings in latent space using encoder networks from Lunit Inc., Bioptimus, and Meta Research Team.
This article introduces PCFM, a flow matching approach for medical point cloud completion that integrates Point Transformer v3 (PTv3), targeting a domain where generative modeling remains insufficiently studied. The method is evaluated on the SkullFix, SkullBreak, and Mandibular Defect datasets against strong deterministic and diffusion baselines.
The authors propose ReM-MoA, a memory-augmented Mixture-of-Agents framework designed to sustain performance gains as model depth increases, addressing the degradation and saturation issues found in existing variants. The system utilizes a Ranked Reasoning Memory and a Curated Diversified Memory Routing scheme to preserve exploration diversity while propagating high-quality reasoning traces across layers.
Researchers propose NoContactNoWorries, a transformer-based framework that infers binary contact states during in-hand manipulation by fusing RGB-D vision with robot proprioception. This approach serves as a scalable pseudo-tactile signal, avoiding the cost and fragility associated with dedicated hardware tactile sensors.
This article introduces a Bayesian controller for orchestrating modern coding agents, addressing the limitations of fixed-rule systems that ignore uncertainty during tool use.
The provided source content is a Reddit submission link and does not contain the article text or discussion details.
A Reddit user proposes that OpenAI should launch a powerful open-source model, referred to as GPT-OSS-2, timed with Anthropic's upcoming IPO.
A developer has released an optimized C++ implementation of Qwen3-TTS, achieving approximately 5x realtime speed on an RTX 5080, alongside a cross-platform desktop GUI built with Kotlin Compose Multiplatform. The project provides GGML-based inference that supports both CPU and CUDA execution on Windows and Linux.