media r/LocalLLaMA · 15h ago

LFM2.5 230M Runs In-Browser at 1,400 tok/s via Custom WebGPU Kernels

LiquidAI's LFM2.5-230M model now runs locally in the browser on custom WebGPU kernels. The kernels were originally written by Fable 5 (before its shutdown) and Opus 4.8. The demonstration was recorded on an M4 Max and reached a generation speed of 1,400 tokens per second. All processing happens in the user's browser, with no external server dependencies. A GGUF version of the model is available for download on Hugging Face alongside the standard checkpoint, and a live demo is hosted by webml-community on Hugging Face Spaces.

media r/LocalLLaMA · 15h ago

Apple to Skip M6 Pro/Max Chips, Fast-Track M7 for Local AI

A recent report indicates that Apple plans to skip the M6 Pro and M6 Max chips in its upcoming lineup and instead fast-track the M7 chip series to better support local AI workloads. The shift suggests that Apple is prioritizing on-device AI capabilities over traditional performance increments for the Pro tier, in line with its growing emphasis on building machine learning features directly into its hardware. By accelerating the M7 timeline, Apple aims to provide stronger Neural Engine performance for running large language models locally. If the report is accurate, it marks a significant pivot in the Apple Silicon roadmap toward AI-centric design.

arxiv arXiv cs.AI · 15h ago

AOHP: An Open-Source OS-Level Agent Harness for Personalized, Efficient and Secure Interaction

The Android Open Harness Project (AOHP) is an open-source, OS-level agent harness built on the Android Open Source Project. It addresses the mismatch between today's application-centric operating systems and the needs of autonomous AI agents by treating agents as first-class OS actors. The design introduces three key mechanisms: personalized service composition, efficient agent interfaces, and secure information flow. Together these enable adaptive user interfaces and agent-friendly runtime environments while preserving the existing Android ecosystem. In preliminary experiments on challenging tasks, AOHP achieved a 21.12% increase in task completion rates over baseline methods and reduced token execution costs by 51.55%. The system also showed improved compliance with security policies during agent-mediated interactions.

arxiv arXiv cs.AI · 15h ago

Rise of Militarized Language in Scientific Abstracts Erodes Credibility

A study analyzing 21.4 million papers from OpenAlex and PubMed finds that militaristic terms in scientific abstracts rose by 48% in OpenAlex and 32% in PubMed between 2010 and 2025. The increase accelerated sharply after 2019 and correlates strongly with global conflict data at both country and annual scales. Social sciences show the highest prevalence of such language, while engineering and computer science show the fastest growth. The analysis also notes that the COVID era and the post-2022 large-language-model period narrowed the linguistic gap between native and non-native English authors. To assess the impact of this trend, the researchers ran a within-subject war-framing experiment with 801 participants and over 32,000 trials. War framing significantly reduced perceived credibility, funding willingness, and policy support among readers. Although the sense of urgency rose at trend level, the overall findings suggest that militaristic language may undermine the persuasive power of scientific communication.