arxiv arXiv cs.AI · 20h ago

Measuring & Mitigating Over-Alignment for LLMs in Multilingual Criminal Law Courts

This article addresses over-alignment in large language models applied to Swiss Federal Supreme Court criminal law cases, where model guardrails frequently trigger refusals on sensitive case details. The authors introduce TF-RefusalBench, a multilingual benchmark derived from public rulings, to measure these refusals across French, German, Italian, and English.

media r/LocalLLaMA · 22h ago

LFM2.5 230M Runs In-Browser at 1,400 tok/s via Custom WebGPU Kernels

The LiquidAI LFM2.5-230M model now runs locally in the browser using custom WebGPU kernels. The kernels were originally developed by Fable 5 (before its shutdown) and Opus 4.8. The demonstration was recorded on an M4 Max device and reached a generation speed of 1,400 tokens per second. All processing happens in the user's browser, with no external server dependencies. A GGUF version of the model is available for download on Hugging Face alongside the standard checkpoint. A live demo is hosted by the webml-community on Hugging Face Spaces.

media r/LocalLLaMA · 22h ago

Apple to Skip M6 Pro/Max Chips, Fast-Track M7 for Local AI

According to a recent report, Apple plans to skip the M6 Pro and M6 Max chips and instead fast-track the M7 series to better support local AI workloads. The shift suggests that Apple is prioritizing on-device AI capabilities over conventional performance increments for the Pro tier, reflecting its growing emphasis on building machine learning features directly into its hardware. By accelerating the M7 timeline, Apple aims to deliver stronger Neural Engine performance for running large language models locally. The move would mark a significant pivot in the Apple Silicon roadmap toward AI-centric design.