arXiv cs.AI · 6h ago

PHANTOM: A Large-Scale Dataset of Multimodal Adversarial Attacks for Vision-Language Models

Researchers have introduced PHANTOM, a large-scale, open-source dataset containing 47,524 pre-generated adversarial attacks designed to evaluate the safety and robustness of vision-language models (VLMs). This resource consolidates and extends prior benchmarks by covering 10 high-level categories and 55 subcategories of harmful intents, aiming to lower the computational barriers for adversarial research.

arXiv cs.AI · 6h ago

Agentic AI for Bilevel Long-Term Optimization of Policy-Driven Physical Layer Systems

This paper introduces Agentic-LTPO, a nested bilevel optimization framework for physical layer systems that must cope with dynamic operator policies and real-time constraints, a setting where fixed-objective methods fall short. The framework uses agentic AI to generate upper-level configurations that translate evolving policies and historical experience into structured lower-level problems for immediate decision-making.

arXiv cs.AI · 7h ago

Detecting AI Coding Agents in Open Source: A Validated Multi-Method Census of 180 Million Repositories

A multi-layered detection framework applied to 180 million Git repositories shows that single-signal methods significantly underestimate the prevalence of generative AI coding agents, missing up to 97% of activity. The study identifies over 320,000 commits per month from agents such as Claude Code, which dominates silent adoption: usage that is visible through configuration files rather than bot accounts.

arXiv cs.AI · 8h ago

The African Language Tax: Quantifying the Cost, Latency, and Context Penalty of Tokenizing African Languages in Frontier LLMs

A study quantifies the structural tokenization penalty that African languages face in commercial large language models: because subword tokenizers split their text into more tokens, speakers pay more and experience greater latency. Across 20 African languages and 11 frontier tokenizers, every tested language incurs a premium over English, with median costs reaching 1.88 times that of English and up to 8.92 times for N'Ko script.
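
To make the "premium over English" concrete, here is a minimal sketch of how such a ratio could be computed from parallel text. It is an illustration under stated assumptions, not the study's method: the function names are hypothetical, and `count_tokens` is a stand-in that counts UTF-8 bytes rather than calling any real frontier tokenizer.

```python
from statistics import median
from typing import Callable, Dict


def count_tokens(text: str) -> int:
    """Hypothetical stand-in tokenizer: one token per UTF-8 byte.

    A real measurement would replace this with the token count
    returned by the tokenizer under test.
    """
    return len(text.encode("utf-8"))


def token_premium(
    text: str,
    english_text: str,
    tokenize: Callable[[str], int] = count_tokens,
) -> float:
    """Ratio of tokens needed for `text` versus its English equivalent."""
    return tokenize(text) / tokenize(english_text)


def median_premium(
    parallel: Dict[str, str],
    english_text: str,
    tokenize: Callable[[str], int] = count_tokens,
) -> float:
    """Median premium across languages for one parallel passage.

    `parallel` maps a language name to its translation of `english_text`.
    """
    return median(
        token_premium(t, english_text, tokenize) for t in parallel.values()
    )
```

A premium of 1.0 means parity with English; under per-token pricing, a premium of 1.88 would mean paying roughly 1.88 times as much for the same content. Passing a different `tokenize` callable allows the same passage to be compared across several tokenizers.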