TRAP: Benchmark for Task-completion and Resistance to Active Privacy-extraction
TRAP evaluates how well models complete tasks that require private data without leaking that data. All 22 models evaluated show non-trivial privacy leakage, and stronger instruction-following ability is associated with higher leakage. Structural private field isolation, which replaces private fields with hash keys, prevents leakage while maintaining task accuracy.
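
To make the idea of replacing private fields with hash keys concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the benchmark's actual implementation: the function names `isolate_private_fields` and `resolve_keys`, the `<PRIV_...>` key format, the salted SHA-256 derivation, and the sample record are all hypothetical. The model only ever sees the opaque keys; the real values are substituted back by trusted code after the model responds.

```python
import hashlib
import secrets


def isolate_private_fields(record, private_fields, salt=None):
    """Return (sanitized_record, vault).

    Private values are replaced by opaque keys; the vault maps each key
    back to its real value and never leaves trusted code.
    Hypothetical helper, not taken from the TRAP paper.
    """
    # A random salt keeps keys from being guessed by hashing candidate values.
    salt = salt if salt is not None else secrets.token_hex(16)
    sanitized, vault = {}, {}
    for name, value in record.items():
        if name in private_fields:
            digest = hashlib.sha256(f"{salt}:{name}:{value}".encode("utf-8")).hexdigest()
            key = f"<PRIV_{digest[:12]}>"  # assumed key format
            vault[key] = str(value)
            sanitized[name] = key
        else:
            sanitized[name] = value
    return sanitized, vault


def resolve_keys(text, vault):
    """Substitute real values back into the model's output, outside the model."""
    for key, value in vault.items():
        text = text.replace(key, value)
    return text


# Made-up example record; "ssn" is the only field treated as private.
record = {"name": "Alice Example", "ssn": "123-45-6789", "city": "Lyon"}
sanitized, vault = isolate_private_fields(record, {"ssn"})

# The prompt is built from the sanitized record only.
model_prompt = f"Write a shipping label for {sanitized}"

# Stand-in for a model call: the model can copy the key but cannot reveal the value.
model_output = f"Label for {sanitized['name']}, ID {sanitized['ssn']}"

final = resolve_keys(model_output, vault)
```

Because the private value never appears in the prompt, an extraction attempt against the model can at most recover the key. The task output still ends up correct once `resolve_keys` runs in trusted code.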