Deterministic Decisions for High-Stakes AI
The article identifies "intervention bias" as a critical failure mode of zero-shot large language model (LLM) educational advisory agents: the agent recommends action even though the oracle policy mandates inaction. Using the Open University Learning Analytics Dataset, the study shows that zero-shot GPT-4o has a false-positive rate of 43% at day 56, which translates to approximately 4,300 unnecessary advisor contacts per cycle in a cohort of 10,000 students.
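The 4,300 figure can be checked with a short calculation. The sketch below is illustrative only: the function name `unnecessary_contacts` is hypothetical, and it assumes the reported false-positive rate is applied across the full 10,000-student cohort, which is what the stated figure appears to imply.

```python
def unnecessary_contacts(false_positive_rate: float, cohort_size: int) -> int:
    """Estimate advisor contacts triggered when the oracle policy says 'do nothing'.

    Assumption (not stated in the source): the rate is applied to the whole
    cohort, i.e. every student is treated as a case where inaction is correct.
    """
    if not 0.0 <= false_positive_rate <= 1.0:
        raise ValueError("false_positive_rate must be between 0 and 1")
    if cohort_size < 0:
        raise ValueError("cohort_size must be non-negative")
    # Round to the nearest whole contact, since contacts are counted per student.
    return round(false_positive_rate * cohort_size)


# Figures quoted in the text: 43% false-positive rate, 10,000 students per cycle.
contacts = unnecessary_contacts(0.43, 10_000)
print(contacts)
```

If only a subset of the cohort falls under the oracle's "no action" policy, the count would scale with the size of that subset instead of with the full cohort.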