Can Reasoning Models Detect Changes to Their Chains of Thought?
Recent reasoning models show only modest ability to detect changes to their chains of thought (CoTs). They struggle to identify how a CoT was modified, and they perform about as well on changes to their own CoTs as on changes to other models' CoTs.