This study reproduces the results of AlphaEdit, a null-space constrained projection method for knowledge editing in language models, and extends the evaluation to newer architectures and longer sequential editing horizons. The authors confirm that AlphaEdit performs as reported within its original scope but identify significant limitations regarding generalization and scalability.
- The study successfully reproduces AlphaEdit's metrics on LLaMA3, GPT2-XL, and GPT-J, though it identifies a discrepancy in the reported fluency and consistency metrics.
- Extending AlphaEdit to newer model families reveals that its advantage does not generalize uniformly, because these architectures violate assumptions built into the locate-then-edit paradigm.
- Performance degrades as the number of sequential edits increases well beyond the original scale, indicating that the null-space projection's protection against catastrophic forgetting is bounded.
- Evaluation on additional benchmarks (BoolQ, HellaSwag, and XSTest) shows that large-scale sequential editing degrades both general downstream task competence and safety-relevant refusal behavior.
The results demonstrate that while AlphaEdit works as intended in its original context, its core theoretical guarantees are sensitive to model architecture and editing scale, which has practical implications for deployment.
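
For readers unfamiliar with the mechanism, the null-space constraint mentioned above can be illustrated with a small NumPy sketch. This is a toy example under stated assumptions, not the authors' or AlphaEdit's implementation: the function name `null_space_projector`, the matrix sizes, the random data, and the relative tolerance are all hypothetical choices made for illustration. The idea it shows is that an update `delta` multiplied by a projector `P` onto the null space of the preserved keys `K0` leaves the layer's outputs on those keys unchanged.

```python
import numpy as np

def null_space_projector(K0: np.ndarray, rel_tol: float = 1e-8) -> np.ndarray:
    """Return a symmetric projector P with P @ K0 ~= 0 (hypothetical helper).

    K0 has shape (d_in, n): each column is a key vector whose output
    should be preserved by later edits.
    """
    cov = K0 @ K0.T                          # (d_in, d_in), symmetric PSD
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
    # Assumed cutoff: directions with (near-)zero eigenvalue span the null space.
    threshold = rel_tol * max(float(eigvals.max()), 1.0)
    null_basis = eigvecs[:, eigvals < threshold]
    return null_basis @ null_basis.T

# Toy setup with made-up dimensions and random data.
rng = np.random.default_rng(0)
d_out, d_in, n_preserved = 8, 16, 5
W = rng.normal(size=(d_out, d_in))         # stand-in for an edited weight matrix
K0 = rng.normal(size=(d_in, n_preserved))  # stand-in for preserved-knowledge keys
delta = rng.normal(size=(d_out, d_in))     # stand-in for an unconstrained update

P = null_space_projector(K0)
W_edited = W + delta @ P                   # projected update

# Largest change in outputs on the preserved keys, with and without projection.
preserved_drift = np.abs(W_edited @ K0 - W @ K0).max()
unprojected_drift = np.abs((W + delta) @ K0 - W @ K0).max()
print(preserved_drift, unprojected_drift)
```

In this toy setting the projected update leaves the outputs on `K0` unchanged up to floating-point error, while the unprojected update does not. The sketch covers a single edit on synthetic data only, so it says nothing about the sequential-editing degradation reported above.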