Output-Space Allocation Costs for Calibration-Guided LLM Compression: An Empirical Study
This study investigates whether aligning allocation costs with an output-space objective improves the fidelity of compressed large language models, testing this as a modification to the ROCKET compression method. The authors compare two choices of cost for the multiple-choice knapsack problem (MCKP) that allocates the compression budget: weight-space Frobenius error and an output reconstruction error.
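To make the comparison above concrete, the following is a minimal sketch of the two candidate costs and of a small knapsack allocator that consumes them. It is an illustration under stated assumptions, not the ROCKET implementation: the function names, the toy matrices, the integer size units, and the dynamic-programming solver are all hypothetical, and the output-space cost is assumed here to be the squared Frobenius error of the layer output on calibration inputs X.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product of a (n x k) and b (k x m)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def subtract(a: Matrix, b: Matrix) -> Matrix:
    """Elementwise difference a - b for equally shaped matrices."""
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def sq_frobenius(a: Matrix) -> float:
    """Squared Frobenius norm: sum of squared entries."""
    return sum(v * v for row in a for v in row)


def weight_space_cost(w: Matrix, w_hat: Matrix) -> float:
    """Assumed weight-space cost: ||W - W_hat||_F^2."""
    return sq_frobenius(subtract(w, w_hat))


def output_space_cost(x: Matrix, w: Matrix, w_hat: Matrix) -> float:
    """Assumed output-space cost: ||X W - X W_hat||_F^2 on calibration inputs X."""
    return sq_frobenius(matmul(x, subtract(w, w_hat)))


def solve_mckp(options: List[List[Tuple[int, float]]], budget: int) -> Tuple[float, List[int]]:
    """Pick exactly one (size, cost) option per layer, minimizing total cost
    subject to total size <= budget. Exact DP over integer sizes."""
    inf = float("inf")
    # best[b] = (lowest cost, chosen option indices) using exactly b size units
    best: List[Tuple[float, List[int]]] = [(inf, [])] * (budget + 1)
    best[0] = (0.0, [])
    for layer_options in options:
        nxt: List[Tuple[float, List[int]]] = [(inf, [])] * (budget + 1)
        for used, (cost, picks) in enumerate(best):
            if cost == inf:
                continue
            for idx, (size, opt_cost) in enumerate(layer_options):
                total = used + size
                if total <= budget and cost + opt_cost < nxt[total][0]:
                    nxt[total] = (cost + opt_cost, picks + [idx])
        best = nxt
    result = min(best, key=lambda entry: entry[0])
    if result[0] == inf:
        raise ValueError("no feasible allocation within the budget")
    return result


# Toy layer: calibration activations only excite the first input dimension.
X = [[3.0, 0.0], [4.0, 0.0]]
W = [[1.0, 0.0], [0.0, 1.0]]
W_HAT_A = [[1.0, 0.0], [0.0, 0.0]]  # candidate A: drops the unused second row
W_HAT_B = [[0.5, 0.0], [0.0, 1.0]]  # candidate B: perturbs the heavily used first row

# Layer 0 offers A and B at the same size; layer 1 uses made-up (size, cost) options.
LAYER1 = [(1, 5.0), (2, 1.0)]
weight_options = [[(1, weight_space_cost(W, W_HAT_A)), (1, weight_space_cost(W, W_HAT_B))], LAYER1]
output_options = [[(1, output_space_cost(X, W, W_HAT_A)), (1, output_space_cost(X, W, W_HAT_B))], LAYER1]

print("weight-space allocation:", solve_mckp(weight_options, budget=3))
print("output-space allocation:", solve_mckp(output_options, budget=3))
```

In this toy setup the two costs rank the layer-0 candidates in opposite order: the weight-space cost prefers candidate B because its entries are closer to W, while the output-space cost prefers candidate A because the calibration inputs never exercise the row that A removes. The allocator therefore selects different options depending on which cost it is given, which is the kind of disagreement the study sets out to measure.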