KLD is flawed in abliteration
A Reddit user argues that Kullback-Leibler divergence (KLD) is a flawed metric for measuring how far an abliterated model has drifted from its base version. The author notes that KLD can be computed in many ways, that the result depends entirely on the evaluation prompts, and that it is often manipulated by using first-token KLD to make models appear superior.
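
To make the first-token point concrete, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration: the logits are made-up toy numbers over a 4-token vocabulary, the helper names (`softmax`, `kl_divergence`, `per_token_kl`) are hypothetical, and the direction KL(base || modified) is just one of several possible conventions. It is not the Reddit author's method or any tool's actual evaluation code; it only shows how a first-token figure and a whole-response average can disagree for the same pair of models.

```python
import math

def softmax(logits):
    """Convert a list of logits into a probability distribution."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q):
    """KL(p || q) in nats for two distributions over the same vocabulary."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def per_token_kl(base_logits, modified_logits):
    """KL(base || modified) at each token position of one response."""
    return [
        kl_divergence(softmax(b), softmax(m))
        for b, m in zip(base_logits, modified_logits)
    ]

# Toy logits (assumed, not measured): 3 token positions, vocabulary of 4.
base_logits = [
    [2.0, 1.0, 0.5, 0.1],
    [0.3, 2.5, 0.2, 0.1],
    [1.0, 1.0, 3.0, 0.5],
]
modified_logits = [
    [2.0, 1.0, 0.5, 0.1],  # identical to the base model at the first token
    [2.5, 0.3, 0.2, 0.1],  # diverges from the second token onward
    [3.0, 1.0, 1.0, 0.5],
]

kls = per_token_kl(base_logits, modified_logits)
first_token_kl = kls[0]           # the "first-token" figure
mean_kl = sum(kls) / len(kls)     # average over every position

print(f"first-token KL: {first_token_kl:.4f}")
print(f"mean KL over all positions: {mean_kl:.4f}")
```

In this toy case the first-token value is zero while the average over all positions is clearly positive, so the two numbers tell different stories about the same model pair. Swapping in a different set of prompts would change both figures again, which is the prompt-dependence the post describes.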