We propose Uncertainty-Based Decontamination (UBD), a method that uses deep ensembles to estimate per-sample memorization in contaminated models without needing an uncontaminated model. UBD constructs a debiased target distribution from ensemble uncertainty to correct output distributions, achieving significantly better alignment with uncontaminated models compared to baselines, while maintaining performance on clean data.
Uncertainty-Based Decontamination for LLM Decontamination
from English