UNIEGO introduces a hierarchical multi-teacher distillation framework that uses proxy models to mediate knowledge transfer from nine diverse teachers across viewpoints and modalities. The Selective Proxy Distillation (SPD) stage adaptively selects reliable proxies during training, improving representation quality and stability. UNIEGO achieves state-of-the-art results in action recognition, video retrieval, and action segmentation on ego-exo benchmarks.
UNIEGO: Proxy-Mediated Unified Egocentric Video Representation