This paper proposes an active, continual learning paradigm for Vision-Language-Action (VLA) models to address the inefficiencies of passive imitation learning. The authors demonstrate that uncertainty-guided data collection improves fine-tuning efficiency but causes catastrophic forgetting when recovery data is used exclusively.
- Active, uncertainty-guided data collection leads to more efficient fine-tuning than passively collected demonstrations.
- Fine-tuning solely on actively collected recovery data results in catastrophic forgetting of previously learned behaviors.
- The study evaluates replay-based data mixing and elastic weight consolidation as techniques for continual learning.
- The work establishes tradeoffs between plasticity (adapting to new recovery data) and retention of existing policy behaviors.
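
The two continual-learning techniques named above can be illustrated with a minimal sketch. It is not the authors' implementation: the summary gives no mixing ratio, regularization strength, or model details, so `replay_fraction`, `lam`, the helper names, and the toy data are all hypothetical placeholders. Replay mixing is shown as drawing a fixed share of each batch from prior data, and elastic weight consolidation as the standard quadratic penalty weighted by a diagonal Fisher estimate.

```python
import random


def mix_replay(recovery, prior, replay_fraction, batch_size, rng):
    """Build one batch with a fixed share of prior (replay) examples.

    replay_fraction is a hypothetical knob: 0.0 means recovery data only
    (the setting the summary associates with forgetting), 1.0 means
    prior data only (no adaptation).
    """
    n_replay = round(batch_size * replay_fraction)
    n_new = batch_size - n_replay
    # Sample with replacement so small recovery sets can still fill a batch.
    batch = rng.choices(prior, k=n_replay) + rng.choices(recovery, k=n_new)
    rng.shuffle(batch)
    return batch


def ewc_penalty(params, anchor_params, fisher_diag, lam):
    """Standard EWC penalty: (lam / 2) * sum_i F_i * (theta_i - theta*_i)^2.

    anchor_params are the weights before fine-tuning; fisher_diag is a
    diagonal Fisher information estimate computed on the prior data.
    """
    return 0.5 * lam * sum(
        f * (p - a) ** 2
        for p, a, f in zip(params, anchor_params, fisher_diag)
    )


# Toy usage with placeholder data (assumed, not from the paper).
rng = random.Random(0)
prior = [("prior", i) for i in range(100)]
recovery = [("recovery", i) for i in range(20)]

batch = mix_replay(recovery, prior, replay_fraction=0.25, batch_size=8, rng=rng)
n_prior = sum(1 for tag, _ in batch if tag == "prior")

# Moving the first weight away from its anchor is penalized in proportion
# to its Fisher value; the second weight has not moved and adds nothing.
penalty = ewc_penalty([1.0, 2.0], [0.0, 2.0], [2.0, 5.0], lam=0.5)
```

In a training loop, the penalty would be added to the fine-tuning loss on each batch. Raising `replay_fraction` or `lam` favors retention, while lowering them favors plasticity, which is the tradeoff the last bullet refers to.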
This research highlights the potential of active learning for adaptation efficiency while revealing open challenges in incorporating targeted new data into large robot policies.