w$^{2}$VLA introduces a modular approach that decouples declarative and procedural knowledge in Vision-Language-Action models. By restructuring information flow, it enables robust behavior cloning and unprecedented zero-shot skill transfer across unseen, dissimilar objects.
Decoupling Declarative and Procedural Knowledge in Vision-Language-Action Models
from English