Researchers propose PhysMani, a framework that couples a physics-principled 3D Gaussian world model with a future-aware action policy model to address challenges in manipulating fast-moving targets in unstructured 3D environments.
- The world model learns a divergence-free Gaussian velocity field through online optimization, which enables physically grounded prediction of future dynamics.
- The policy model integrates predicted 3D scene future dynamics through a learnable token-based cross-attention module.
- The authors introduce PhysMani-Bench, a dynamic manipulation benchmark consisting of 16 tasks.
- PhysMani achieves higher success rates than strong baselines in both simulation and real-world robot experiments.
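
To make the divergence-free constraint in the world-model bullet concrete, here is a minimal sketch of one standard way to obtain such a field: take the velocity as the curl of a vector potential built from Gaussian bumps, so that the divergence vanishes by the identity div(curl A) = 0. The summary does not say how PhysMani enforces the constraint, so this construction, the function names `curl_velocity` and `divergence`, and all parameters are illustrative assumptions, not the authors' method.

```python
import numpy as np

def curl_velocity(x, centers, weights, sigma=1.0):
    """Velocity v(x) = curl(A)(x), with A(x) = sum_k g_k(x) * weights[k].

    g_k is an isotropic Gaussian centered at centers[k]. Because each
    weights[k] is a constant vector, curl(g_k * w_k) = grad(g_k) x w_k.
    (Hypothetical parameterization, chosen only for illustration.)
    """
    diff = x[None, :] - centers                              # (K, 3)
    g = np.exp(-np.sum(diff**2, axis=1) / (2 * sigma**2))    # (K,)
    grad_g = -(diff / sigma**2) * g[:, None]                 # (K, 3)
    return np.cross(grad_g, weights).sum(axis=0)             # (3,)

def divergence(f, x, h=1e-4):
    """Central finite-difference estimate of div f at point x."""
    div = 0.0
    for i in range(3):
        e = np.zeros(3)
        e[i] = h
        div += (f(x + e)[i] - f(x - e)[i]) / (2 * h)
    return div

rng = np.random.default_rng(0)
centers = rng.normal(size=(8, 3))   # assumed Gaussian centers
weights = rng.normal(size=(8, 3))   # assumed potential coefficients
x = np.array([0.2, -0.1, 0.3])      # query point

v = curl_velocity(x, centers, weights)
div = divergence(lambda p: curl_velocity(p, centers, weights), x)
```

In this sketch `div` is zero up to finite-difference error for any choice of `centers` and `weights`, so an optimizer could update those parameters freely without ever leaving the space of divergence-free fields.
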
Overall, PhysMani provides accurate 3D geometry and physically meaningful forecasting for embodied AI systems.
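
The token-based cross-attention mentioned for the policy model can be illustrated with a small single-head sketch. It assumes that a fixed set of learnable tokens acts as queries over features of the predicted future scene; the summary does not specify which side the tokens sit on, nor any dimensions, so `cross_attend`, the projection matrices, and the sizes below are hypothetical.

```python
import numpy as np

def softmax(z, axis=-1):
    """Numerically stable softmax along the given axis."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attend(tokens, scene_feats, Wq, Wk, Wv):
    """Single-head scaled dot-product cross-attention.

    tokens:      (M, d) learnable query tokens (assumed role)
    scene_feats: (N, d) features of the predicted future scene
    Returns the fused tokens (M, d) and the attention weights (M, N).
    """
    q = tokens @ Wq
    k = scene_feats @ Wk
    v = scene_feats @ Wv
    attn = softmax(q @ k.T / np.sqrt(q.shape[-1]), axis=-1)
    return attn @ v, attn

rng = np.random.default_rng(0)
d, M, N = 16, 4, 32                        # illustrative sizes only
tokens = rng.normal(size=(M, d))           # would be trained parameters
scene_feats = rng.normal(size=(N, d))      # stand-in for predicted dynamics
Wq, Wk, Wv = (rng.normal(size=(d, d)) / np.sqrt(d) for _ in range(3))

fused, attn = cross_attend(tokens, scene_feats, Wq, Wk, Wv)
```

Each row of `attn` sums to one, so every token ends up as a weighted average of projected scene features. Using a small, fixed number of tokens keeps the policy's conditioning input the same size regardless of how many scene features the world model produces.
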