FAST: A Framework for Aligned Sampling and Training in Parallel Reinforcement Learning
FAST addresses sampling inefficiency in reinforcement learning for autonomous driving by introducing Dynamic Parallel Sampling Alignment, which decouples the sampling loop from individual episode terminations. It achieves up to a 1.78× wall-clock speedup over single-clip baselines while remaining statistically unbiased through Scaled Mask-Padding Optimization.
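
The summary does not spell out how Scaled Mask-Padding Optimization works, so the following is only a minimal sketch of one common way to keep a loss unbiased when parallel episodes of different lengths are padded to a shared horizon. It assumes that padded steps are masked out and that the batch mean is rescaled by the ratio of total slots to valid slots. The function names `pad_and_mask` and `scaled_masked_mean` are hypothetical and are not taken from FAST.

```python
def pad_and_mask(step_losses, horizon):
    """Pad each episode's per-step losses to `horizon`; return (padded, mask).

    Assumption: padding value is 0.0 and mask is 1.0 on real steps, 0.0 on padding.
    """
    padded, mask = [], []
    for ep in step_losses:
        valid = list(ep[:horizon])          # truncate episodes longer than horizon
        pad = horizon - len(valid)
        padded.append(valid + [0.0] * pad)
        mask.append([1.0] * len(valid) + [0.0] * pad)
    return padded, mask


def scaled_masked_mean(padded, mask):
    """Mean over all slots, rescaled so it equals the mean over valid steps only."""
    total_slots = sum(len(row) for row in mask)
    valid_slots = sum(sum(row) for row in mask)
    masked_sum = sum(
        x * m
        for prow, mrow in zip(padded, mask)
        for x, m in zip(prow, mrow)
    )
    naive_mean = masked_sum / total_slots         # biased low: padding counts as data
    scale = total_slots / max(valid_slots, 1.0)   # hypothetical correction factor
    return naive_mean * scale


# Three parallel environments whose episodes ended after 3, 1, and 2 steps.
episodes = [[1.0, 2.0, 3.0], [4.0], [5.0, 6.0]]
padded, mask = pad_and_mask(episodes, horizon=4)
loss = scaled_masked_mean(padded, mask)

# Reference value: plain mean over the six real steps, with no padding involved.
flat = [x for ep in episodes for x in ep]
reference = sum(flat) / len(flat)
```

In this toy batch `loss` and `reference` agree, which is the property the rescaling is meant to preserve: the fixed-shape padded batch gives the same estimate as averaging over the real steps alone.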