AoiZora is a compiler-mediated topology planner for low-latency video diffusion inference on TPU sub-slices. By aligning logical sharding with physical placement through the compilation flow, it lowers one-step denoising latency by a factor of up to 1.42 on TPU v5e sub-slices compared to existing methods.
AoiZora: Topology-Aware Auto-Parallel Optimization for Video Diffusion Inference