The article introduces a robust Q-learning algorithm for discrete-time mean-field control problems with Wasserstein uncertainty in common noise. It combines quantization-and-projection with a Wasserstein dual reformulation and establishes convergence with finite-time bounds for both synchronous and asynchronous schemes. Numerical experiments on systemic risk and epidemic models demonstrate the robustness-performance tradeoff and convergence of the asynchronous implementation.
Robust Q-learning for mean-field control under Wasserstein uncertainty
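The Wasserstein dual reformulation named in the summary can be illustrated with a minimal sketch. It is not the article's implementation, and it rests on several assumptions not stated in the summary: the common noise takes finitely many scalar values, the ambiguity set is a Wasserstein-1 ball with absolute-difference cost whose members live on the same finite support, and rewards are maximized so the adversary minimizes the expected continuation value. Under those assumptions the worst-case expectation equals sup over λ ≥ 0 of −λε + E_P[min over z' of (f(z') + λ|z − z'|)]. The function name `worst_case_expectation` and the grid search over λ are hypothetical choices made for illustration.

```python
import numpy as np

def worst_case_expectation(f_vals, support, probs, eps, lambdas):
    """Dual estimate of inf_{Q : W1(Q, P) <= eps} E_Q[f] on a finite support.

    Assumptions (illustrative, not taken from the article): scalar noise,
    cost c(z, z') = |z - z'|, adversarial distributions on the same support.
    The grid over lambdas yields a lower bound on the dual supremum.
    """
    f_vals = np.asarray(f_vals, dtype=float)
    support = np.asarray(support, dtype=float)
    probs = np.asarray(probs, dtype=float)
    # Pairwise transport cost between noise values: cost[i, j] = |z_i - z_j|.
    cost = np.abs(support[:, None] - support[None, :])
    best = -np.inf
    for lam in lambdas:
        # Inner minimization over z' for every reference point z.
        inner = np.min(f_vals[None, :] + lam * cost, axis=1)
        best = max(best, -lam * eps + float(probs @ inner))
    return best

# Hypothetical continuation values f(z) for three noise outcomes.
support = [0.0, 1.0, 2.0]
f_vals = [1.0, 0.0, 2.0]
probs = [0.2, 0.3, 0.5]
lambdas = np.linspace(0.0, 10.0, 1001)

nominal = worst_case_expectation(f_vals, support, probs, 0.0, lambdas)
robust = worst_case_expectation(f_vals, support, probs, 0.5, lambdas)
print(nominal, robust)
```

In a robust Q-learning target, a value like `robust` would replace the nominal expectation of the next-step value, which is where the robustness-performance tradeoff enters: a larger radius `eps` lowers the target toward the minimum of `f_vals`.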