Steerable Model Merging (ST-Merge) introduces a gated cross-attention mechanism to adaptively weight source models during multilingual reasoning. It outperforms existing baselines on four multilingual reasoning benchmarks across 21 languages by dynamically prioritizing models based on input characteristics.
Steerable Model Merging for Multilingual Reasoning
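To make the idea of input-dependent weighting concrete, here is a minimal sketch of how a gated attention score over source models could be turned into merge weights. It is not the ST-Merge implementation: the summary gives no architectural details, so the function names (`merge_weights`, `merge_parameters`), the per-model key vectors, the sigmoid gate that blends attention with uniform averaging, and the task-vector style merge are all assumptions chosen for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def merge_weights(query, model_keys, gate_vector, gate_bias=0.0):
    """Return one weight per source model for a given input representation.

    query       : vector summarizing the input (hypothetical encoder output)
    model_keys  : one learned key vector per source model (assumption)
    gate_vector : parameters of a scalar sigmoid gate (assumption)
    """
    scale = math.sqrt(len(query))
    # Scaled dot-product attention of the input over the source models.
    attn = softmax([dot(query, key) / scale for key in model_keys])
    # Scalar gate in (0, 1): how far to move away from uniform averaging.
    gate = 1.0 / (1.0 + math.exp(-(dot(query, gate_vector) + gate_bias)))
    uniform = 1.0 / len(model_keys)
    # Convex blend of attention weights and uniform weights; sums to 1.
    return [gate * a + (1.0 - gate) * uniform for a in attn]


def merge_parameters(base, task_vectors, weights):
    """Add the weighted sum of per-model parameter deltas to the base."""
    return [
        b + sum(w * tv[i] for w, tv in zip(weights, task_vectors))
        for i, b in enumerate(base)
    ]


if __name__ == "__main__":
    query = [1.0, 0.0]
    keys = [[2.0, 0.0], [0.0, 2.0], [-2.0, 0.0]]
    weights = merge_weights(query, keys, gate_vector=[1.0, 0.0])
    print(weights)
    print(merge_parameters([0.0, 0.0], [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], weights))
```

In this sketch the gate controls how strongly the input steers the merge: a gate near 0 falls back to plain averaging of the source models, while a gate near 1 follows the attention scores, so the model whose key best matches the input dominates.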