Researchers propose a Judge-Aware Gated Multi-Task Learning architecture that disentangles objective case facts from adjudicative context to improve legal outcome prediction. The model combines a fine-grained outcome taxonomy with a gated fusion mechanism that dynamically modulates reliance on judge identity, and it is evaluated on 13,937 UK Employment Tribunal decisions.

  • Benchmarked against supervised fine-tuning of a Gemma-4 26B-A4B backbone where judge identity is injected as prompt tokens or output targets.
  • Achieves state-of-the-art results with an order of magnitude fewer trainable parameters than the generative SFT baselines, demonstrating superior parameter efficiency.
  • Gains are concentrated on the most ambiguous and rarest outcome classes.
  • The architecture provides interpretability: its learned judge embeddings help localize the cases where adjudicative context drives the prediction.

The study concludes that for identity-conditioned classification of legal outcomes, differentiable structured composition yields more accurate and efficient models than prompt-based composition over larger backbones.
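
The gated fusion idea summarized above can be illustrated with a minimal sketch. It is not the authors' implementation: the additive form `fused = case + gate * judge`, the sigmoid gate computed from the concatenated vectors, and every name here (`gated_fusion`, `make_params`, `W_gate`, `b_gate`) are assumptions chosen for illustration, written in dependency-free Python.

```python
import math
import random


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def gated_fusion(case_vec, judge_vec, W_gate, b_gate):
    """Fuse a case representation with a judge embedding (hypothetical form).

    gate  = sigmoid(W_gate @ [case; judge] + b_gate)   (element-wise, in (0, 1))
    fused = case + gate * judge
    Returns (fused, gate); the gate can be inspected per case.
    """
    concat = list(case_vec) + list(judge_vec)
    pre = matvec(W_gate, concat)
    gate = [sigmoid(p + b) for p, b in zip(pre, b_gate)]
    fused = [c + g * j for c, g, j in zip(case_vec, gate, judge_vec)]
    return fused, gate


def make_params(dim, seed=0):
    """Random gate weights of shape (dim, 2 * dim) and a zero bias (illustrative only)."""
    rng = random.Random(seed)
    W = [[rng.uniform(-0.5, 0.5) for _ in range(2 * dim)] for _ in range(dim)]
    b = [0.0] * dim
    return W, b


if __name__ == "__main__":
    W, b = make_params(3)
    case = [0.2, -0.1, 0.4]    # stand-in for encoded case facts
    judge = [1.0, 0.5, -0.3]   # stand-in for a learned judge embedding
    fused, gate = gated_fusion(case, judge, W, b)
    print("gate:", gate)
    print("fused:", fused)
```

Under these assumptions, a gate close to zero leaves the case representation almost unchanged, while a gate close to one lets the judge embedding contribute fully. Inspecting the gate per case is one plausible way to flag decisions in which adjudicative context carries weight, and because every step is differentiable, the gate could be trained jointly with the classifier rather than encoded in a prompt.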