The article demonstrates that field order strongly affects retrieval quality in structured metadata systems, because standard fine-tuning causes encoders to rely on absolute position rather than field labels. To address this, the authors propose Permutation-Invariant Fine-Tuning (PI-FT), which serializes each training record under a freshly sampled field order and applies random field dropout, so that the encoder binds meaning to field labels rather than to positions.
- Standard fine-tuning loses 7.4 nDCG@10 points when index field order changes, whereas PI-FT reduces this penalty to 0.2 points.
- The approach uses a data loader modification that samples fresh field orders and applies random field dropout during training.
- A fine-tuned 118M-parameter CPU encoder achieves 0.707 nDCG@10 on the new DevDataBench, outperforming zero-shot baselines such as text-embedding-3-large (0.556).
- The benchmark covers grounded queries across 15 languages for nearly 10,000 development statistics indicators.
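The data loader modification described above (a fresh field order plus random field dropout for each training example) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `serialize_record`, the `label: value` format, the ` | ` separator, the `drop_prob` default, and the example field names are all assumptions chosen for illustration.

```python
import random


def serialize_record(record, rng, drop_prob=0.1, training=True):
    """Serialize a metadata record as 'label: value' segments.

    Hypothetical sketch: in training mode, each field is independently
    dropped with probability drop_prob and the surviving fields are
    shuffled, so no field is tied to a fixed position. At least one
    field is always kept. In inference mode the record's own order is
    preserved and nothing is dropped.
    """
    # Skip empty values so they do not produce dangling labels.
    items = [(label, value) for label, value in record.items() if value]
    if not items:
        return ""
    if training:
        kept = [item for item in items if rng.random() >= drop_prob]
        if not kept:
            # Never emit an empty training example.
            kept = [rng.choice(items)]
        rng.shuffle(kept)
        items = kept
    return " | ".join(f"{label}: {value}" for label, value in items)


if __name__ == "__main__":
    # Example record with made-up field names and values.
    record = {
        "name": "GDP per capita",
        "unit": "current USD",
        "source": "national accounts",
    }
    rng = random.Random(0)
    for _ in range(3):
        # Each call samples a new order and a new dropout mask.
        print(serialize_record(record, rng))
```

Calling the function once per example inside the dataset's item-fetching step means the same record is seen under a different order in each epoch, which is the property the summary attributes to PI-FT.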
Because the encoder no longer depends on field order, indexed records remain discoverable across schema variations. This is critical for AI agents that mediate access to public statistics, where usage logs cannot provide training signals for indicators that have never been searched.