A cheap trick for reliable structured output: feed the validation error back into the retry

Structured output from large language models becomes more reliable when each retry includes the validation error and the model's previous output in the prompt. The retry then stops being a re-roll of a random response and becomes a targeted correction: the model edits its prior attempt to fix the specific error.

The technique involves catching validation errors and appending a message containing the formatted error and the serialized previous response to the next prompt.
Errors must be described in terms understandable to the model, such as specifying that a field requires an integer but received a string.
The model's own prior output is included so it can edit the specific incorrect parts rather than regenerating the entire response.
The tradeoffs are added latency from the extra calls and longer prompts on each failed attempt, so the number of attempts should be capped.
This method only works when the invalid output is parseable enough to be fed back into the model.
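
The points above can be sketched as a small retry loop. This is a minimal illustration, not a definitive implementation: `call_model` stands in for whatever client sends a prompt and returns text, and `SCHEMA`, `validate`, and `generate_structured` are hypothetical names introduced only for this example. It assumes JSON output and a flat schema of field types.

```python
import json
from typing import Callable

# Hypothetical schema for illustration: field name -> required Python type.
SCHEMA = {"name": str, "age": int}


def validate(data: object) -> list[str]:
    """Return errors in plain language; an empty list means the data is valid."""
    if not isinstance(data, dict):
        return ["top-level value must be a JSON object"]
    errors = []
    for field, expected in SCHEMA.items():
        if field not in data:
            errors.append(f"field '{field}' is missing")
        elif type(data[field]) is not expected:
            got = type(data[field]).__name__
            errors.append(
                f"field '{field}' requires {expected.__name__} "
                f"but received {got}: {data[field]!r}"
            )
    return errors


def generate_structured(
    call_model: Callable[[str], str], task: str, max_attempts: int = 3
) -> dict:
    """Call the model, feeding validation errors back in until the output is valid."""
    prompt = task
    for _ in range(max_attempts):  # cap attempts to bound latency and cost
        raw = call_model(prompt)
        try:
            data = json.loads(raw)
        except json.JSONDecodeError:
            # Output is not parseable, so there is nothing specific to correct:
            # fall back to a plain retry with the original task.
            prompt = task
            continue
        errors = validate(data)
        if not errors:
            return data
        # Append the serialized previous response and the formatted errors
        # so the model can edit its own attempt instead of starting over.
        prompt = (
            f"{task}\n\n"
            f"Your previous response was:\n{json.dumps(data)}\n\n"
            "It failed validation with these errors:\n"
            + "\n".join(f"- {e}" for e in errors)
            + "\n\nReturn a corrected version of that response, "
            "changing only what is needed."
        )
    raise ValueError(f"no valid output after {max_attempts} attempts")
```

A call might look like `generate_structured(my_client, "Extract the person as JSON.")`, where `my_client` is any function that takes a prompt string and returns the model's text. In a real project the hand-written `validate` would likely be replaced by a schema library, with its error messages reformatted into the same kind of plain-language lines.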

The result is more reliable structured output, because the model corrects its own mistakes instead of relying on random retries.