Researchers introduce PaperPilot, a multi-turn literature search agent that frames scientific search as workflow induction to address underspecified and evolving user intents. Given an anchor paper and a query, the system constructs an executable directed acyclic graph (DAG) of search operators that the user can refine through feedback.
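
To make the idea of an executable, editable search DAG concrete, here is a minimal sketch. It is not PaperPilot's implementation: the toy corpus, the operator names (`cited_by`, `keyword_filter`), the workflow dictionary layout, and the `refine` helper are all assumptions made up for illustration.

```python
from graphlib import TopologicalSorter

# Hypothetical toy corpus: paper id -> (title, ids of the papers it cites).
CORPUS = {
    "p1": ("Dense retrieval for science", []),
    "p2": ("Graph-based citation search", ["p1"]),
    "p3": ("Dense retrieval benchmarks", ["p1"]),
    "p4": ("Protein folding survey", ["p2"]),
}


def cited_by(inputs, anchor):
    """Hypothetical operator: papers in the corpus that cite the anchor paper."""
    return [pid for pid, (_, refs) in CORPUS.items() if anchor in refs]


def keyword_filter(inputs, keyword):
    """Hypothetical operator: keep upstream papers whose title contains the keyword."""
    (papers,) = inputs  # expects exactly one upstream node
    return [pid for pid in papers if keyword.lower() in CORPUS[pid][0].lower()]


def run_workflow(workflow):
    """Execute every node after its dependencies and return all node outputs."""
    graph = {name: spec["deps"] for name, spec in workflow.items()}
    results = {}
    # static_order() yields dependencies first and raises CycleError on a cycle.
    for name in TopologicalSorter(graph).static_order():
        spec = workflow[name]
        inputs = [results[dep] for dep in spec["deps"]]
        results[name] = spec["op"](inputs, **spec["args"])
    return results


def refine(workflow, node, **new_args):
    """Return a copy of the workflow with one node's arguments edited."""
    updated = {name: {**spec, "args": dict(spec["args"])} for name, spec in workflow.items()}
    updated[node]["args"].update(new_args)
    return updated


# Anchor paper "p1" plus a query keyword, expressed as a two-node DAG.
WORKFLOW = {
    "expand": {"op": cited_by, "deps": [], "args": {"anchor": "p1"}},
    "filter": {"op": keyword_filter, "deps": ["expand"], "args": {"keyword": "dense"}},
}
```

In this sketch, a feedback turn such as "I meant graph-based methods" becomes a single edit, `refine(WORKFLOW, "filter", keyword="graph")`, after which the whole workflow is re-run. The point of the design is that each change is a visible edit to a named node rather than a full re-prompt.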

  • PaperPilot-9B improves over the base Qwen3.5-9B toolset agent under multi-turn interaction.
  • Hit@5 increases from 58.0 to 77.0, MRR from 47.5 to 59.4, and nDCG@10 from 26.8 to 32.5.
  • The workflow execution error rate falls from 9.5% to 0%.
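
For readers unfamiliar with the ranking metrics above, the sketch below computes Hit@k, reciprocal rank (averaged over queries to give MRR), and nDCG@k for a single ranked list. It assumes the common binary-relevance definitions; the summary does not state which variants the authors used, and the reported figures appear to be on a 0 to 100 scale, which is also an assumption here. The function names are illustrative.

```python
import math


def hit_at_k(ranked, relevant, k):
    """1.0 if any relevant item appears in the top k results, else 0.0."""
    return 1.0 if any(doc in relevant for doc in ranked[:k]) else 0.0


def reciprocal_rank(ranked, relevant):
    """1 / rank of the first relevant item, or 0.0 if none is retrieved."""
    for rank, doc in enumerate(ranked, start=1):
        if doc in relevant:
            return 1.0 / rank
    return 0.0


def ndcg_at_k(ranked, relevant, k):
    """Binary-relevance nDCG: DCG of the top k divided by the ideal DCG."""
    dcg = sum(
        1.0 / math.log2(rank + 1)
        for rank, doc in enumerate(ranked[:k], start=1)
        if doc in relevant
    )
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))
    return dcg / idcg if idcg > 0 else 0.0


# One query: relevant papers "b" and "d" are ranked second and fourth.
ranked = ["a", "b", "c", "d"]
relevant = {"b", "d"}
scores = {
    "hit@5": hit_at_k(ranked, relevant, 5),
    "rr": reciprocal_rank(ranked, relevant),
    "ndcg@10": ndcg_at_k(ranked, relevant, 10),
}
```

Averaging each score over all evaluation queries and multiplying by 100 gives numbers comparable in form to those listed above.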

The results demonstrate that explicit, editable search workflows provide an effective and controllable interface for aligning literature search agents with complex scientific intent.