PhysAssistBench is a benchmark for interactive doctor-patient-EHR assistance built from real MIMIC-IV cases. It comprises 1,296 manually reviewed, physician-validated turns and shows that current LLMs struggle to coordinate clinical knowledge, communication, and EHR system interaction.
PhysAssistBench Evaluates LLMs in Doctor-Patient-EHR Interaction