Fine-tuning LLMs for Passive Depression Severity Estimation
A model fine-tuned from Qwen3.5-27B predicts PHQ-9 scores from AI dialogue transcripts, achieving a mean absolute error (MAE) of 2.6 points on the 0-27 PHQ-9 scale and an AUC of 0.91 for detecting PHQ-9 >= 10. AUC stays above 0.87 across all PHQ-9 severity levels, indicating that depression severity can be estimated accurately from real-world conversations without requiring self-report.
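To make the two reported metrics concrete, the following minimal sketch shows how MAE and a threshold-based AUC could be computed from paired true and predicted PHQ-9 scores. It is an illustration under stated assumptions, not the evaluation code behind the figures above: the function names and the six toy scores are hypothetical, and the AUC is computed with the standard pairwise (Mann-Whitney) formulation, using the predicted score itself as the ranking signal.

```python
from typing import Sequence


def mean_absolute_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average absolute gap between true and predicted PHQ-9 scores."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def auc_at_threshold(
    y_true: Sequence[float], y_pred: Sequence[float], threshold: float = 10
) -> float:
    """AUC for separating true score >= threshold from true score < threshold.

    Pairwise (Mann-Whitney) form: the fraction of positive/negative pairs in
    which the positive case has the higher predicted score; ties count 0.5.
    """
    pos = [p for t, p in zip(y_true, y_pred) if t >= threshold]
    neg = [p for t, p in zip(y_true, y_pred) if t < threshold]
    if not pos or not neg:
        raise ValueError("both classes must be present at this threshold")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data for illustration only (not results from the model).
true_scores = [2, 5, 9, 11, 14, 20]
pred_scores = [3.0, 4.0, 11.0, 10.0, 15.0, 17.0]

mae = mean_absolute_error(true_scores, pred_scores)
auc_10 = auc_at_threshold(true_scores, pred_scores, threshold=10)
print(f"MAE = {mae:.2f}, AUC at PHQ-9 >= 10 = {auc_10:.3f}")
```

Calling `auc_at_threshold` with other cutoffs (for example the conventional PHQ-9 severity boundaries of 5, 15, and 20) would give a per-level AUC; whether those are the exact levels evaluated here is an assumption.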