RefRad2D is a large-scale bilingual dataset of 1.2M CT and MR image-text pairs from clinical practice. Trained on this data, RadGrounder achieves competitive VQA results and performs spatial grounding without degrading language quality, enabling verifiable outputs in radiology.
RefRad2D Dataset Enables Scalable Spatial Grounding in Radiology