This study investigates whether large language models can recover the statistical characteristics of a broader population using only a small pilot sample of human responses. The authors decompose this recovery into three axes: structural fidelity, marginal fidelity, and individual fidelity.
- The research benchmarks prompting, rectification, and fine-tuning approaches using a COVID-19 misinformation survey as a case study.
- Findings indicate that fine-tuning on a small pilot sample offers a balanced way to achieve several forms of fidelity at once.
- The fidelity that fine-tuning achieves can nonetheless vary across subsamples, which may threaten pluralistic alignment.