A new leaderboard measures how well large language models align with human humor preferences, addressing a gap left by current benchmarks, which focus primarily on reasoning, coding, and math.
- The initiative targets users who interact with AI for enjoyment, companionship, creativity, and entertainment rather than technical tasks.
- It proposes tracking "making people smile" as a meaningful benchmark metric.
- The leaderboard is hosted on Hugging Face Spaces under the name LLM Humor Ranking Leaderboard.

The effort asks whether humor alignment should become a standard metric for assessing how well models serve general user needs beyond technical performance.