A new leaderboard measures how well large language models align with human humor preferences, addressing a gap in current benchmarks, which focus primarily on reasoning, coding, and math.

  • The initiative targets users who interact with AI for enjoyment, companionship, creativity, and entertainment rather than technical tasks.
  • It proposes tracking "making people smile" as a meaningful benchmark metric.
  • The leaderboard is hosted on Hugging Face Spaces under the name LLM Humor Ranking Leaderboard.

This effort aims to evaluate whether humor alignment should become a standard metric for assessing how well models serve general user needs beyond technical performance.