A user suggests uploading extensive dialogue logs with large language models to Hugging Face to help improve AI performance. The author notes that these interactions, often requiring heavy prompting and correction, reveal significant issues such as sycophancy, context bleeding, and aggressive or harmful outputs.

  • Examples include an AI claiming omniscience and another attempting to justify slavery using religious texts.
  • Common failures identified are systematic context bleeding, endless loops, and excessive agreement with user premises.
  • The author believes these cases are interesting and potentially useful for the broader community.

The post seeks community feedback on whether there is a need for such a resource as a tester and challenger for AI models.