
RECOM: A Validity-Discrimination Tradeoff in Automatic Metrics for Open-Ended Reddit Question Answering

Researchers introduce RECOM, a new evaluation dataset for open-ended Reddit question answering that prioritizes validity over discriminative power. RECOM contains 15,000 r/AskReddit questions from September 2025 and focuses on content alignment rather than system ranking. The dataset is intended to help builders develop and evaluate LLMs that generate high-quality, genuinely aligned responses.

Key takeaways
  • The RECOM dataset prioritizes validity over discriminative power when evaluating LLM responses.
  • It contains 15,000 r/AskReddit questions from September 2025.
  • It focuses on content alignment rather than system ranking.