📰 Story

hackernews_ai · Jun 20, 2026 · news

Ask HN: What are some good benchmarks for different agent harnesses?

In brief

Other than Terminal-Bench, which doesn't quite map to my experience, what are some other benchmarks for seeing how different models do in different harnesses?

agentharness
Read the original at news.ycombinator.com