Story
arxiv_llm_reliability · Jun 29, 2026 · paper
arxiv.org · Jun 29, 2026
In brief
Traditional automatic evaluation methods have been shown to be unsuitable for modern Chinese poetry because of the distinct nature of this literary genre. Human evaluation remains reliable, but is expensive and not ap...
Feed lens
evaluation