Story
arxiv_cs_lg ยท Jul 2, 2026 ยท paper
arxiv.orgJul 2, 2026
original source linked
In brief
Conventional reinforcement learning strategies for visual generation typically employ sample-wise reward functions, yet this practice frequently results in reward hacking that degrades image diversity and introduces v...
Feed lens
evaluation