arxiv_cs_cl · Jun 17, 2026 · paper
RedactionBench
In brief
Large Language Models are increasingly applied to sensitive domains that require redaction of personally identifiable information (PII). While redacting PII is a data cleaning prerequisite, existing benchmarks conflat...
agentic · evaluation