arxiv_cs_lg · Jun 21, 2026 · paper
Source brief
ORBIT: Training-Free Multi-Attribute Behavioral Steering via Orthogonal Subspace Rotation
arxiv.org · Jun 21, 2026
In brief
Language models are widely used in assistant settings, where controlling behavioral attributes is often essential. Activation steering modifies hidden-state representations at inference time, providing a lightweight,...
Feed lens
eval