Anthropic Reveals Natural Language Autoencoders for Interpreting Claude's Internal Representations

Friday, May 8, 2026