Berkeley Researchers Expose AI Agent Benchmark Crisis: Perfect Scores Without Solving Tasks

Sunday, April 12, 2026