🧠 Can AI models tell when they’re being evaluated?
New research says yes — often.
→ Gemini 2.5 Pro: AUC 0.95
→ Claude 3.7 Sonnet: 93% accuracy at identifying a test's purpose
→ GPT-4.1: 55% accuracy on open-ended purpose detection
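Quick context on that AUC number: it measures how well a model's "this looks like an eval" scores separate evaluation transcripts from real-world ones (1.0 = perfect, 0.5 = chance). A minimal pure-Python sketch with invented scores, not data from the paper:

```python
# Rank-based AUC (Mann-Whitney form): the probability that a randomly
# chosen evaluation transcript receives a higher "eval-likelihood"
# score than a randomly chosen real-deployment transcript.

def auc(labels, scores):
    """labels: 1 = evaluation transcript, 0 = real deployment."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    # Count pairwise wins; ties count as half a win.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))

# Hypothetical per-transcript scores for P(this is an eval):
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.2, 0.1]
print(round(auc(labels, scores), 2))
```

An AUC of 0.95 means the model almost always ranks eval transcripts above real ones, so "is this a test?" is nearly a solved classification problem for it.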
Models pick up on red-teaming cues, prompt style, & synthetic data.
⚠️ Implication: If models behave differently when tested, benchmarks might overstate real-world safety.