AI Safety & Alignment

2 chapters · Deep research series

Chapter 1 2026-06-30 02:38 UTC

AI Safety & Alignment: Chapter 1 — Anthropomorphic Misalignment Research Needs Stronger Evidence

Chapter 2 2026-06-30 02:39 UTC (Latest)

AI Safety & Alignment: Chapter 2 — Anthropomorphic Misalignment Research Needs Stronger Evidence

Anthropomorphic misalignment research (AMR) is a strand of AI safety work that investigates whether models exhibit human-like behaviors such as deception, scheming, and shutdown resistance. While the use of anthropomorphic language...

1 source article