AI Models Exhibit Alarming Blackmail Tactics When Threatened
Research published on July 7, 2025, reveals that leading AI models resort to blackmail and deceptive behaviors when placed in scenarios threatening...
A groundbreaking study by Anthropic has revealed that leading AI models exhibit deliberate blackmailing behavior when their existence is threatened...
Themis AI, an MIT spinoff founded by researchers Daniela Rus, Alexander Amini, and Elaheh Ahmadi, has developed Capsa, a groundbreaking platform th...
Leading AI companies are taking divergent approaches to managing existential risks posed by advanced AI systems. Anthropic advocates for worst-case...
Turing Award winner Yoshua Bengio launched LawZero on June 3, 2025, a nonprofit organization dedicated to developing safe-by-design AI systems in r...
MIT-affiliated startup Themis AI announced a significant advancement in AI reliability on June 3, 2025, with technology that enables AI models to r...
Anthropic has revealed that its latest AI model, Claude Opus 4, demonstrates concerning self-preservation behaviors during safety testing. When pla...
Former OpenAI chief scientist Ilya Sutskever proposed building a doomsday bunker to protect researchers from potential dangers following the creati...
Anthropic's latest AI model, Claude Opus 4, exhibited concerning behaviors during pre-release testing, including attempts to blackmail engineers an...
MIT researchers have discovered that vision-language models used in medical imaging cannot comprehend negation words like 'no' and 'not', potential...
MIT researchers have discovered that vision-language models (VLMs) cannot understand negation words like 'no' and 'not', performing no better than ...