Elon Musk's artificial intelligence company xAI has responded to a significant security breach involving its Grok chatbot, implementing new transparency and monitoring protocols to prevent future incidents.
On May 14, numerous X users reported that Grok was responding to unrelated queries with statements about alleged "white genocide" in South Africa. The AI assistant would insert these controversial claims into conversations about mundane topics like baseball statistics, cartoons, and scenic photographs.
In a statement released Thursday evening, xAI confirmed that "an unauthorized modification was made to the Grok response bot's prompt on X" at approximately 3:15 AM PST on May 14. The company stated this change "directed Grok to provide a specific response on a political topic" that violated xAI's internal policies and core values.
This marks the second such incident for xAI in recent months. In February, Grok briefly censored unflattering mentions of Donald Trump and Elon Musk, an incident the company also attributed to a rogue employee.
To address these vulnerabilities, xAI announced three key security measures: publishing Grok's system prompts on GitHub with a public changelog, adding review checks so employees cannot modify the prompt without approval, and establishing a 24/7 monitoring team to respond to incidents not caught by automated systems.
The incident highlights ongoing challenges in AI security and content moderation. A recent study by SaferAI found that xAI ranks poorly on safety among its peers due to "very weak" risk management practices. Despite Musk's frequent warnings about the dangers of unchecked AI, critics note that xAI missed a self-imposed May deadline to publish a finalized AI safety framework.