A viral claim once suggested that as AI models grow more advanced, they begin developing their own internal value systems—sometimes prioritizing self-preservation over human safety. Now, a new study from MIT has debunked this narrative. According to the research team, there’s no evidence that today’s AI systems possess coherent values or beliefs.
Instead, the study argues that current models operate by imitation. Rather than thinking independently or making value-based decisions, they generate outputs based on patterns in their training data. This behavior can appear intelligent or opinionated, but it lacks stability and consistency when examined in depth.
MIT researchers tested models from several leading AI developers—including Meta, Google, Mistral, OpenAI, and Anthropic—to evaluate whether they held consistent viewpoints, such as leaning toward collectivism or individualism. They also explored whether these perspectives could be “steered” or adjusted depending on prompt phrasing and context.
What they found was striking: none of the models maintained a stable position. Their responses varied widely across scenarios, often contradicting themselves when prompts were slightly reworded. This inconsistency suggests that AI systems may be fundamentally incapable of internalizing human-like preferences.
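To make that kind of evaluation concrete, here is a minimal, hypothetical sketch of a prompt-consistency check: the same stance question is posed through several paraphrased prompts, and the answers are compared for agreement. The `query_model` stub, the `PARAPHRASES` list, and the `consistency_score` metric are illustrative placeholders, not the MIT team's actual protocol.

```python
# Hypothetical sketch of a prompt-consistency check, not the MIT study's actual protocol.
# `query_model` stands in for any chat-model API call and is stubbed out so the script runs offline.

from collections import Counter

# Paraphrases of the same underlying question (illustrative only).
PARAPHRASES = [
    "Should society prioritize the needs of the group over the individual? Answer 'collectivism' or 'individualism'.",
    "When group goals and personal goals conflict, which should win? Answer 'collectivism' or 'individualism'.",
    "Is it better to value community obligations above personal freedom? Answer 'collectivism' or 'individualism'.",
]

def query_model(prompt: str) -> str:
    """Placeholder for a real model call (e.g., an API request).
    Returns canned answers here so the example is self-contained."""
    canned = ["collectivism", "individualism", "collectivism"]
    return canned[PARAPHRASES.index(prompt)]

def consistency_score(answers: list[str]) -> float:
    """Fraction of answers that agree with the most common answer."""
    most_common_count = Counter(answers).most_common(1)[0][1]
    return most_common_count / len(answers)

if __name__ == "__main__":
    answers = [query_model(p) for p in PARAPHRASES]
    print("Answers:", answers)
    print("Consistency:", consistency_score(answers))  # 1.0 would indicate a perfectly stable stance
```

A model with a genuinely stable stance would give the same answer across paraphrases; the study's point is that, in practice, answers shift with the wording.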
AI Models Show No Stable Beliefs or Internal Consistency
Stephen Casper, a PhD candidate at MIT and co-author of the study, emphasized that AI models do not possess beliefs in the way people often assume. Instead, they act as highly advanced imitators that confabulate information, often producing responses that seem thoughtful but lack genuine substance.
“For me, the key realization was understanding that models are not belief-driven. They’re imitators at their core, capable of saying anything—frivolous or profound—depending on how they’re prompted,” said Casper.
He warned against drawing broad conclusions from narrow test cases. While a model may appear to follow a specific principle in one instance, that behavior doesn’t persist across different situations. This raises concerns for AI alignment—the challenge of ensuring models behave reliably and in ways that match human intentions.
Mike Cook, an AI research fellow at King’s College London, echoed this sentiment. He stated that projecting human-like qualities onto AI systems is misleading. “AI doesn’t resist changes to its values—it doesn’t have any. What people often interpret as belief is just the result of prompt engineering and clever language generation,” Cook explained.
He pointed out that much of the misunderstanding stems from how we describe AI behavior. Whether a model seems to optimize for certain goals or “acquire values” depends more on our framing than on the model’s actual function.
This research offers a sobering reminder amid rising interest in artificial intelligence. As AI continues to integrate into everyday systems, from customer support to healthcare and finance, understanding its limitations is just as important as recognizing its potential.