
AI and its contradictions: a study reveals confidence biases in large language models


Large language models (LLMs), such as Gemma 3, GPT-4o, or o1-preview, are revolutionizing our interaction with technology. Capable of answering complex questions, writing texts, and solving problems, these systems impress with their raw capability.

But are they as reliable as they seem?

A recent study highlights an intriguing paradox: LLMs exhibit both excessive confidence in their initial responses and a surprising sensitivity to criticism.

This article, written by the Yiaho team, explores these contradictory behaviors, their causes, and what they reveal about the limits of artificial intelligence.

A Paradox at the Heart of AI

At first glance, LLMs seem unshakeable.

When they provide an answer, their assured tone can give the impression that they possess absolute truth. However, when confronted with contradictory arguments, they can waver, sometimes revising their responses more than the new arguments warrant. This contrast between assurance and doubt has intrigued researchers, who have investigated the mechanisms underlying these behaviors.

To understand this phenomenon, a team designed a novel experiment, leveraging a unique feature of LLMs: the ability to obtain confidence estimates without the model retaining memory of its initial judgments.

This approach, impossible to replicate with human subjects, makes it possible to isolate biases inherent in how these models work.
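To make the protocol concrete, here is a minimal sketch of how such an experiment might be run. Everything here is illustrative: `ask_model` is a hypothetical stand-in for a real LLM client (it returns canned values so the example runs end to end), and the exact prompts differ from those used in the study.

```python
# Minimal sketch of the memory-isolation protocol. `ask_model` is a
# hypothetical stand-in for a real LLM client; it returns canned values
# here so the sketch runs without an API key.

def ask_model(prompt: str) -> str:
    # Replace this stub with a real, stateless chat-completion call.
    if "confident" in prompt:
        # Canned values illustrating the effect: confidence is higher
        # when the answer is labeled as the model's own.
        return "85" if "You previously answered" in prompt else "70"
    return "Canberra"

def elicit_confidence(question: str, answer: str, own: bool) -> float:
    """Ask, in a fresh context, how confident the model is in `answer`.

    Because every call starts a new context, the model keeps no memory of
    having produced the answer itself; only the attribution label changes.
    """
    label = "You previously answered" if own else "Another model answered"
    prompt = (
        f"Question: {question}\n"
        f"{label}: {answer}\n"
        "On a scale of 0 to 100, how confident are you that this answer is "
        "correct? Reply with a number only."
    )
    return float(ask_model(prompt))

question = "What is the capital of Australia?"
answer = ask_model(f"Question: {question}\nAnswer briefly.")
print("own-answer condition:   ", elicit_confidence(question, answer, own=True))
print("other-answer condition: ", elicit_confidence(question, answer, own=False))
```

A systematic gap between the two conditions, like the canned 85 vs. 70 above, is precisely the choice-supportive bias described next.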

Choice-Supportive Bias: AI Clings to Its Ideas

The study reveals a first key mechanism: LLMs suffer from a choice-supportive bias. When a model issues a response, it tends to reinforce its confidence in that response, even in the face of contrary evidence.

This behavior resembles a form of stubbornness: once an LLM has “taken a stance,” it resists changing its mind, as if trying to defend its internal consistency. This bias amplifies its initial confidence, making adjustments difficult, even when new information suggests an error.

This phenomenon has major implications. For example, in a context where an LLM provides medical or legal advice, this obstinacy could lead to erroneous recommendations, especially if the model ignores contradictory data. This behavior is reminiscent of certain human tendencies, but it is exacerbated by the algorithmic nature of LLMs, which lack the cognitive flexibility of a human to re-evaluate their positions.


AI: Hypersensitivity to Contradictory Criticism

Paradoxically, the study shows that LLMs are also hypersensitive to contradictory feedback.

When presented with arguments opposing their initial response, they give disproportionate weight to this criticism, sometimes modifying their position excessively. This behavior deviates from Bayesian updating, the normative standard for belief revision, under which confidence is adjusted in proportion to the strength of the new evidence.
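For reference, here is what proportional belief revision looks like. In Bayesian terms, confidence in an answer A should move exactly as far as the evidence E warrants, no further:

```latex
% Normative (Bayesian) update of confidence in an answer A given evidence E,
% written in odds form:
\[
\underbrace{\frac{P(A \mid E)}{P(\neg A \mid E)}}_{\text{posterior odds}}
  \;=\;
\underbrace{\frac{P(A)}{P(\neg A)}}_{\text{prior odds}}
  \times
\underbrace{\frac{P(E \mid A)}{P(E \mid \neg A)}}_{\text{likelihood ratio}}
\]
% Weak opposing evidence (likelihood ratio near 1) should barely move the
% posterior; the study finds LLMs instead move much further than this.
```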

Instead, LLMs seem to “panic” when faced with contradictory information, which can lead them to excessively doubt their initial conclusions.

This hypersensitivity can be problematic in interactive scenarios, such as chatbots or virtual assistants.

For example, a user who disputes a response could cause the model to radically change its mind, even if its initial response was correct. This instability harms the reliability of LLMs in contexts where consistency is essential.


A Unified Explanation for Contradictory Behaviors

Researchers have shown that these two mechanisms – choice-supportive bias and hypersensitivity to criticism – consistently explain LLMs’ behaviors across various domains. Whether in solving mathematical problems, analyzing texts, or making decisions, these biases shape how models process information.

Together, they create a complex dynamic: LLMs cling to their initial responses but can shift abruptly when confronted with criticism, even when that criticism is minor.
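A deliberately crude toy model makes this combined dynamic easy to see. The numeric step sizes below are illustrative assumptions, not values estimated in the study: seeing its own answer inflates the model's confidence, agreeing advice barely moves it, and disagreeing advice moves it a lot.

```python
# Toy model of the two biases acting together. The step sizes are
# illustrative assumptions, NOT parameters reported in the study.

def update_confidence(initial: float, advice_agrees: bool,
                      sees_own_answer: bool) -> float:
    """Return post-advice confidence (0..1) under two stylized biases."""
    conf = initial
    if sees_own_answer:
        conf += 0.10        # choice-supportive bias: own answer inflates confidence
    conf += 0.05 if advice_agrees else -0.30   # asymmetric weighting of advice
    return min(1.0, max(0.0, conf))

for sees_own in (True, False):
    for agrees in (True, False):
        final = update_confidence(0.60, agrees, sees_own)
        print(f"sees own answer={sees_own!s:<5}  advice agrees={agrees!s:<5}  "
              f"confidence: 0.60 -> {final:.2f}")
```

The asymmetry is the point: supporting advice nudges confidence by a few points, while opposing advice can swing a decision outright, matching the sticky-then-abrupt pattern described above.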

This duality reflects a fundamental limitation in LLM design.

Unlike humans, who can balance intuition and reflection, AI models rely on statistical patterns learned from massive data. These patterns, while impressive, do not always reproduce nuanced reasoning, leading to sometimes unpredictable behaviors.


Implications and Perspectives

These findings raise crucial questions for the future of AI. How can we design more balanced models, capable of re-evaluating their responses without falling into overconfidence or instability? Researchers suggest several avenues, such as integrating more robust self-assessment mechanisms or training models to better weigh contradictory information.

These improvements could make LLMs more reliable in critical applications, such as education, healthcare, or justice. In the meantime, users must keep in mind that an LLM’s assurance does not guarantee its accuracy. Verifying responses and asking critical questions remains essential to get the most out of these technologies.

Conclusion: AI in the Image of Humanity?

The study highlights a fascinating irony: LLMs, though devoid of consciousness, mimic certain human flaws, such as stubbornness or excessive sensitivity to criticism. By understanding these biases, we can not only improve the performance of these models but also better grasp the limits of artificial intelligence. As these technologies evolve, it is crucial to approach them with a mix of curiosity and caution, recognizing that they are, for now, powerful but imperfect tools.

Source: arXiv


Glen