It’s easy to get excited about the power of AI agents and everything they can accomplish autonomously, saving us considerable amounts of time and effort. State-of-the-art agents are already useful in many ways, and we can imagine even more capabilities that seem just around the corner. Academic research is showing us what is possible, and it’s only a matter of time until these new capabilities appear on the market in software products.
Yet there is a gap between what’s theoretically possible and what we can successfully put into production, and it comes down to two opposing forces: confidence and uncertainty.
Of all the factors that influence an AI agent’s success — software, data, architecture and more — perhaps one of the most critical but least discussed is uncertainty. Without high confidence that an agent will make good decisions and take an appropriate course of action, we shouldn’t trust it in production. And this high confidence develops only if the agent is provably effective at dealing with uncertainty in its many forms.
Let’s discuss the concept of uncertainty within agentic workflows and why it is important to explicitly acknowledge and address it. I will describe some strategies that can reduce critical uncertainties and prevent related problems from arising.
What Is Uncertainty in the Context of Agentic Workflows?
Uncertainty, in general, is the condition of not knowing something relevant, whether facts about the past or present, or something that might happen in the future.
In agentic architectures, uncertainty can arise from ambiguous instructions, missing or unreliable data, limitations in the agent’s ability to reason through complex decisions, among other causes. When an AI agent tries to operate in the presence of high uncertainty — and if the agent doesn’t acknowledge and handle it properly — it may struggle to determine the correct course of action, leading to mistakes, inaction or negative outcomes.
To handle uncertainties that arise, agents must be able to assess the situation, including all relevant information and resources, at each stage of a task — whether interpreting a request, retrieving necessary information or making a decision. If there is too much uncertainty to make a decision and take action safely, the agent needs to realize this and take steps to reduce it — by seeking clarification, checking additional resources or refining its reasoning — or else hold off until it can proceed with confidence.
In agentic architectures, uncertainties can fall into the following categories:
- Uncertainty in goals – Goals may be ambiguous or unclear.
- Uncertainty in resources – Some necessary resources may not be available or reliable.
- Uncertainty around unknowable information – Relevant future events, and other facts that simply can’t be known (yet).
- Uncertainty in reasoning – Complex or difficult reasoning tasks may be too tough to solve with certainty.
When an issue around uncertainty presents itself, it is helpful to determine which type of uncertainty is present, because that helps us form a strategy to find a resolution — and it can also help us design an agent to find its own resolution autonomously. Even if an uncertainty doesn’t fit squarely into one category, which is common, noticing which category aligns most closely with the issue can help devise a strategy to address it.
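To make this taxonomy concrete, here is a minimal Python sketch (with purely illustrative names) of how an agent might tag the dominant uncertainty type it has detected and look up a first-choice mitigation, roughly following the strategies discussed below.

```python
from enum import Enum, auto

class Uncertainty(Enum):
    GOALS = auto()        # the request or objective is ambiguous
    RESOURCES = auto()    # needed data, tools or APIs are missing or unreliable
    UNKNOWABLE = auto()   # depends on facts or future events that can't be known yet
    REASONING = auto()    # the decision is too complex to resolve with confidence

# Hypothetical mapping from the dominant uncertainty type to a first-choice
# mitigation, echoing the strategies covered later in this article.
DEFAULT_STRATEGY = {
    Uncertainty.GOALS: "ask the user for clarification or escalate to a human",
    Uncertainty.RESOURCES: "fetch or verify the missing resource before acting",
    Uncertainty.UNKNOWABLE: "defer action, or proceed only with explicit caveats",
    Uncertainty.REASONING: "run a secondary check or hand off to a specialist agent",
}

def pick_strategy(dominant: Uncertainty) -> str:
    """Return a candidate mitigation for the most prominent uncertainty type."""
    return DEFAULT_STRATEGY[dominant]

print(pick_strategy(Uncertainty.REASONING))
```

A real agent would of course combine several of these mitigations, but even a coarse mapping like this makes the uncertainty explicit rather than implicit.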
Uncertainty is not a flaw in AI agents; it’s an inherent part of decision-making in dynamic real-world environments. To build effective AI agents, we don’t need to eliminate uncertainty entirely, but we do need to design systems that can recognize and manage it. Strategies such as confidence estimation, fallback mechanisms, human-in-the-loop intervention, risk-aware reasoning, guardrails and specialization can help reduce uncertainty as well as risk.
Let’s look more closely at several of these strategies that agentic architectures can use to improve reliability, ensuring that agents make well-informed decisions and act in ways that align with user expectations.
Strategies to Reduce Uncertainty in Agents
There are many ways to reduce uncertainty in agent workflows. None of them are foolproof, but some can be very effective, depending on the use case. Some helpful strategies for reducing uncertainty are:
- Human in the loop: Pass the most difficult cases to a person.
- High awareness of uncertainty: The agent realizes it’s uncertain and stops to think a bit more.
- Secondary checking: Reconsider a decision, or get a second opinion from another agent.
- Specialization: Agents with expertise make decisions only within their domain.
- Menu of possible actions: An agent can perform only a finite set of specific actions.
Human in the Loop
Having human involvement may seem like it contradicts the notion that an agent is an independent actor, but in practice, it is often a reliable way to decrease risk without greatly decreasing workflow efficiency. If the agent can identify when it is most uncertain, it can request that a human review the situation and give input or make a decision, allowing the agent to continue in relative certainty and safety. If the agent is good at identifying the uncertain cases that require human help, and if these situations are not very common, then the whole process becomes only marginally less efficient, with the agent still doing the vast majority of the work.
Having a human in the loop is particularly helpful when there is uncertainty in goals or uncertainty in reasoning. In many ways, humans are still better than AI at navigating nuances in language, intent and reasoning.
In some industries, this approach is already standard practice. Since well before modern AI systems, for instance, banks and other companies have used automated phone-answering systems that offer callers a menu of options, including the option to speak with an employee. AI agents can take this type of approach to the next level, allowing far more complex and interactive cases to be automated before human involvement is required.
For example, an AI agent for personal banking might manage routine queries from users, but could request human review for complex or critical tasks involving external transfers, large sums of money or payments to new recipients. Ideally, such an AI agent should recognize when uncertainty is high and refer the task to a human rather than making a potentially costly mistake.
NVIDIA has published a thorough article on building a human-in-the-loop AI agent for social media content.
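As a rough illustration of the banking example above, here is a minimal Python sketch of an escalation check. The thresholds, field names and `needs_human_review` helper are all hypothetical; a real system would derive its rules from policy and risk review rather than a handful of hardcoded conditions.

```python
from dataclasses import dataclass

# Illustrative thresholds only; real values would come from policy and risk review.
LARGE_AMOUNT = 5_000.00
CONFIDENCE_FLOOR = 0.80

@dataclass
class TransferRequest:
    amount: float
    recipient_known: bool   # recipient already on the customer's payee list
    external: bool          # money leaves the customer's own accounts
    confidence: float       # agent's own confidence that it understood the request

def needs_human_review(req: TransferRequest) -> bool:
    """Escalate the risky or ambiguous cases; let the agent handle the rest."""
    return (
        req.confidence < CONFIDENCE_FLOOR
        or (req.external and req.amount >= LARGE_AMOUNT)
        or not req.recipient_known
    )

def handle(req: TransferRequest) -> str:
    if needs_human_review(req):
        return "queued for human review"   # hand off, with context attached
    return "executed by agent"             # routine case, proceed autonomously

# Routine internal transfer: handled autonomously.
print(handle(TransferRequest(200.0, recipient_known=True, external=False, confidence=0.95)))
# Large external transfer to a new recipient: escalated.
print(handle(TransferRequest(12_000.0, recipient_known=False, external=True, confidence=0.90)))
```

The key design choice is that the agent decides only whether to act or to escalate; the hard cases stay with a person.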
High Awareness of Uncertainty
AI agents operate best when they are aware of their own confidence levels and can adjust their behavior accordingly. When an agent encounters tasks with mixed certainty levels — where some aspects of a decision are highly confident while others are more ambiguous — it should take additional steps to reduce uncertainty or escalate the task to a human for review.
The strategy of maintaining awareness of uncertainty is particularly useful when there is uncertainty in resources or uncertainty in unknowable information. In these cases, we should design the agent to continually ask itself (or other systems) if it has enough information to make a confident decision. When there are signs that information is unreliable or missing, the agent can take action to gather what is needed.
For example, an AI agent designed for legal analysis might use a scoring system to evaluate the certainty of its responses. If a contract analysis tool determines that a particular clause matches a known precedent with high confidence, it can proceed without intervention. However, if the tool is unsure whether a clause introduces new legal risks, it can highlight it for further review rather than presenting a potentially incorrect interpretation as fact.
There’s ongoing academic research around this strategy, with some specific implementations called “uncertainty quantification” and “uncertainty-guided planning.”
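As a simplified illustration of confidence-threshold routing (not any specific uncertainty-quantification method), the following sketch assumes the agent can attach a confidence score to each analyzed clause, perhaps self-reported by the model or derived from token log probabilities. The thresholds and the `route_clause` helper are purely illustrative.

```python
# Hypothetical confidence thresholds for a contract-analysis agent.
PROCEED_AT = 0.85
ESCALATE_BELOW = 0.50

def route_clause(clause_id: str, confidence: float) -> str:
    """Decide what to do with one analyzed contract clause."""
    if confidence >= PROCEED_AT:
        return f"{clause_id}: accept analysis"                 # high confidence, proceed
    if confidence < ESCALATE_BELOW:
        return f"{clause_id}: flag for human review"           # too uncertain to act on
    return f"{clause_id}: gather more context and re-score"    # middle ground, keep working

for clause, score in [("7.2", 0.93), ("9.1", 0.62), ("11.4", 0.31)]:
    print(route_clause(clause, score))
```

The middle band is the important part: instead of a binary accept-or-reject, the agent has an explicit "reduce the uncertainty first" path.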
Secondary Checking
One of the paradoxes of large language models (LLMs) is that while they sometimes generate incorrect or hallucinated responses, they can also correct themselves when prompted to verify their own output. A secondary checking mechanism — whether self-verification or cross-checking with another model — can significantly reduce uncertainty in AI-generated responses.
This strategy is particularly effective when dealing with uncertainty in resources or uncertainty in reasoning. With these types of uncertainty, the agent can review the available resources or its own reasoning process a second time, with additional scrutiny or focus as needed, to confirm that the subsequent conclusions hold up.
For example, an AI agent for customer support using an LLM might generate an initial response based on a user inquiry. Before presenting the answer, the agent could prompt itself to double-check the response against its documentation using a retrieval-augmented generation (RAG) system. If discrepancies are detected, the response could be refined or flagged for human review.
LangChain has a nice introduction to some popular implementations of “reflection agents,” which are examples of this strategy.
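Here is a minimal sketch of such a self-check loop. `generate`, `retrieve` and `verify` are placeholder stand-ins for whatever LLM client and documentation index the agent actually uses, so only the control flow should be taken literally.

```python
# Sketch of a self-check (reflection-style) loop for a support agent.
# The three helpers below are placeholders, not a real client or index.

def generate(prompt: str) -> str:
    # Placeholder: call the underlying LLM here.
    return "draft answer"

def retrieve(query: str, k: int = 3) -> list[str]:
    # Placeholder: fetch the k most relevant documentation passages (RAG).
    return ["doc passage 1", "doc passage 2"]

def verify(draft: str, sources: list[str]) -> bool:
    # Placeholder: a second LLM call asking whether every claim in `draft`
    # is supported by `sources`, returning True only if it is.
    return False

def answer_with_self_check(question: str) -> str:
    draft = generate(question)
    sources = retrieve(question)
    if verify(draft, sources):
        return draft                       # draft holds up against the docs
    # Discrepancy found: regenerate with the sources in view, then re-check.
    revised = generate(question + "\n\nUse only these sources:\n" + "\n".join(sources))
    if verify(revised, sources):
        return revised
    return "[flagged for human review]"    # still uncertain after one revision pass

print(answer_with_self_check("How do I reset my account password?"))
```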
Specialization
A reliable way to prevent AI agents from making poor decisions is to ensure they operate strictly within their intended domain. An agent designed for one purpose shouldn’t be making decisions in areas where it lacks expertise or relevant context.
Specialization is particularly useful when there’s uncertainty in goals or uncertainty in reasoning. If the goals or the reasoning process don’t seem to be familiar and aligned with the specialization of the agentic workflow, the agent can refuse or defer action until a more clearly appropriate situation appears.
For example, a personal assistant AI focused on scheduling and calendar management should not be making decisions about email content beyond scheduling-related communication. Without proper restrictions, a scheduling AI might attempt to respond to emails outside its scope or make commitments it shouldn’t be authorized to handle. Guardrails can be implemented through well-structured prompt templates, explicit model instructions or additional verification layers that prevent the agent from stepping outside its designed function.
Specialization is typically achieved through some combination of prompt engineering, fine-tuning or model augmentations like RAG, and it is also an ongoing area of academic research, such as this paper on guardrails and off-topic prompt detection.
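A very small sketch of what a scope guardrail in front of a scheduling assistant might look like follows. The keyword check and system prompt are purely illustrative; a production system would more likely use a classifier or an LLM-based guardrail than substring matching.

```python
# Illustrative scope guard for a scheduling-only assistant.
SYSTEM_PROMPT = (
    "You are a scheduling assistant. You may only read calendars, propose "
    "meeting times and draft scheduling-related replies. Refuse anything else."
)

IN_SCOPE_HINTS = ("meeting", "calendar", "schedule", "reschedule", "availability")

def in_scope(request: str) -> bool:
    """Crude check that the request is about scheduling at all."""
    text = request.lower()
    return any(hint in text for hint in IN_SCOPE_HINTS)

def handle_request(request: str) -> str:
    if not in_scope(request):
        # Out of the agent's specialization: refuse or defer instead of guessing.
        return "I can only help with scheduling. Deferring this request."
    return f"[agent proceeds under system prompt: {SYSTEM_PROMPT[:40]}...]"

print(handle_request("Can you find 30 minutes with Dana next week?"))
print(handle_request("Reply to the vendor and negotiate a 10% discount."))
```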
Menu of Possible Actions
Instead of allowing an AI agent to freely decide how to use tools and APIs in an open-ended manner, restricting it to a set of well-defined actions can improve reliability and safety. This is especially useful when tools and APIs are complex or when improper usage could lead to unintended outcomes.
As with the specialization strategy above, the agent is restricted in some way, but instead of being limited to a particular domain of inputs or expertise, it is limited in its possible outputs or actions. These limits act as intelligent guardrails, as restrictive as necessary to balance effectiveness and safety.
A menu of possible actions is particularly effective when there is uncertainty in goals or uncertainty in resources. The menu acts like a final filter: whatever inputs and reasoning processes have already been considered, the last step is to confirm that the action about to be taken is well-defined and familiar, and that the agent has all of the necessary inputs and resources to carry out that particular menu item.
For example, in software development, AI coding assistants can be restricted to modifying specific file types or suggesting changes, rather than writing or executing code autonomously. Or, in health care, AI agents assisting in diagnostics might be limited to providing reference data and symptom-matching rather than making treatment decisions outright.
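To show the shape of such a restriction, here is a small hypothetical action registry in Python. The action names, required inputs and `dispatch` helper are illustrative, not any particular framework’s API; the point is that anything off the menu, or missing its required inputs, is rejected before it runs.

```python
# A minimal action menu: the agent may only invoke actions registered here,
# and each call is validated before it is executed. Names are illustrative.
ALLOWED_ACTIONS = {
    "suggest_edit": {"required": {"file_path", "patch"}},
    "open_issue":   {"required": {"title", "body"}},
    "run_linter":   {"required": {"file_path"}},
    # Deliberately absent: "execute_code", "push_to_main" and similar actions.
}

def dispatch(action: str, **kwargs) -> str:
    spec = ALLOWED_ACTIONS.get(action)
    if spec is None:
        return f"rejected: '{action}' is not on the menu"
    missing = spec["required"] - kwargs.keys()
    if missing:
        return f"rejected: '{action}' is missing inputs {sorted(missing)}"
    return f"accepted: '{action}' dispatched with {sorted(kwargs)}"

print(dispatch("run_linter", file_path="src/app.py"))
print(dispatch("deploy_to_prod", target="main"))          # not on the menu
print(dispatch("suggest_edit", file_path="src/app.py"))   # missing 'patch'
```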
Uncertainty in the Future of Agentic Workflows
With agents, as with humans, uncertainty is a natural part of solving real-world problems. Rather than jumping straight to the “best” known action, agents need to be aware of how confident they are in their goals, resources, unknowns and reasoning processes — and when uncertainty is high, they should act to reduce it where possible, or simply take no action until the situation changes and the uncertainty decreases. In either case, the agent needs to recognize and acknowledge the uncertainty before it can address it at all.
Agentic systems and workflows should embrace the concept of uncertainty at the design level, even as they attempt to reduce it. Ignoring uncertainty leads to overconfidence, bad decisions and actions with negative consequences. Designing agentic systems to be aware of uncertainty and to handle it well will continue to grow in importance as we allow our agents to do an increasing number of tasks autonomously on our behalf.
We may not know what agents will be able to do in one, five, or 10 years, but we know they will continue to routinely encounter uncertainty in goals, resources, unknowns and reasoning. And AI agents will need to admit when they don’t know enough so they can stop themselves before making bad choices that lead to adverse outcomes.
You can get started building AI agents with Langflow by following this guide.