Why One AI Agent Is Not Enough When You Are Talking to Your Data

Jun 4

By Yigit Unver

Most AI products that let you chat with your data are built the exact same way. You take a single large language model, connect it to your database and ask it a question.

It works beautifully in a carefully staged demo. It fails instantly in production.

I learned this the hard way over the past few months while building Chat with your Analyst. This is the AssistYou product feature that lets you query your voice agent analytics using plain language.

This article is the honest story of what went wrong with our first version, why a single AI agent simply cannot do the job and the architectural decision that finally fixed it.

The Single Agent Trap

When I built the first version of this feature, I relied on a single agent. One large language model was given the user's question, the relevant data and a prompt detailing how to behave.

Almost immediately, I hit a massive engineering wall. Two major failure modes appeared.

The first was hallucination. When the model was asked to produce numbers it could not directly compute, it sometimes invented plausible-looking values. If a user asked for a call resolution rate, the model would confidently spit out a number that looked correct but had absolutely no basis in reality. In an enterprise analytics product, this is catastrophic. You cannot make business decisions on fake numbers.

The second failure was the constant trade-off between quantitative and qualitative performance. When I tuned the system to maximise mathematical accuracy on numerical questions, the quality of the narrative summaries degraded. When I tuned the model to write better narrative summaries, the numbers started drifting.

I was asking the exact same model to do two jobs that pull in completely opposite directions. Getting better at one made it worse at the other.

Divide and Conquer

The solution to this problem is older than artificial intelligence itself. It is separation of concerns, a core principle that has guided software engineering for decades. When one component is being asked to do too much, you split it.

I completely rebuilt Chat with your Analyst around a multi-agent architecture. Instead of a single model trying to handle every request, the system now uses a highly coordinated team of specialised agents.

An orchestrator manages the user, and a team of sub-agents does the heavy lifting.

How the Orchestrator Actually Works

The orchestrator is the only agent that actually talks to you. When you send a message, it follows a strict three-step process.

First, it interprets your intent and runs a clarifying loop. If your prompt is vague, the orchestrator does not guess. It outright refuses to invent an answer and instead asks you for clarification. This strict boundary prevents ambiguous requests from ever reaching the sub-agents.
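The clarifying step can be sketched as a small gate that sits in front of the sub-agents. This is only an illustration under stated assumptions, not the production logic: the `Intent` fields, the `QUESTIONS` table and the `next_step` function are hypothetical names, and a real system would obtain the parsed intent from a language model rather than build it by hand.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Intent:
    """Hypothetical parsed intent; real fields would depend on the product."""
    metric: Optional[str] = None       # e.g. "resolution_rate"
    time_range: Optional[str] = None   # e.g. "last_7_days"


# Hypothetical clarifying questions, one per required field.
QUESTIONS = {
    "metric": "Which metric would you like to see?",
    "time_range": "Which time period should I look at?",
}


def missing_fields(intent: Intent) -> list[str]:
    """Return the required fields the user has not specified yet."""
    return [name for name in QUESTIONS if getattr(intent, name) is None]


def next_step(intent: Intent) -> dict:
    """Ask for clarification instead of guessing when the intent is incomplete."""
    missing = missing_fields(intent)
    if missing:
        return {"action": "clarify", "question": QUESTIONS[missing[0]]}
    return {"action": "route", "intent": intent}
```

In this sketch, a request with a metric but no time period produces a clarifying question, and nothing is routed until every required field is filled in.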

Second, it builds a plan and routes the work in parallel. Based on your intent, the orchestrator triggers the appropriate sub-agents simultaneously. If you ask for numerical data and a written summary, the system does not wait for one to finish before starting the other.
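One common way to express this kind of fan-out in Python is `asyncio.gather`. The sketch below is a toy version under assumptions: the two agent functions are stand-ins that only sleep and return a string, and the `AGENTS` routing table is a hypothetical name, not the real system.

```python
import asyncio


async def analyst_agent(question: str) -> dict:
    """Stand-in for the quantitative sub-agent (hypothetical)."""
    await asyncio.sleep(0.05)  # simulates a slow data query
    return {"agent": "analyst", "result": "numbers for: " + question}


async def summariser_agent(question: str) -> dict:
    """Stand-in for the narrative sub-agent (hypothetical)."""
    await asyncio.sleep(0.05)  # simulates a slow model call
    return {"agent": "summariser", "result": "summary for: " + question}


# Hypothetical routing table from intent labels to sub-agents.
AGENTS = {"numbers": analyst_agent, "summary": summariser_agent}


async def route(question: str, needs: list[str]) -> list[dict]:
    """Start every required sub-agent at once and wait for all of them."""
    tasks = [AGENTS[need](question) for need in needs]
    # gather runs the coroutines concurrently and returns results in input order.
    return await asyncio.gather(*tasks)


def run(question: str, needs: list[str]) -> list[dict]:
    """Synchronous entry point for the sketch."""
    return asyncio.run(route(question, needs))
```

Because both stand-ins wait at the same time, the total wait is roughly that of the slowest sub-agent rather than the sum of the two.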

Third, it composes the response. The sub-agents return their results as structured internal messages. The orchestrator reviews the data and seamlessly assembles the final output.
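To make "structured internal messages" concrete, here is a minimal sketch of what such a message and the composing step might look like. The `AgentMessage` fields and the ordering rule are assumptions for illustration; the actual message format is not described in this article.

```python
from dataclasses import dataclass


@dataclass
class AgentMessage:
    """Hypothetical structured message returned by a sub-agent."""
    agent: str       # which sub-agent produced it
    kind: str        # "metric", "summary" or "quote"
    content: str     # the payload, already formatted as text
    ok: bool = True  # False when the sub-agent could not answer


def compose(messages: list[AgentMessage]) -> str:
    """Assemble one reply, leaving out failed results instead of filling gaps."""
    order = {"metric": 0, "summary": 1, "quote": 2}
    usable = [m for m in messages if m.ok]
    # Numbers first, then narrative, then supporting quotes.
    usable.sort(key=lambda m: order.get(m.kind, len(order)))
    if not usable:
        return "I could not find an answer in your data."
    return "\n".join(m.content for m in usable)
```

The design choice worth noting is that a failed sub-agent result is dropped rather than papered over, so the composed reply only contains what a sub-agent actually returned.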

You experience a single natural conversation. Underneath, an entire team of specialists just went to work for you.

Meet the Sub-Agents

Each sub-agent inside Chat with your Analyst is rigorously tested and built around one specific analytical capability.

The Analyst Agent

This sub-agent handles your quantitative requests. It builds the logic to fetch the correct data and computes the hard numbers. It returns exact matching records. It does not invent values and it never approximates.
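The difference between computing a number and generating one can be shown with a few lines of plain code. This is a sketch with made-up records and a hypothetical `resolution_rate` function; the real agent would build a query against the actual analytics store.

```python
from fractions import Fraction
from typing import Optional

# Made-up example records; a real system would fetch these from the database.
CALLS = [
    {"id": 1, "resolved": True},
    {"id": 2, "resolved": False},
    {"id": 3, "resolved": True},
    {"id": 4, "resolved": True},
]


def resolution_rate(calls: list[dict]) -> Optional[Fraction]:
    """Compute resolved / total exactly, or None when there is nothing to count."""
    if not calls:
        return None  # no data: report that, rather than inventing a number
    resolved = sum(1 for call in calls if call["resolved"])
    return Fraction(resolved, len(calls))
```

With the sample records, `resolution_rate(CALLS)` is `Fraction(3, 4)`, that is 75%, and an empty input yields `None` instead of a guess.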

The Summariser Agent

This sub-agent handles narrative extraction. It reads the relevant content and spots themes across thousands of calls. It is tuned purely for qualitative depth.

The Investigator Agent

This is our ultimate anti-hallucination weapon. This sub-agent dives into the raw transcripts and pulls exact quotes to provide verifiable evidence, so you can check each claim against what was actually said.
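What makes a quote verifiable is that it can be checked character for character against its source. A minimal sketch of that idea, assuming transcripts are available as a dictionary from call ID to text (the function names and data shape are hypothetical):

```python
import re


def find_quotes(transcripts: dict[str, str], phrase: str, width: int = 40) -> list[dict]:
    """Return verbatim snippets around `phrase`, each tagged with its call ID."""
    pattern = re.compile(re.escape(phrase), re.IGNORECASE)
    results = []
    for call_id, text in transcripts.items():
        match = pattern.search(text)
        if match is None:
            continue
        # Slice the original text so the snippet is never paraphrased.
        start = max(0, match.start() - width)
        end = min(len(text), match.end() + width)
        results.append({"call_id": call_id, "quote": text[start:end]})
    return results


def is_verbatim(quote: str, call_id: str, transcripts: dict[str, str]) -> bool:
    """Check that a quote appears exactly as written in the named transcript."""
    return quote in transcripts.get(call_id, "")
```

Any quote shown to a user can be passed through a check like `is_verbatim` before it is displayed, which rejects text that was reworded or attributed to the wrong call.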

The entire architecture is also multilingual by design. You can ask a question in English about a database of Dutch customer conversations and the agents will seamlessly process and translate the insights.

Why This Fixes the Data Trade-Off

The original problem was that one model could not be excellent at both math and storytelling. The multi-agent architecture solves this by no longer asking it to try.

The Analyst agent never has to write a narrative. The Summariser agent never has to compute a formula. The trade-off did not disappear because the underlying AI models magically got better. It disappeared because I changed the fundamental software architecture.

We stopped asking a generalist to be a specialist.

Where We Are Going Next

The roadmap for Chat with your Analyst extends in two very exciting directions.

The first is scheduled reporting. You will soon be able to ask the analyst a question, verify the answer and then schedule that exact analysis to run every Monday morning and drop directly into your inbox.
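The scheduling part of this is simple date arithmetic. As a rough sketch, assuming "Monday morning" means 09:00 and ignoring time zones (both are assumptions, and `next_monday_run` is a hypothetical helper):

```python
from datetime import datetime, timedelta


def next_monday_run(now: datetime, hour: int = 9) -> datetime:
    """Return the next Monday at `hour`:00 that is strictly after `now`."""
    days_ahead = (0 - now.weekday()) % 7  # Monday is weekday 0
    candidate = (now + timedelta(days=days_ahead)).replace(
        hour=hour, minute=0, second=0, microsecond=0
    )
    if candidate <= now:
        # It is already Monday at or past the run time, so use next week.
        candidate += timedelta(days=7)
    return candidate
```

A production scheduler would also need the user's time zone and a way to store the verified question, neither of which is covered here.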

The longer-term direction is total convergence with your Flow Builder. A voice agent flow generates analytics. The analytics describe exactly what is working and what is failing. A natural next step is for the analyst to suggest direct changes to the flow itself. A node where users frequently drop off could be flagged with a recommended redesign, which could then be tested and deployed.
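Flagging a drop-off node is, at its simplest, a rate and a threshold. The sketch below assumes per-node counts of how many callers entered and how many dropped; the data shape, the 30% threshold and the `flag_drop_offs` name are all hypothetical.

```python
def flag_drop_offs(
    node_stats: dict[str, dict], threshold: float = 0.3
) -> list[tuple[str, float]]:
    """Return (node, drop-off rate) pairs above `threshold`, worst first."""
    flagged = []
    for node, stats in node_stats.items():
        if stats["entered"] == 0:
            continue  # no traffic, so no meaningful rate
        rate = stats["dropped"] / stats["entered"]
        if rate > threshold:
            flagged.append((node, rate))
    return sorted(flagged, key=lambda item: item[1], reverse=True)
```

A flagged node would be the starting point for a suggestion, not a change applied automatically.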

The boundary between analysing your operations and actively improving them is going to disappear completely.

Architecture Always Beats the Model

It is tempting to assume that as language models get bigger, these kinds of structural decisions will become unnecessary. People assume a massive model will simply handle every kind of question perfectly.

This completely misses the point.

A bigger model still cannot be optimised for two contradictory objectives at the same time. A bigger model still produces much better results when the work is decomposed into steps that can be reasoned about, tested and improved independently.

Multi-agent orchestration is not a temporary workaround for current model limitations. It is how reliable, enterprise-grade systems are built. Software engineers learned this decades ago. AI engineers are relearning it right now.

Frequently Asked Questions

What is a multi-agent architecture in AI products?

A design pattern where multiple specialised AI agents work together under a coordinator rather than a single agent handling every task. Each agent is optimised for a specific job which dramatically improves reliability.

What is an orchestrator agent?

The agent that sits between the user and the specialised sub-agents. It interprets intent, runs clarifying loops to prevent bad inputs and triggers the correct sub-agents to run in parallel.

Why is one AI agent not enough for analytics?

Quantitative analysis requires strict mathematical precision. Qualitative analysis requires nuance and narrative depth. Splitting the work across specialised sub-agents completely removes the trade-off between the two.

How does this prevent hallucinations?

Multi-agent architecture gives each agent a narrow job that is harder to get wrong. Furthermore, the orchestrator refuses vague inputs and the Investigator sub-agent provides verifiable evidence by surfacing exact quotes from raw conversation logs.

What is Chat with your Analyst?

The AssistYou product feature that lets users query their AI voice agent analytics data through natural language. Instead of building dashboards, users ask questions directly and receive answers grounded in their actual operational data.