Your AI Agent's Queries Run Fine. That Could Be A Problem.

June 16, 2026 · 7 min read

There's a particular kind of failure that doesn't announce itself. The query parses. It executes. It returns a clean table of numbers, formatted and ready to drop into a slide. Nobody sees a stack trace. The agent reports success, the dashboard renders, and the number is wrong.

This is the failure mode that should keep data leaders up at night, and it's the one we talk about least. We've spent years worried about whether language models can write valid SQL. That problem is largely solved — modern agents generate syntactically correct queries with high reliability. The unsolved problem is whether the query means what the agent thinks it means. A syntax error is loud, visible, and recoverable. A semantic error is silent, plausible, and load-bearing on someone's decision.

The instinct is to treat this as a model-capability problem — if the agent were smarter, it would get the join right. It wouldn't. The information needed to write a correct query usually isn't in the schema at all. It lives in the business: in the grain a table is stored at, in which of four date columns counts as "the order date," in the rule that defines an active customer this quarter. An agent looking at column names is reconstructing that meaning by inference, every single time, and inference fails quietly. This is a governance problem wearing a model-capability costume.

A semantic layer is how you stop inferring and start declaring. Here are the six families of error it exists to prevent.

1. Grain: combining numbers that don't live at the same level

Every measure is stored at some level of detail — a grain. Order revenue lives at the order line. An inventory snapshot lives at one row per product per day. When an agent joins the two and rolls both up by month, the SQL is flawless and the result is inflated, because it's multiplying a daily snapshot across every order that touched that product.

The same blind spot produces semi-additive errors. An account balance can be summed across accounts but not across time — summing twelve monthly balances doesn't give you an annual balance, it gives you nonsense. Snapshot-versus-transaction confusion is the same disease: summing a daily count of active customers across a month treats a state measurement as if it were a stream of events.

A semantic layer declares each entity's grain and encodes additive behavior explicitly — sum, last value, period-end, average snapshot, non-additive — so the unsafe combination is rewritten or refused before it runs.
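As a rough illustration of the semi-additive case, the sketch below uses Python's built-in sqlite3 module. The table `account_balance` and its rows are invented for this example. It contrasts a naive sum across time with a period-end rule of the kind a semantic layer might encode.

```python
import sqlite3

con = sqlite3.connect(":memory:")
# Hypothetical month-end balance snapshots for two accounts.
con.executescript("""
CREATE TABLE account_balance (account_id TEXT, month_end TEXT, balance REAL);
INSERT INTO account_balance VALUES
  ('A', '2025-01-31', 100), ('A', '2025-02-28', 120), ('A', '2025-03-31', 90),
  ('B', '2025-01-31', 50),  ('B', '2025-02-28', 50),  ('B', '2025-03-31', 70);
""")

# Unsafe: treats a balance as if it were additive across time.
naive_total = con.execute(
    "SELECT SUM(balance) FROM account_balance"
).fetchone()[0]

# Period-end behavior: keep only the last snapshot in the window,
# then sum across accounts, the one direction a balance is additive in.
period_end_total = con.execute("""
    SELECT SUM(balance)
    FROM account_balance
    WHERE month_end = (SELECT MAX(month_end) FROM account_balance)
""").fetchone()[0]

print(naive_total, period_end_total)  # 480.0 160.0
```

Both queries are valid SQL. Only the second respects the measure's additive behavior, and nothing in the schema says which one to write.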

2. Cardinality: joins that multiply rows

Fan and chasm traps are the famous ones, but they're instances of a broader class. Any non-unique join path causes rows to fan out: customers belonging to multiple segments, products tagged with multiple categories, employees assigned to multiple departments. The agent joins through a bridge, every fact row gets duplicated, and the totals quietly double or triple.

Join-filter leakage is the subtle cousin. Restrict returns to one product category and, because returns and sales were joined before aggregation, the filter silently drops every sale that had no matching return. The population changed underneath the metric and nothing flagged it.

A semantic layer carries cardinality metadata and governs bridge relationships — applying allocation, weighting, or distinct aggregation, and aggregating facts independently before they're combined rather than after.
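The bridge fan-out can be reproduced in a few lines. This is a minimal sketch with invented tables (`sales`, `customer_segment`) and data. The `EXISTS` rewrite is one possible safe formulation, not the only one.

```python
import sqlite3

con = sqlite3.connect(":memory:")
# Hypothetical data: customer c1 belongs to two segments via a bridge table.
con.executescript("""
CREATE TABLE sales (sale_id INTEGER, customer_id TEXT, amount REAL);
CREATE TABLE customer_segment (customer_id TEXT, segment TEXT);
INSERT INTO sales VALUES (1, 'c1', 100), (2, 'c2', 200);
INSERT INTO customer_segment VALUES
  ('c1', 'smb'), ('c1', 'retail'), ('c2', 'enterprise');
""")

true_total = con.execute("SELECT SUM(amount) FROM sales").fetchone()[0]

# Joining through the bridge duplicates c1's sale once per segment.
fanned_total = con.execute("""
    SELECT SUM(s.amount)
    FROM sales s
    JOIN customer_segment cs ON cs.customer_id = s.customer_id
""").fetchone()[0]

# A semi-join filters on segment membership without multiplying fact rows.
smb_or_retail_total = con.execute("""
    SELECT SUM(s.amount)
    FROM sales s
    WHERE EXISTS (
        SELECT 1 FROM customer_segment cs
        WHERE cs.customer_id = s.customer_id
          AND cs.segment IN ('smb', 'retail')
    )
""").fetchone()[0]

print(true_total, fanned_total, smb_or_retail_total)  # 300.0 400.0 100.0
```

With a plain join and the same segment filter, c1's sale would be counted twice, once per matching segment row.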

3. Aggregation behavior: the right inputs, the wrong math

Some of the most confident wrong answers come from averaging an average or summing a ratio. An agent asked for margin will happily write:

SUM(margin_percent)

when the only correct formulation is:

SUM(profit) / SUM(revenue)

Ratios, rates, and conversion metrics never sum. Neither do pre-computed averages: take the average of each store's average transaction value and you've given a corner store the same weight as a flagship, because the math no longer knows the transaction volumes underneath. Distinct counts have their own trap — count customers on the wrong identifier, or add distinct counts across groups, and the number is invalid in a way no error message will catch.

Governed measures preserve the correct aggregation formula and define weighted averages from their underlying numerator and denominator, so the metric computes the same way no matter who or what assembled the query.
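To make the gap concrete, here is a small sketch with two invented stores. The table `store_sales` and its figures are assumptions for illustration. It compares summing or averaging the stored percentage with recomputing margin from its numerator and denominator.

```python
import sqlite3

con = sqlite3.connect(":memory:")
# Hypothetical stores: a large flagship at 10% margin, a small corner store at 50%.
con.executescript("""
CREATE TABLE store_sales (store TEXT, revenue REAL, profit REAL, margin_percent REAL);
INSERT INTO store_sales VALUES
  ('flagship', 1000, 100, 10.0),
  ('corner',    100,  50, 50.0);
""")

summed, averaged, governed = con.execute("""
    SELECT SUM(margin_percent),                 -- summing a ratio
           AVG(margin_percent),                 -- unweighted average of ratios
           100.0 * SUM(profit) / SUM(revenue)   -- ratio of sums
    FROM store_sales
""").fetchone()

print(summed, averaged, round(governed, 2))  # 60.0 30.0 13.64
```

The unweighted average gives the corner store the same weight as the flagship. The ratio of sums weights each store by its revenue.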

4. Temporal semantics: as-was versus as-is

Time is where agents are most fluently wrong. The classic case is slowly changing dimensions: an agent joins a two-year-old order to the current customer record — current territory, current segment, current rep — instead of the version that was valid when the order was placed. The report attributes history to a present that didn't exist yet.

Layered on top of that are the perennials: UTC mixed with local time, calendar months standing in for fiscal periods, week boundaries and daylight-saving transitions handled inconsistently from one query to the next.

A semantic layer supplies canonical date dimensions, fiscal hierarchies, and reporting time zones, and encodes effective-date joins that distinguish "as was" reporting from "as is."
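The as-was versus as-is distinction comes down to the join condition. The sketch below assumes a type 2 history table with `valid_from` and `valid_to` columns and a far-future sentinel date for the current row. Those names and conventions are illustrative, not taken from any particular product.

```python
import sqlite3

con = sqlite3.connect(":memory:")
# Hypothetical history: customer c1 moved from East to West on 2024-07-01.
con.executescript("""
CREATE TABLE customer_history (
  customer_id TEXT, territory TEXT, valid_from TEXT, valid_to TEXT);
CREATE TABLE orders (
  order_id TEXT, customer_id TEXT, order_date TEXT, amount REAL);
INSERT INTO customer_history VALUES
  ('c1', 'East', '2023-01-01', '2024-06-30'),
  ('c1', 'West', '2024-07-01', '9999-12-31');
INSERT INTO orders VALUES ('o1', 'c1', '2023-05-10', 500);
""")

# "As is": attribute the old order to the customer's current territory.
as_is = con.execute("""
    SELECT h.territory
    FROM orders o
    JOIN customer_history h
      ON h.customer_id = o.customer_id
     AND h.valid_to = '9999-12-31'
""").fetchone()[0]

# "As was": pick the version that was valid on the order date.
# ISO-8601 date strings compare correctly as text.
as_was = con.execute("""
    SELECT h.territory
    FROM orders o
    JOIN customer_history h
      ON h.customer_id = o.customer_id
     AND o.order_date BETWEEN h.valid_from AND h.valid_to
""").fetchone()[0]

print(as_is, as_was)  # West East
```

Either answer can be the right one depending on the question. What matters is that the choice is declared rather than made implicitly by whichever join the agent happened to write.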

5. Population definition: who counts

"Active customer." "Qualified lead." "Retained account." Each of these is a business rule spanning dates, statuses, exclusions, and thresholds — and an ungoverned agent reconstructs it slightly differently every time it's asked. One query counts anyone with a login in 90 days; the next counts anyone with a non-cancelled subscription; both call the result "active customers" and the two numbers never reconcile.

This is the difference between defining a metric and operationalizing meaning. A semantic layer makes the population itself a governed object — a named filter, entity, or measure definition — so the answer to "how many active customers" is the same answer every time, by construction.
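A minimal sketch of the drift, and of one way to pin the definition down. The `customers` table, the fixed as-of date, and the rule inside the `active_customer` view are all assumptions made up for this example. A real definition would come from the business.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE customers (
  customer_id TEXT, last_login TEXT, subscription_status TEXT);
INSERT INTO customers VALUES
  ('c1', '2026-06-01', 'active'),
  ('c2', '2026-01-01', 'active'),
  ('c3', '2026-06-10', 'cancelled'),
  ('c4', '2025-12-01', 'active');

-- Hypothetical governed definition: one named object that every query reuses.
-- The as-of date is hard-coded only to keep the example deterministic.
CREATE VIEW active_customer AS
SELECT customer_id
FROM customers
WHERE last_login >= date('2026-06-16', '-90 days')
  AND subscription_status <> 'cancelled';
""")

def count(sql):
    return con.execute(sql).fetchone()[0]

# Two ad hoc reconstructions of "active" that an ungoverned agent might write.
by_login = count(
    "SELECT COUNT(*) FROM customers "
    "WHERE last_login >= date('2026-06-16', '-90 days')")
by_subscription = count(
    "SELECT COUNT(*) FROM customers WHERE subscription_status <> 'cancelled'")

# The governed population, identical every time it is asked for.
governed = count("SELECT COUNT(*) FROM active_customer")

print(by_login, by_subscription, governed)  # 2 3 1
```

A database view is only a stand-in here. The point is that the population is a named object rather than a WHERE clause rewritten on each request.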

6. Relationship meaning: technically valid, semantically wrong

A fact table often references the same dimension several ways. An order has a created date, a shipped date, a cancelled date, and a payment date. Treated as interchangeable foreign keys, an agent filters or groups on whichever one it grabbed first. A salesperson can relate to an order as creator, owner, closer, or account rep — four valid join paths, four different meanings, and no syntactic signal telling them apart.

The same ambiguity shows up as non-conformed dimensions (customer_id, billing_customer_id, and account_id look joinable but don't share a business meaning) and as double-applied logic, where the agent re-derives net revenue from a column that already holds net revenue.

A semantic layer names relationships by role, exposes only approved paths, identifies conformed dimensions, and hides implementation columns from agent-facing metadata. The wrong route stops being reachable.
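One way to picture role-named paths is a lookup that maps approved roles to physical columns and exposes nothing else. This is a sketch under assumptions: `APPROVED_DATE_ROLES`, `revenue_by_month`, and the `orders` table are hypothetical names, not part of any real API.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE orders (
  order_id TEXT, created_date TEXT, shipped_date TEXT,
  payment_date TEXT, amount REAL);
INSERT INTO orders VALUES
  ('o1', '2026-01-30', '2026-02-02', '2026-02-05', 100),
  ('o2', '2026-02-10', '2026-02-12', '2026-03-01', 50);
""")

# Hypothetical role map: only these date relationships are reachable.
# payment_date exists physically but is not exposed as a role.
APPROVED_DATE_ROLES = {"ordered": "created_date", "shipped": "shipped_date"}

def revenue_by_month(con, date_role):
    """Sum revenue by month along an approved, role-named date path."""
    column = APPROVED_DATE_ROLES[date_role]  # KeyError for unapproved roles
    rows = con.execute(
        f"SELECT strftime('%Y-%m', {column}), SUM(amount) "
        "FROM orders GROUP BY 1 ORDER BY 1"
    ).fetchall()
    return dict(rows)

print(revenue_by_month(con, "ordered"))  # {'2026-01': 100.0, '2026-02': 50.0}
print(revenue_by_month(con, "shipped"))  # {'2026-02': 150.0}
```

The same orders produce different monthly totals depending on the role, so the caller has to name the role explicitly. Because the column name comes from a fixed map and never from caller input, the string interpolation stays safe.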

The point: a metric dictionary is not enough

It's tempting to think the answer is a list of blessed metric names that the agent picks from. It isn't. Almost none of these failures are about which metric — they're about grain, cardinality, aggregation behavior, temporal validity, population, and relationship meaning. Hand an agent a dictionary of metric names and you've solved the easiest 10% of the problem.

To actually protect agents, a semantic layer has to encode the meaning underneath the names:

  • entity grain and join cardinality
  • valid, role-named relationship paths
  • additive and semi-additive behavior
  • temporal validity and effective-date logic
  • canonical identifiers and distinct-count keys
  • units, currencies, scale, and conversion
  • governed filters and population definitions
  • query-validation rules that reject the unsafe combination

That is the line between a metric catalog and a query-safety system. The first labels the data. The second governs what an agent is allowed to do with it.
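As a rough sketch of what a query-validation rule might look like, the snippet below checks a requested SUM against declared additive behavior. The `Measure` class, the `Additivity` enum, and `check_sum` are hypothetical and cover only one of the rule types listed above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Sequence

class Additivity(Enum):
    ADDITIVE = "additive"            # safe to sum along any dimension
    SEMI_ADDITIVE = "semi_additive"  # summable across entities, not across time
    NON_ADDITIVE = "non_additive"    # ratios, rates, pre-computed averages

@dataclass(frozen=True)
class Measure:
    name: str
    additivity: Additivity
    time_column: Optional[str] = None  # the dimension a semi-additive measure must not collapse

def check_sum(measure: Measure, group_by: Sequence[str]) -> Optional[str]:
    """Return a rejection reason for SUM(measure) grouped by group_by, or None if allowed."""
    if measure.additivity is Additivity.NON_ADDITIVE:
        return f"{measure.name} is non-additive; derive it from its numerator and denominator"
    if (measure.additivity is Additivity.SEMI_ADDITIVE
            and measure.time_column not in group_by):
        return f"{measure.name} cannot be summed across {measure.time_column}; use a period-end value"
    return None

revenue = Measure("revenue", Additivity.ADDITIVE)
balance = Measure("balance", Additivity.SEMI_ADDITIVE, time_column="month_end")
margin_pct = Measure("margin_percent", Additivity.NON_ADDITIVE)

print(check_sum(revenue, ["region"]))                   # None
print(check_sum(balance, ["account_id"]))               # rejected: collapses month_end
print(check_sum(balance, ["account_id", "month_end"]))  # None
print(check_sum(margin_pct, ["store"]))                 # rejected: non-additive
```

A catalog would stop at the measure names. The check is what turns the declared metadata into something that can refuse a query before it runs.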

Why this matters now

You can give an agent a better model and it will write more elegant queries against the same missing context — and produce wrong answers faster. The constraint on agentic analytics was never the model's fluency in SQL. It's the semantic context the model has to work with. A universal semantic layer supplies that context as a governed, machine-readable contract: the same meaning, enforced the same way, whether the query comes from a person, a notebook, or an autonomous agent.

The agents are already in production writing queries against your warehouse. The only open question is whether the meaning they're operating on is governed or guessed.


Coginiti's universal semantic layer encodes grain, cardinality, temporal semantics, and population logic as governed metadata, so the same definitions protect every consumer of your data — human or agent.

Coginiti operationalizes business meaning across your entire data estate.