JULY 1, 2025

SYLVAIN UTARD

Semantic Layers in 2025

We tested four semantic layer approaches: Looker, Omni, Cube.dev, and dbt MetricFlow. Here's what we learned about each.

What is a semantic layer and why should you care?

Picture the raw tables in your data warehouse: cryptic column names (like user_id, plan_code, ts_event) and a gauntlet of SQL joins needed just to answer “What was weekly retention for Pro customers?”. Without a semantic layer, you often tackle this by writing complex SQL or creating dedicated summary tables (essentially baking business logic into the data's physical state). A semantic layer, by contrast, is a thin, declarative map that turns those columns into high-level business concepts:

  • Typed & formatted dimensions: for example, created_at is recognized as a timestamp; charts automatically pick appropriate time grains and calendar settings, and label axes accordingly.
  • Reusable metrics: for example, total_revenue is defined once (as a sum in USD), and every analysis or dashboard uses that single source of truth for revenue.
  • Explicit relationships: for example, orders → users is defined as a relationship, so queries can automatically generate the correct JOINs every time.

The payoff: you (and AI agents) can explore data at a conceptual level. No more repetitive SQL; you get cleaner lineage and faster, more trustworthy insights.

Example

Below is a toy semantic model definition (in LookML-style syntax) illustrating one time dimension and one measure:

dimension_group: signup {
  # Expands into signup_date, signup_week, and signup_month
  type: time
  timeframes: [date, week, month]
  sql: ${TABLE}.created_at ;;
}
measure: total_revenue {
  type: sum
  sql: ${TABLE}.amount ;;
  value_format_name: usd
}

With that defined, any tool that understands the semantic layer can instantly plot Total Revenue by Signup Month with axes correctly labeled, currency formatted, and drill-down options intact.
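
To make the mechanics concrete, here is a minimal Python sketch of what such a tool might do under the hood: take the declared dimension and measure, check that the requested time grain is allowed, and render the SQL. It is an illustration, not how Looker or any other product is implemented. The class names, the `compile_query` helper, the `analytics.orders` table, and the Postgres-style `DATE_TRUNC` call are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TimeDimension:
    name: str      # business-facing name, e.g. "signup"
    column: str    # physical column in the warehouse
    grains: tuple  # time grains the model allows


@dataclass(frozen=True)
class Measure:
    name: str
    column: str
    agg: str       # SQL aggregate function, e.g. "sum"


def compile_query(table, measure, dimension, grain):
    """Render a GROUP BY query for one measure at one declared time grain."""
    if grain not in dimension.grains:
        raise ValueError(f"grain {grain!r} is not declared for {dimension.name}")
    # Postgres-style truncation; other warehouses spell this differently.
    bucket = f"DATE_TRUNC('{grain}', {dimension.column})"
    return (
        f"SELECT {bucket} AS {dimension.name}_{grain}, "
        f"{measure.agg.upper()}({measure.column}) AS {measure.name}\n"
        f"FROM {table}\n"
        f"GROUP BY 1\n"
        f"ORDER BY 1"
    )


# Hypothetical model mirroring the toy definition above.
signup = TimeDimension("signup", "created_at", ("day", "week", "month"))
total_revenue = Measure("total_revenue", "amount", "sum")

sql = compile_query("analytics.orders", total_revenue, signup, "month")
print(sql)
```

The point of the sketch is that the person (or agent) asking only names `total_revenue` and `signup` at a `month` grain; the column names, the aggregate, and the truncation logic all come from the model.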

Why AI-Native Agents Need a Semantic Layer

When we talk about AI agents in data, we're really talking about small pieces of software that read, reason, and act on our behalf. They can spot patterns at lightspeed, but only if they can trust the meaning of every table, metric, and event they touch. That trust comes from a shared semantic layer sitting between raw storage and every agent-powered workflow.

| What the semantic layer gives you | Why agents need it |
| --- | --- |
| One canonical vocabulary. Every metric (LTV), entity (workspace vs. org), and relationship lives in a single, version-controlled catalog. | Prevents agents from inventing slightly different definitions that break cross-team comparisons. |
| Reusable business logic. Transformations such as “net revenue” or cohort rules are written once and inherited everywhere. | Lets agents chain reasoning steps without re-implementing SQL or asking humans to reconcile logic each time. |
| Context & lineage. Each column or measure carries rich metadata: owner, source system, calculation history, privacy flags. | Gives agents the backdrop they need to explain why a recommendation is safe, compliant, and interpretable. |

Without a semantic layer, AI agents are clever interns rifling through a chaotic file cabinet; with one, they’re seasoned analysts working from a living playbook. If we want insights to find us before we ask, we have to give the agents a language, and that language is the semantic layer.
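
As a rough illustration of the "canonical vocabulary" idea, the Python sketch below shows an agent-side guard that checks a requested query against a catalog before anything runs. The catalog contents, the metadata fields, and the `validate_request` helper are hypothetical and exist only for this example; a real semantic layer would expose its catalog through its own API.

```python
# Hypothetical catalog: in practice this would be loaded from the semantic layer.
CATALOG = {
    "metrics": {
        "ltv": {"owner": "finance", "source": "billing", "contains_pii": False},
        "net_revenue": {"owner": "finance", "source": "billing", "contains_pii": False},
    },
    "dimensions": {"plan", "signup_month", "workspace"},
}


def validate_request(request, catalog=CATALOG):
    """Return a list of problems; an empty list means every name is governed."""
    problems = []
    for metric in request.get("metrics", []):
        if metric not in catalog["metrics"]:
            problems.append(f"unknown metric: {metric}")
    for dimension in request.get("dimensions", []):
        if dimension not in catalog["dimensions"]:
            problems.append(f"unknown dimension: {dimension}")
    return problems


# An agent that invents its own name for LTV is stopped before it writes SQL.
print(validate_request({"metrics": ["lifetime_value"], "dimensions": ["plan"]}))
print(validate_request({"metrics": ["ltv"], "dimensions": ["plan"]}))
```

The metadata attached to each metric (owner, source, privacy flag) is what would let an agent explain where a number came from, rather than only returning it.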

Four approaches, hands-on

Below we summarize our hands-on experience with four different semantic layer solutions. We spun up each product on the same database to keep the comparison consistent. Here's how they fared:

Looker (Google Cloud)

Looker is a full-stack BI platform built around LookML, its Git-versioned modeling language for defining data models.

view: order_items {
  sql_table_name: analytics.order_items ;;

  dimension: order_id {
    primary_key: yes
    sql: ${TABLE}.order_id ;;
  }

  measure: total_revenue {
    type: sum
    sql: ${TABLE}.price ;;
    value_format_name: usd
  }
}

Why we liked it

  • Warehouse-native execution; no additional server needed (metrics are computed in your existing data warehouse).
  • One-stop shop: modeling, visualization, scheduling, and embedding are all available in one platform.
  • AI-assisted modeling & querying: Gemini-powered LookML and visualization assistants help build models and enable natural language queries (NLQ).

Where it strained

  • Proprietary LookML.
  • High cost for large teams (per-seat pricing can climb quickly as adoption spreads).
  • Limited reuse outside Looker.

Omni

Omni is a newer analytics workspace that blends a notebook-style SQL experience with a governed semantic layer.

fields:
  - name: created_date
    sql: ${TABLE}.created_at
    data_type: date
  - name: total_revenue
    sql: ${TABLE}.price
    aggregate_type: sum

Why we liked it

  • Rapid iteration: blend of spreadsheet & SQL in one UI, then round-trip definitions to dbt.
  • Tight dbt integration.
  • Built-in AI assistants that respect the defined layer.

Where it strained

  • Pricing not publicly disclosed. Expect an opaque SaaS subscription, on top of your warehouse usage.
  • Mixed learning curve. Some users appreciate the familiar Excel-style interface; others find the blend of workbooks and data models confusing at first.

Cube.dev

Cube.dev is an open-source, headless semantic layer that exposes metrics via REST, GraphQL, and SQL APIs, with optional pre-aggregations for speed.

cube(`Orders`, {
  sql: `SELECT * FROM public.orders`,

  measures: {
    count: { type: `count` },
    totalRevenue: { sql: `amount`, type: `sum` },
  },

  dimensions: {
    id: { sql: `id`, type: `number`, primaryKey: true },
    createdAt: { sql: `created_at`, type: `time` },
  },

  preAggregations: {
    dailyRollup: {
      type: `rollup`,
      measures: [Orders.totalRevenue, Orders.count],
      timeDimension: Orders.createdAt,
      granularity: `day`,
    },
  },
});
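
To show what "headless" means in practice, here is a small Python sketch that builds a request for Cube's REST `load` endpoint against the `Orders` cube above. It only constructs the request and does not send it. The base URL (a local development server on port 4000), the placeholder token, and the `build_load_request` helper are assumptions for this example; check your own deployment for the actual host and authentication setup.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request


def build_load_request(base_url, token, query):
    """Build (but do not send) a GET request for Cube's /cubejs-api/v1/load endpoint."""
    params = urlencode({"query": json.dumps(query)})
    return Request(
        f"{base_url}/cubejs-api/v1/load?{params}",
        headers={"Authorization": token},
    )


# Daily revenue and order count for the last 7 days, expressed in model terms only.
query = {
    "measures": ["Orders.totalRevenue", "Orders.count"],
    "timeDimensions": [
        {
            "dimension": "Orders.createdAt",
            "granularity": "day",
            "dateRange": "last 7 days",
        }
    ],
}

# Placeholder values: replace with your deployment URL and a real API token.
request = build_load_request("http://localhost:4000", "example-token", query)
print(request.full_url)
```

Any front-end that can issue this request (a dashboard, a notebook, an agent) gets the same numbers, because the SQL is generated from the one shared model rather than rewritten per client.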

Why we liked it

  • Truly headless: a single semantic model can serve many front-ends.
  • Fine-grained performance tuning (caching, pre-aggregations, etc.).
  • “Semantic-layer-sync” for existing BI tools.
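
The performance benefit of a rollup pre-aggregation can be sketched in a few lines of Python. This is a conceptual illustration of the idea, not Cube's implementation: additive measures such as a sum and a count are aggregated once at day granularity, and coarser questions (here, monthly totals) are then answered from that small rollup instead of the raw rows. The sample data and helper names are made up for the example.

```python
from collections import defaultdict
from datetime import date

# Hypothetical raw rows standing in for public.orders.
orders = [
    {"created_at": date(2025, 6, 30), "amount": 40.0},
    {"created_at": date(2025, 7, 1), "amount": 25.0},
    {"created_at": date(2025, 7, 1), "amount": 15.0},
    {"created_at": date(2025, 7, 2), "amount": 60.0},
]


def build_daily_rollup(rows):
    """Aggregate raw rows once into one entry per day."""
    rollup = defaultdict(lambda: {"total_revenue": 0.0, "count": 0})
    for row in rows:
        bucket = rollup[row["created_at"]]
        bucket["total_revenue"] += row["amount"]
        bucket["count"] += 1
    return dict(rollup)


def monthly_from_rollup(rollup):
    """Answer a monthly question from the daily rollup, never touching raw rows."""
    monthly = defaultdict(lambda: {"total_revenue": 0.0, "count": 0})
    for day, values in rollup.items():
        bucket = monthly[(day.year, day.month)]
        bucket["total_revenue"] += values["total_revenue"]
        bucket["count"] += values["count"]
    return dict(monthly)


daily = build_daily_rollup(orders)
monthly = monthly_from_rollup(daily)
print(monthly[(2025, 7)])  # {'total_revenue': 100.0, 'count': 3}
```

This only works because sums and counts are additive. Measures such as distinct counts or medians cannot be re-aggregated this way, which is part of why designing effective pre-aggregations takes some thought.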

Where it strained

  • Developer-centric: models are written as code, so non-engineers face a learning curve.
  • The hosted offering uses usage-based pricing built on Cube's own CCU (Cube Consumption Unit) model.
  • Designing effective pre-aggregations takes additional expertise.

dbt + MetricFlow

dbt, the transformation framework many teams already use, now includes MetricFlow, which lets you declare semantic models in YAML right alongside your SQL models.

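semantic_models:
  - name: order_items
    model: ref('order_items')  # assumes a dbt model with this name
    defaults:
      agg_time_dimension: created_at
    entities:
      - name: order_id
        type: primary
    dimensions:
      - name: created_at
        type: time
        type_params:
          time_granularity: day
    measures:
      - name: total_revenue
        expr: price
        agg: sum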

Why we liked it

  • Lives in Git: metric definitions are version-controlled (with code reviews, CI, and environments) just like the rest of your dbt project.
  • Warehouse-native execution; no additional server needed (metrics are computed in your existing data warehouse).
  • Open-source version suits self-hosting.

Where it strained

  • Complex compiled SQL: large, multi-metric queries can expand into thousands of lines of SQL, taxing warehouses and making debugging tricky.
  • No built-in visualization.

How this shapes our roadmap

At Altertable we believe the future isn't another dashboard: it's an always-on insight engine. Semantic layers are a key ingredient, but only if they're:

  • Open & headless so insights travel everywhere users work.
  • AI-ready with rich types, units, and relations that agents can reason about.
  • Infra-lite because teams shouldn't babysit yet another server.

Each tool above nails part of that trifecta, but none checks all three boxes yet, which is why we're building, testing, and learning out loud.

Sylvain Utard, Co-Founder & CEO at Altertable

Seasoned leader in B2B SaaS and B2C. Scaled 100+ teams at Algolia (1st hire) & Sorare. Passionate about data, performance and productivity.
