Introduction to Altertable Lakehouse
At the heart of Altertable is a set of Workers that run your queries inside preconfigured SQL sessions. Workers run on Altertable-managed infrastructure by default, or inside your own cloud.
                         ╔════════════════ Altertable ════════════════════╗
╔════════════╗           ║                                                ║░
║            ║░          ║  ┏━━━━━━━━━━━━━━━━━━┓    ┏━━━━━━━━━━━━━━━┓     ║░
║   Client   ║░───HTTP──▶║  ┃     Workers      ┃    ┃  Distributed  ┃     ║░
║            ║░          ║  ┃  ┌──┐ ┌──┐ ┌──┐  ┃    ┃ Storage (R2)  ┃     ║░
║            ║░──Arrow──▶║  ┃  │W1│ │W2│ │WN│  ┃───▶┃  ┌─────────┐  ┃     ║░
║            ║░  Flight  ║  ┃  └──┘ └──┘ └──┘  ┃    ┃  │ Parquet │  ┃     ║░
║            ║░          ║  ┃                  ┃    ┃  │ File 1  │  ┃     ║░
║            ║░───PG───▶ ║  ┗━━━━━━━━┬━━━━━━━━━┛    ┃  └─────────┘  ┃     ║░
╚════════════╝░ Adapter  ║           │         ┃    ┃  ┌─────────┐  ┃     ║░
 ░░░░░░░░░░░░░░          ║   ┏━━━━━━━┻━━━━━━┓  ┃    ┃  │ Parquet │  ┃     ║░
                         ║   ┃  Local SSD   ┃  ┃    ┃  │ File... │  ┃     ║░
                         ║   ┃    Cache     ┃  ┃    ┃  └─────────┘  ┃     ║░
                         ║   ┗━━━━━━━━━━━━━━┛  ┃    ┃  ┌─────────┐  ┃     ║░
                         ║                     ┃    ┃  │ Parquet │  ┃     ║░
                         ║                     ┃    ┃  │ File N  │  ┃     ║░
                         ║                     ┃    ┃  └─────────┘  ┃     ║░
                         ║                     ┃    ┗━━━━━━━━━━━━━━━┛     ║░
                         ║                     ┃                          ║░
                         ╚════════════════════════════════════════════════╝░
                          ░░░░░░░░░░░░░░░░░░░░░┃░░░░░░░░░░░░░░░░░░░░░░░░░░░░
                                               ┃
                                               ┃    ┏━━━━━━━━━━━━━━━┓
                                               ┃    ┃               ┃░
                                               ┃    ┃   External    ┃░
                                               ┗───▶┃ DB & Warehouse┃░
                                                    ┃               ┃░
                                                    ┗━━━━━━━━━━━━━━━┛░
                                                     ░░░░░░░░░░░░░░░░░
Learn the fundamentals: Read about Insights and Ask AI to understand how the lakehouse fits into the broader data runtime.
The lakehouse encompasses all data catalogs in your environment. It gives storage, query, and business context a shared foundation instead of splitting each workload into a separate system. The main catalog categories are:
- Altertable catalogs: managed data stores you create in Altertable and back with a selected bucket.
- External catalogs: your existing databases and warehouses connected to Altertable for federated queries.
If you enable Product Analytics for an environment, Altertable also provisions a read-only managed catalog named product_analytics. Altertable writes product events, identities, and derived web analytics tables into that catalog automatically, so behavioral data can be joined with revenue, operations, and other business context.
If you enable OpenTelemetry, Altertable provisions a managed opentelemetry catalog for OTLP logs and traces. Those records become normal SQL tables, so operational signals can be joined with product events, customer data, and application state.
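The cross-catalog joins described above can be pictured with a small local sketch. It uses SQLite attached databases as a stand-in for catalogs, so it runs without an Altertable connection and is not Altertable's API. The table names `events` and `orders`, their columns, and the `catalog.table` qualification are assumptions made for illustration only.

```python
import sqlite3

# Local stand-in only: each attached SQLite database plays the role of a catalog.
# Table names, column names, and sample rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS product_analytics")
conn.execute("ATTACH DATABASE ':memory:' AS acme_pg")

conn.execute("CREATE TABLE product_analytics.events (user_id INTEGER, event TEXT)")
conn.execute("CREATE TABLE acme_pg.orders (user_id INTEGER, amount REAL)")

conn.executemany(
    "INSERT INTO product_analytics.events VALUES (?, ?)",
    [(1, "signup"), (1, "checkout"), (2, "signup")],
)
conn.executemany(
    "INSERT INTO acme_pg.orders VALUES (?, ?)",
    [(1, 40.0), (1, 10.0)],
)


def revenue_per_user(connection):
    """Join behavioral events from one catalog with revenue from another."""
    query = """
        SELECT e.user_id,
               COUNT(DISTINCT e.event) AS distinct_events,
               COALESCE(o.revenue, 0)  AS revenue
        FROM product_analytics.events AS e
        LEFT JOIN (
            -- Pre-aggregate orders so the join does not multiply event rows.
            SELECT user_id, SUM(amount) AS revenue
            FROM acme_pg.orders
            GROUP BY user_id
        ) AS o ON o.user_id = e.user_id
        GROUP BY e.user_id, o.revenue
        ORDER BY e.user_id
    """
    return connection.execute(query).fetchall()


rows = revenue_per_user(conn)
```

Orders are aggregated before the join so that a user with several events does not have their revenue counted once per event. Users with events but no orders still appear, with a revenue of zero.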
Buckets provide the storage layer for managed catalogs and bucket-backed external catalogs such as Bucket Tables and Iceberg Tables. Every environment includes a built-in bucket, and you can connect additional buckets from Cloudflare R2, Amazon S3, Google Cloud Storage, or S3-compatible providers when you need to control where files live.
You can create as many catalogs as you need. For instance:
product_analytics (built-in product analytics catalog, enabled per environment)
| Table | Description |
| | Raw events (everything /track receives) |
| | Raw identities (everything /identify receives) |
| | Identity-resolved events for behavioral analysis |
| | Alias and anonymous-resolved identities |
| | Aggregated session data |
| | Page-level analytics |
opentelemetry (built-in observability catalog, enabled per environment)
| Table | Description |
| | Trace spans received through OTLP |
| | Log records received through OTLP |
acme_lakehouse (your Altertable catalog)
| Table | Description |
| | Example of raw table copied by your ETL |
| | Example of raw table copied by your ETL |
| | Example of raw table copied by your ETL |
| | Example of raw table copied by your ETL |
| | Example of analytics table generated by dbt |
| | Example of analytics table generated by dbt |
| | Example of analytics table generated by dbt |
| | Example of analytics table generated by dbt |
acme_pg (external catalog, Postgres): production database read replica
| Table | Description |
| | User accounts and profiles |
| | Customer orders |
| | Product catalog |
| | Customer subscriptions |
reporting_duckdb (external catalog, DuckDB file in a connected bucket)
| Table | Description |
| | Revenue model exported by another process |
| | Feature adoption snapshot |