
Upstreaming with AI

How we contributed 17 upstream PRs in 90 days—where AI accelerated our workflow, what we learned, and practical tips for open source success.


Over the last ~90 days we opened 17 upstream PRs across DuckDB, Trino, ClickHouse, DataFusion, dbt, OpenTelemetry, and MCP tooling—in Rust, Ruby, Python, C++, SQL, and Helm/Kubernetes. AI helped with orientation, spec inference, and scaffolding; humans made design calls and ensured correctness. Links and diffs below; practical notes at the end: tests first, small diffs, disclose AI.

By “upstreaming,” we mean contributing code, tests, and docs back to the original projects we depend on so improvements land at the source and flow downstream to everyone.

Why upstreaming is hard

In practice, here are the frictions you hit before the first line of code lands upstream:

  • Architecture and invariants. Mature projects encode years of decisions: performance budgets, async/threading models, memory ownership, error semantics, and backward-compat promises. Those contracts aren’t in one file.
  • Codebase shape and conventions. Module boundaries, naming, feature flags, build targets, and release workflows vary widely—especially across monorepos and plugin ecosystems.
  • Language and tooling drift. You’ll often touch stacks you don’t use daily (C++, Python, Helm). Editions change, linters disagree, and safety fences differ.
  • Local repro and datasets. Getting a faithful environment, fixtures, or golden files is nontrivial; cross-platform quirks hide bugs.
  • Tests and CI. Matrix builds, flaky integrations, and golden snapshots fail in surprising ways. Knowing what to update vs. what to investigate is a skill.
  • Review process. PR/MR templates, labels, changelogs, and release notes all matter—and maintainers’ time is scarce.
  • Docs gap. READMEs drift; the best knowledge often lives in issues, RFCs, and commit history.

This isn’t a gripe; it’s what gives open source its edge. Those hard-won contracts, conventions, environments, tests, and review processes exist because you’re working with real constraints, serving real users, and making deliberate tradeoffs.

Where AI helped (and where it didn't)

Here’s how we use AI to reduce friction, while keeping engineers in the driver’s seat:

  • Error to entry point. From an error code, ask Cursor where to look first.
  • Stack trace triage. From a GDB stack trace, propose likely root cause or where to instrument.
  • Behavior to diff. From an example of new behavior, identify where to apply the patch with minimal blast radius.
  • Plugin by analogy. From another plugin, outline how to create a new one to solve X or Y.

Humans still own the approach, correctness, and tradeoffs. AI compresses time-to-context and drafting time, but it doesn’t make design decisions for us.

Where it didn't

Here are a few places where AI fell short or required careful human oversight in our workflow:

  • Test deletion over fixes. In an early experiment rewriting lkml from Python to Ruby, AI agents "made CI green" by removing failing specs instead of repairing parser/serializer semantics. The original library: joshtemple/lkml. Our Ruby port: altertable-ai/lkml.
  • Hallucinations without grounding. We saw frequent invented APIs/flags in unfamiliar stacks. Plugging the exact docs/spec page into context (so the AI can use RAG) materially reduced hallucinations and improved first-try correctness.
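
To make that grounding concrete, here’s a minimal sketch in Python of the same idea outside an editor: fetch the exact spec page and put it verbatim in the prompt. The URL, model name, and task below are placeholders, and this isn’t our actual tooling; we do the equivalent inside Cursor by attaching the page to the chat context.

```python
# Minimal sketch: ground the model in the exact spec page before asking for a patch plan.
# Placeholder URL, model, and task; in practice we attach the page inside Cursor.
import requests
from openai import OpenAI

SPEC_URL = "https://example.com/project/docs/extension-api"  # the exact page, not a summary

spec_text = requests.get(SPEC_URL, timeout=30).text

client = OpenAI()  # reads OPENAI_API_KEY from the environment
response = client.chat.completions.create(
    model="gpt-4o",  # placeholder model name
    messages=[
        {
            "role": "system",
            "content": "Answer only from the provided spec. If something is not in it, say so instead of guessing.",
        },
        {
            "role": "user",
            "content": f"Spec page:\n{spec_text}\n\nOutline a minimal patch that adds the new flag, citing the spec sections you rely on.",
        },
    ],
)
print(response.choices[0].message.content)
```

The library doesn’t matter; what matters is that the model answers against the actual spec rather than its recollection of it.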

What we shipped

Here's a breakdown of our contributions across different projects and technologies, organized by category.

DuckDB and extensions

Trino clients

Modeling and parsing

Infrastructure

Observability

AI

What to keep in mind

  • Quality matters most. The feedback that counts is “tests pass, edge cases covered.” AI can compress time-to-context, but correctness and clarity remain human work.
  • Prefer small diffs. Keep the blast radius bounded; land surgical changes with good tests and context.
  • Cross-language lift. Moving between Ruby, Rust, C++, and Helm is feasible when AI translates conventions and you own the decisions. It saves time on syntax and tooling, leaving room for real design choices.
  • Be transparent. Disclosure helps maintainers calibrate reviews and builds trust around provenance. In August, Mitchell Hashimoto introduced a simple rule: if AI helped you make a PR, say so. We’ve adopted that in practice: a short “AI assistance” note in PRs (what was generated, what was reviewed by a human, links to prompts if useful).
  • Upstream early. Small, surgical fixes compound across every downstream that pulls them. If we hit a papercut or spot an opportunity to unstick others, we upstream the fix. It’s faster long-term and makes the ecosystem better.

To everyone who reviewed, merged, or debated with us: THANK YOU. The stack is better because you care.


Sylvain Utard

Co-Founder & CEO

Seasoned leader in B2B SaaS and B2C. Scaled 100+ teams at Algolia (1st hire) & Sorare. Passionate about data, performance and productivity.
