Over the last ~90 days we opened 17 upstream PRs across DuckDB, Trino, ClickHouse, DataFusion, dbt, OpenTelemetry, and MCP tooling—in Rust, Ruby, Python, C++, SQL, and Helm/Kubernetes. AI helped with orientation, spec inference, and scaffolding; humans made design calls and ensured correctness. Links and diffs below; practical notes at the end: tests first, small diffs, disclose AI.
By “upstreaming,” we mean contributing code, tests, and docs back to the original projects we depend on so improvements land at the source and flow downstream to everyone.
Why upstreaming is hard
In practice, here are the frictions you hit before the first line of code lands upstream:
- Architecture and invariants. Mature projects encode years of decisions: performance budgets, async/threading models, memory ownership, error semantics, and backward-compat promises. Those contracts aren’t in one file.
- Codebase shape and conventions. Module boundaries, naming, feature flags, build targets, and release workflows vary widely—especially across monorepos and plugin ecosystems.
- Language and tooling drift. You’ll often touch stacks you don’t use daily (C++, Python, Helm). Editions change, linters disagree, and safety fences differ.
- Local repro and datasets. Getting a faithful environment, fixtures, or golden files is nontrivial; cross-platform quirks hide bugs.
- Tests and CI. Matrix builds, flaky integrations, and golden snapshots fail in surprising ways. Knowing what to update vs. what to investigate is a skill.
- Review process. PR/MR templates, labels, changelogs, and release notes all matter—and maintainers’ time is scarce.
- Docs gap. READMEs drift; the best knowledge often lives in issues, RFCs, and commit history.
This isn’t a gripe—it’s the reality that gives open source its edge. Each of those frictions exists because you’re working with real constraints, serving real users, and making deliberate tradeoffs.
Where AI helped (and where it didn't)
Here’s how we use AI to reduce friction, while keeping engineers in the driver’s seat:
- Error to entry point. From an error code, ask Cursor where to look first.
- Stack trace triage. From a GDB stack trace, propose likely root cause or where to instrument.
- Behavior to diff. From an example of new behavior, identify where to apply the patch with minimal blast radius.
- Plugin by analogy. From another plugin, outline how to create a new one to solve X or Y.
Humans still own approach, correctness, and tradeoffs. AI compresses time-to-context and drafting time; it doesn’t make design decisions for us.
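To make “error to entry point” concrete: the low-tech equivalent of that prompt is grepping the repo for the error string before asking anything fancier. A minimal Python sketch (the `*.py` glob and the function name are ours, not any tool’s API):

```python
import pathlib

def find_entry_points(repo: str, error_text: str) -> list[str]:
    # Brute-force "error to entry point": the file that raises the error
    # string is usually the right place to start reading.
    hits = []
    for path in pathlib.Path(repo).rglob("*.py"):
        try:
            if error_text in path.read_text(errors="ignore"):
                hits.append(str(path))
        except OSError:
            pass  # unreadable files are irrelevant to triage
    return sorted(hits)
```

Feeding the resulting file list back into the AI’s context is what turns a vague “where do I look?” into a grounded question.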
Where it didn't
Here are a few places where AI fell short or required careful human oversight in our workflow:
- Test deletion over fixes. In an early experiment rewriting lkml from Python to Ruby, AI agents "made CI green" by removing failing specs instead of repairing parser/serializer semantics. The original library: joshtemple/lkml. Our Ruby experiment: altertable-ai/lkml.
- Hallucinations without grounding. We saw frequent invented APIs/flags in unfamiliar stacks. Plugging the exact docs/spec page into context (so the AI can use RAG) materially reduced hallucinations and improved first-try correctness.
What we shipped
Here's a breakdown of our contributions across different projects and technologies, organized by category.
DuckDB and extensions
- HTTPFS: S3 SSE-C support, with sensitive headers redacted in logs.
- INET: ANSI-friendly function aliases for IP range operators.
- NetQuack: Implemented `URLPathHierarchy(url)` for URL bucketing.
- Read-cache FS: Fixed pointer lifecycle bug that could lead to use-after-free.
- DuckLake (discussion): Schema-qualified support and API ergonomics (issue #334) and related DuckDB core discussion (issue #18439).
- Read-cache FS (discussion): Design and lifecycle details captured in issue #219.
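The NetQuack helper is easiest to see by example. Here is a Python sketch of the bucketing idea, mirroring ClickHouse-style `URLPathHierarchy` semantics as we understand them (the extension’s exact output format may differ):

```python
from urllib.parse import urlparse

def url_path_hierarchy(url: str) -> list[str]:
    # Split the URL path into cumulative prefixes, one per segment,
    # so URLs can be grouped ("bucketed") at any depth of their path.
    segments = [s for s in urlparse(url).path.split("/") if s]
    levels, prefix = [], ""
    for i, seg in enumerate(segments):
        prefix += "/" + seg
        # Intermediate levels keep a trailing slash; the last does not.
        levels.append(prefix + "/" if i < len(segments) - 1 else prefix)
    return levels

# url_path_hierarchy("https://example.com/browse/CONV-6788")
# → ["/browse/", "/browse/CONV-6788"]
```

Grouping by any element of that list gives per-section rollups without string-mangling in every query.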
Trino clients
- Ruby: Switched to a persistent HTTP adapter for better throughput.
- Rust: Added query cancellation so work stops when users do.
Modeling and parsing
- dbt scaffold: Fixed broken dev dependency path after repo split.
- DataFusion parser: Added `CREATE SCHEMA ... WITH` support.
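For context, this is the statement shape the DataFusion change unlocks. A toy Python recognizer of that grammar (the `location` option key is illustrative, not DataFusion’s API):

```python
import re

# Toy grammar:  CREATE SCHEMA <name> WITH (<key> = '<value>', ...)
STMT = re.compile(
    r"CREATE SCHEMA\s+(?P<name>\w+)\s+WITH\s*\((?P<opts>.*)\)\s*;?\s*$",
    re.IGNORECASE,
)
OPT = re.compile(r"'?(?P<key>\w+)'?\s*=\s*'(?P<val>[^']*)'")

def parse_create_schema(sql: str):
    # Returns (schema_name, options_dict), or None if the shape doesn't match.
    m = STMT.match(sql.strip())
    if not m:
        return None
    opts = {o.group("key"): o.group("val") for o in OPT.finditer(m.group("opts"))}
    return m.group("name"), opts
```

The real parser change is of course in DataFusion’s Rust SQL frontend; the sketch only shows which statements now parse instead of erroring.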
Infrastructure
- ClickHouse Helm chart:
  - Exposed `persistence.volumeName` for smoother disk migrations.
  - Added `CHOWN` capability for init container volume permissions.
- Hetzner Kubernetes/Terraform: Infer `cluster_dns_ipv4` from `service_ipv4_cidr` to keep the CoreDNS IP valid with custom service CIDRs.
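The Hetzner fix encodes a common Kubernetes convention: kubeadm places the cluster DNS Service IP at the tenth address of the service CIDR (e.g. `10.96.0.0/12` → `10.96.0.10`). A Python sketch of the inference, using only the standard library:

```python
import ipaddress

def cluster_dns_ip(service_cidr: str) -> str:
    # kubeadm convention: the cluster DNS (CoreDNS) Service IP is the
    # 10th address of the service CIDR, so derive it instead of
    # hardcoding a value that breaks with custom CIDRs.
    net = ipaddress.ip_network(service_cidr)
    return str(net.network_address + 10)

# cluster_dns_ip("10.96.0.0/12")  → "10.96.0.10"
```

Deriving the value means a user who overrides the service CIDR never has to remember the second knob.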
Observability
- OpenTelemetry Ruby / GraphQL: Handled `GraphQL::Query::Partial` so spans survive with streaming directives like `@stream`.
AI
- fast-mcp:
  - Rewrote Dry::Schema → JSON Schema compiler.
  - Added `on_error_result` hook for error reporting.
- ruby_llm:
  - Added `Responses` client path for non-chat models and documented required tools.
  - Made tool-calling robust for tools with no parameters.
  - Discussed temperature semantics; closed in favor of upstream fix.
- MCP Inspector: Added enum support for constrained tool choices.
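For flavor, the schema work above boils down to shapes like this toy compiler, sketched in Python with a hypothetical flat input format (the real fast-mcp code walks Dry::Schema’s AST in Ruby). A list of allowed values becomes a JSON Schema `enum` (the constrained tool choices MCP Inspector now renders), and the empty case mirrors the tools-with-no-parameters fix:

```python
def compile_tool_schema(fields: dict) -> dict:
    # Toy input, e.g. {"city": ("string", True), "unit": (["c", "f"], False)}:
    # each field maps to (type_or_choices, required?).
    props, required = {}, []
    for name, (spec, is_required) in fields.items():
        # A list of values compiles to an enum; a string to a plain type.
        props[name] = {"enum": spec} if isinstance(spec, list) else {"type": spec}
        if is_required:
            required.append(name)
    schema = {"type": "object", "properties": props}
    if required:
        schema["required"] = required
    # Tools with no parameters still get a valid, empty object schema.
    return schema
```

The “no parameters” case matters more than it looks: several clients reject tools whose schema is missing rather than empty.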
What to keep in mind
- Quality matters most. The feedback that counts is “tests pass, edge cases covered.” AI can compress time-to-context, but correctness and clarity remain human work.
- Prefer small diffs. Keep the blast radius bounded; land surgical changes with good tests and context.
- Cross-language lift. Moving between Ruby, Rust, C++, and Helm is feasible when AI translates conventions and you own the decisions. It saves time on syntax and tooling, leaving room for real design choices.
- Be transparent. Disclosure helps maintainers calibrate reviews and builds trust around provenance. In August, Mitchell Hashimoto introduced a simple rule: if AI helped you make a PR, say so. We adopt that in practice: a short “AI assistance” note in PRs (what was generated, what was reviewed by a human, links to prompts if useful).
- Upstream early. Small, surgical fixes compound across every downstream that pulls them. If we hit a papercut or spot an opportunity to unstick others, we upstream the fix. It’s faster long-term and makes the ecosystem better.
To everyone who reviewed, merged, or debated with us: THANK YOU. The stack is better because you care.