Discussion on Hacker News.
In our first post, we argued that the software we build is brittle because it’s fragmented: assembled from components with incompatible models, wired together through complex, opaque, unverified interfaces. We also argued this isn’t inevitable — a high-level model general enough to span the domains of real applications could let us build coherent systems instead. We promised to say more about how. Here’s the shape of the answer.
Cambra is a new category of system, one that fuses the database and the programming language: it collapses the stack into a single programming model that works at the level of business logic and data rather than networks and operating systems. You implement a data-intensive application (serving, transactions, stream processing, analytics) as one program. Not a set of services delicately wired together, but a single artifact that the compiler can analyze, the runtime can execute, and a programmer, agent or human, can reason about at a high level of abstraction.
This post presents our vision for Cambra — one we’re in the process of implementing. Our prototype has helped us crystallize that vision to the point where we are ready to share it in public, but it is not yet realized. Read what follows as what we hope Cambra will become.
SWEs with Superpowers
Coherence gives tooling tremendous leverage over a system. We promised that verification would become tractable across the whole application, that the platform could take over optimizations like pushing a filter to the data or building an index, and that observability could be automated to an unprecedented degree. Cambra is designed to deliver on all three. Even more important than those features is the point of highest leverage: correctness.
Correctness has two halves. The first is verification — confirming that an implementation meets a specification. When the application is one program, the compiler can check the whole of it at once, and the contract and type mismatches we catalogued in our first post become ordinary compile errors. We build on this idea by letting types carry logical constraints alongside the structure of data, like “account balance never goes negative”. In Cambra, assertions are something the compiler proves rather than something you hope a test suite covers sufficiently.
The second is validation — determining that the specification describes something you actually want. Verification can’t help here: a program can be provably correct against a spec that is subtly wrong. The only way to really be sure is to watch it run against a realistic workload. Today that means standing up staging environments, dark launches, shadow traffic, and incremental rollouts: bespoke infrastructure that’s expensive to build and quietly rots as the application changes out of band. This is where Cambra does something genuinely new. Cambra lets you branch your live application (not just the code, not just the data — the entire system), point real or simulated traffic at the branch, and watch how it behaves, before it ships. Cambra won’t tell you whether the spec is right. That judgment is yours. It just makes finding out cheap, safe, and fast.
Together, these two loops compound into something bigger: the ultimate platform for agentic software engineering. We’ve argued that AI doesn’t remove the need for coherence; this is the other side of that claim. An agent is only as effective as the feedback it gets, and only over the parts of a system it can see. A fragmented system starves it on both counts. A coherent system is legible end to end. Verification at build time and realistic validation before release give an agent exactly the feedback it needs to close the gap between what it wrote and what you asked for. To top it off, this feedback allows the agent to converge much faster, consuming fewer tokens in the process.
In practice, this feels like having superpowers: you describe a feature or an optimization to an agent in conversation, then turn it loose. It implements the change, lets the compiler verify it, and branches the running system to validate the new version under a full-scale simulation of your real workload. What lands in your inbox is a pull request with a report attached: not just a diff, but evidence of how the change behaves under load, with real data, before a single real user touches it. Your ideas manifest reliably, freeing you to be creative, instead of forcing you to trudge through problems that have already been solved a million times before.
The Architecture in a Nutshell
Cambra presents as a programming language and a distributed, durable runtime. The surface language has a familiar, Python-like syntax. It is statically typed, with a type system designed to be both powerful and ergonomic. It is concurrent by default and non-strict: expressions describe results rather than dictating an order of execution, which leaves the system free to evaluate, reorder, and parallelize work as the data demands. Generators are a primary idiom, keeping computation streaming and composable. Mutability is disciplined: state is either exclusive to a single thread of logic or shared, with concurrent access managed through transactions. Finally, program inputs and outputs are statically determined, which is what makes program branching possible.
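To make the "streaming and composable" idiom concrete, here is a minimal sketch in plain Python. It is not Cambra syntax, which this post does not show, and the names `readings`, `evens`, and `squared` are invented for illustration. Python generators are lazy but still sequential, so this shows only the pull-based composition, not the concurrency or reordering described above.

```python
from itertools import islice


def readings():
    """An unbounded source: yields 0, 1, 2, ... one value at a time."""
    n = 0
    while True:
        yield n
        n += 1


def evens(xs):
    """Keep only even values; nothing is computed until a consumer asks."""
    return (x for x in xs if x % 2 == 0)


def squared(xs):
    """Square each value as it streams through."""
    return (x * x for x in xs)


# The stages compose like ordinary functions. Only as many source values
# are produced as the final consumer needs, so the infinite source is safe.
first_three = list(islice(squared(evens(readings())), 3))
print(first_three)  # [0, 4, 16]
```

Each stage describes a result in terms of its input rather than a loop over a materialized list, which is the property that lets stages be recombined freely.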
A Cambra program lowers to a simple, high-level core: a unified intermediate representation (IR) that spans domains. “High-level” is important — this unification is not in the sense that every program eventually lowers to LLVM IR or machine code. Our IR works at the level of data, logic, and operational concerns like latency, durability, and resource isolation. This unified core is the part current architectures don’t have. Current tools can’t reason across a whole system because there is no “across”: each component has its own model, so analysis stops at every boundary. Unifying on a single model dissolves those boundaries and illuminates the blind spots they created.
Cambra’s IR is a pure, typed lambda calculus. This foundation lets us adopt cutting-edge ideas, because decades of programming-language research come with it, giving us well-established guarantees and a deep store of techniques. Atop that foundation we raise the level of abstraction beyond that of traditional functional programming. In the IR, collections of data are just functions, and thus first-class objects the compiler reasons about directly — just as a database reasons about relations. That elevation has far-reaching implications. For the programmer, it defaults the level of abstraction to be data and logic, rather than low-level implementation details. For the compiler, it brings database-style optimizations into reach, letting it decide how data is filtered, joined, indexed, and stored. 1
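One way to picture "collections are just functions" is the following Python sketch. It rests on assumptions: the representation (a collection as a function from key to row), and the names `Collection`, `from_dict`, and `where`, are hypothetical and are not Cambra's IR. The point is only that once a collection is a function, a filter is function composition, and so becomes something a compiler could in principle inspect and rearrange.

```python
from typing import Callable, Optional

Row = dict[str, object]
# Assumed representation: a collection maps a key to a row, or to None if absent.
Collection = Callable[[int], Optional[Row]]


def from_dict(rows: dict[int, Row]) -> Collection:
    """Wrap stored rows as a lookup function."""
    return rows.get


def where(coll: Collection, pred: Callable[[Row], bool]) -> Collection:
    """Filtering yields another collection, that is, another function."""
    def lookup(key: int) -> Optional[Row]:
        row = coll(key)
        return row if row is not None and pred(row) else None
    return lookup


accounts = from_dict({
    1: {"owner": "ada", "balance": 50},
    2: {"owner": "bob", "balance": 0},
})
funded = where(accounts, lambda r: r["balance"] > 0)

print(funded(1))  # {'owner': 'ada', 'balance': 50}
print(funded(2))  # None
```

In Python the lambda is opaque at runtime; the architecture described here depends on the compiler seeing through such predicates, which this sketch does not attempt.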
The type system provides additional leverage by combining subtyping, dependent types, and full type inference, a powerful and uncommon combination. Subtyping with inference keeps the language about as ergonomic to write as a dynamically typed one. Refinement types, a specific kind of dependent type, allow a type to carry a predicate, such as a quantity staying non-negative or referential integrity holding. Because one type system spans the whole program, those predicates compose across component boundaries, playing the role of database constraints. The difference is that database constraints are enforced via runtime checks, while refinement types are verified at compile time, with the proof obligations discharged by an automated solver.
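Python cannot prove predicates at compile time, so the sketch below emulates a refinement type with a runtime check at construction. That makes it an example of the runtime-constraint style, the thing the design described above aims to replace with a compile-time proof. The names `NonNegative` and `withdraw` are illustrative assumptions, not Cambra constructs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NonNegative:
    """An int carrying the predicate value >= 0 (checked at runtime here)."""
    value: int

    def __post_init__(self):
        if self.value < 0:
            raise ValueError(f"expected a non-negative value, got {self.value}")


def withdraw(balance: NonNegative, amount: NonNegative) -> NonNegative:
    # The return type states the invariant "balance never goes negative".
    # Here a violation surfaces only when this line runs; a refinement type
    # checker would instead reject the function unless it could show that
    # amount <= balance holds at every call site.
    return NonNegative(balance.value - amount.value)


print(withdraw(NonNegative(10), NonNegative(3)))  # NonNegative(value=7)

try:
    withdraw(NonNegative(10), NonNegative(30))
except ValueError as err:
    print("rejected:", err)
```

The difference in when the overdraft is caught, during execution here versus at build time with refinement types, is the contrast the paragraph draws with database constraints.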
Purity is another critical property — it is what lets us branch a program without fear. A pure core is referentially transparent, so every side effect lives at the boundary of the system, where it can be cleanly encapsulated. A new version of a program can be spun up cheaply, sharing computation and state with the version it came from. Branching the whole application falls out of the design rather than being bolted onto it.
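A small Python analogy for cheap branching with shared state, under stated assumptions: the account data is invented, and a standard-library `ChainMap` layered over a read-only view stands in for whatever mechanism Cambra actually uses, which this post does not specify. The branch records only its own writes and reads everything else from the parent.

```python
from collections import ChainMap
from types import MappingProxyType

# The live state, exposed read-only so the branch cannot alter it.
live = MappingProxyType({"alice": 100, "bob": 40})

# Branching copies nothing: an empty overlay is placed in front of the parent.
branch = ChainMap({}, live)

# Writes land in the overlay only.
branch["bob"] = 0
branch["carol"] = 7

print(branch["alice"])   # 100, read through to the shared parent
print(branch["bob"])     # 0, the branch's own version
print(live["bob"])       # 40, the live state is untouched
print("carol" in live)   # False
```

The sharing is safe here because the parent is never mutated through the branch, which is the role purity plays in the argument above.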
After the compiler has processed the pure-functional IR, the program is lowered to a dataflow graph for execution. Producer-consumer dataflow isn’t novel: databases and data-processing systems use it precisely because it parallelizes and vectorizes naturally; the novelty is what Cambra’s engine is executing. On its own, dataflow tends to be constrained in the wrong ways. Most dataflow systems restrict you to what can be expressed as nodes and edges, then hand you an escape hatch into a general-purpose host language where you can do anything at all, including hold mutable state and reach out to the world. The engine sees the explicit dependencies between operators but not what the operators actually do, nor the implicit dependencies that this opacity hides. It is at once too restrictive and not restrictive enough. Cambra inverts both: because the thing being executed is a fully general, pure lambda calculus, there is no expressivity ceiling to escape and no opaque effects to hide. We take a novel approach that formalizes streaming dataflow graphs as a category (the mathematical kind), and then convert programs from the category of lambda calculus programs into that of streaming dataflow graphs. The result: dataflow’s performance and scalability without dataflow’s limits.
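As a rough intuition for lowering a pure program to a dataflow graph, the Python sketch below turns a tiny first-order expression tree into an operator graph and runs it in dependency order. It is a deliberately simplified stand-in: it has no lambdas, no streaming, and nothing of the categorical construction mentioned above, and the names `Input`, `Apply`, `lower`, and `run` are hypothetical.

```python
import operator
from dataclasses import dataclass
from graphlib import TopologicalSorter


@dataclass(frozen=True)
class Input:
    name: str


@dataclass(frozen=True)
class Apply:
    op: str
    args: tuple


OPS = {"add": operator.add, "mul": operator.mul}


def lower(expr, graph):
    """Record expr in graph (node -> set of nodes it depends on)."""
    if expr not in graph:
        if isinstance(expr, Apply):
            graph[expr] = {lower(arg, graph) for arg in expr.args}
        else:
            graph[expr] = set()
    return expr


def run(expr, inputs):
    """Lower expr, then evaluate each operator once, dependencies first."""
    graph = {}
    lower(expr, graph)
    values = {}
    for node in TopologicalSorter(graph).static_order():
        if isinstance(node, Input):
            values[node] = inputs[node.name]
        else:
            values[node] = OPS[node.op](*(values[a] for a in node.args))
    return values[expr], len(graph)


total = Apply("add", (Input("x"), Input("y")))
expr = Apply("mul", (total, total))      # (x + y) * (x + y)
print(run(expr, {"x": 2, "y": 3}))       # (25, 4)
```

Because the expressions are pure, the two occurrences of `x + y` collapse into one graph node (four nodes in total, not five) without changing the result, a small instance of the freedom purity gives an execution engine.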
Distribution and replication fall into place for the same reasons. Purity and collection-level abstractions make a program straightforward to distribute across machines, to persist for durability, and to replicate for availability. We aren’t inventing new distributed-systems theory here; we’re applying well-understood ideas in a setting that finally makes them easy to apply.
None of these pieces is unprecedented on its own. Modern language design, type theory, database architecture, and distributed-systems practice have each matured over decades of work. Cambra weaves them into one system, and that combination has no real precedent. The opportunity is less a single breakthrough than a synthesis.
The Team and Timing
Words are cheap. The question that matters is whether this particular team can build this particular thing. We have spent the last decade building the systems Cambra generalizes: streaming dataflow at Google, developer infrastructure at Datadog, and a declarative data-processing product at Snowflake that was its fastest-growing feature for years. So, while this work sits at the frontier of several fields of computer science, we believe we are well-situated to interpolate across those frontiers. 2
The rest is timing. Many of the specific techniques this leans on, whether in type theory or database implementation, only recently matured to the point that they can be put into practice. AI also changes the adoption story. A new programming language has always faced a bootstrapping problem: no one knows it, so no one writes it, so no one learns it. Agents don’t have that problem. As recent events have shown, agents can pick up a well-designed, familiar language and produce real code at scale right away, especially with strong feedback loops to keep them on track. Cambra is built for exactly this: a familiar, Python-like syntax, plus the verification and validation loops an agent needs to stay on course. The thing that makes Cambra valuable turns out to be the thing that makes it adoptable.
What’s Next
We’re moving faster than we ever imagined. The language and compiler will be open source soon, and you’ll be able to read the code and play with it yourself. It will take time to fill out the features and the standard library, so we’ll be transparent about how much you can rely on it. A production cloud runtime comes after that.
We’ll write again when the code is out, and go into the details this post has kept at arm’s length. If you want to watch it come together, follow along. We’d rather show you than tell you, and we’re nearly at the point where we can.
Footnotes
1. For anyone having flashbacks to database optimizers making bad decisions, rest assured that we’re also acutely aware of that problem. We believe it is not inherent to the domain, and are designing Cambra’s abstractions to be pierceable instead of opaque.
2. That said, we would benefit from more expertise in type theory and formal verification. If you have a background in these fields and are interested in this work, reach out! We’re actively looking for collaborators.