Self-healing for dbt warehouses.
Every dbt team has the same incident response loop: a metric goes wrong, Slack lights up, an analytics engineer bisects commits, hand-walks lineage, writes a revert, tests it on a stale dev environment, and crosses their fingers. The loop is invisible to leadership, expensive to the AE team, and slow enough that the metric stays wrong for hours. Cairn closes the loop end-to-end: anomaly detection, cognition-graph attribution to the offending commit, and a sandbox-verified revert PR. No human in the loop unless you ask for one.
Production dbt incidents still run on human memory.
The modern data stack made analytics engineering look like software engineering. dbt put models in Git. CI became normal. Lineage became queryable. Warehouses became programmable infrastructure. Yet the recovery loop after a production metric regression still depends on the person who knows where the bodies are buried.
A revenue metric drops 18 percent. A retention ratio shifts by 6 points. A freshness SLA slips by 4 hours. The alert is the easy part. The hard part is finding the commit, proving the cause, reverting safely, and giving the team enough evidence to trust the fix.
Cairn is built around a specific claim: the wedge is self-healing, not editing. dbt teams do not need another place to type SQL. They need a system that operates their dbt warehouse after code ships.
Cairn auto-detects metric anomalies in your dbt warehouse, attributes them to the offending commit, and opens a verified revert PR. No human is needed in the loop unless your policy asks for one.
| Metric | Status | Regression | Top cause | Verdict | PR |
|---|---|---|---|---|---|
| mrr.net_new | Detected | -18.4% vs 28d baseline | 8f31c2a · fct_revenue.sql | Verified revert | PR #482 |
| orders.freshness | Attributing | +231 min lag | 2b7a91d · stg_orders.sql | Clone running | pending |
| refunds.null_rate | Recovered | 9.7% → 0.4% | cc04b10 · dim_payments.sql | Merged | PR #477 |
Cairn starts where alerts stop.
The existing market has strong point products, but no one owns the closed loop. The reason is infrastructure depth, not messaging. Detecting an anomaly, attributing it to code, proving a recovery path, and opening a PR all require different systems to agree under production constraints.
Observability detects but does not fix.
Monte Carlo, Anomalo, and dbt Cloud freshness tests can tell teams something broke. The next step is still manual triage and a human-written revert.
Coding agents edit but do not operate.
Cursor, Conductor, and Claude Code can change files. They are not responsible for a production metric contract after merge.
The full loop is hard infrastructure.
Cairn combines warehouse-side sandbox verification, cognition-graph attribution, per-org policy, and Git automation.
The wedge is the moment after production breaks. That moment has budget, urgency, and a clear correctness signal. A team either restored the metric or it did not. A proposed fix either passes against a clone of the customer warehouse or it does not.
This is a better starting point than an editor because the buyer can measure value in incident minutes. If Cairn cuts a 4-hour analytics incident to 12 minutes, the result is legible to the head of data, finance, sales operations, and the executive who used the broken dashboard.
Detect, attribute, recover.
Cairn is a self-healing agentic platform for dbt warehouses. The product is organized around three motions, each with a narrow contract.
Anomaly detection on metric contracts. Teams define the contracts that matter: volume, freshness, row count, ratio, null rate, and distribution shift. Cairn watches the dbt warehouse continuously and compares each metric to its baseline.
Cognition graph plus evidence. Cairn walks lineage from broken metric to upstream models, joins that path to Git history, collects sample-row deltas, and uses an LLM to produce a top cause plus ranked alternates with confidence scores.
Sandbox-verified revert PR. Cairn creates a zero-copy clone, runs the candidate revert against the customer's dbt project, verifies that the metric recovers, and only then opens a PR with the evidence attached.
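The first of these motions, comparing a metric to its baseline, can be illustrated with a minimal sketch. It is not Cairn's detector: `MetricContract`, `max_relative_change`, and the use of a plain mean over a trailing window are assumptions made for the example.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class MetricContract:
    """Hypothetical contract: a metric ID plus a tolerated relative change."""
    metric_id: str
    max_relative_change: float  # e.g. 0.10 tolerates +/-10% vs baseline


def relative_change(current: float, baseline_values: list[float]) -> float:
    """Change of `current` relative to the mean of the baseline window."""
    baseline = mean(baseline_values)
    if baseline == 0:
        raise ValueError("baseline mean is zero; relative change is undefined")
    return (current - baseline) / baseline


def is_regression(contract: MetricContract, current: float,
                  baseline_values: list[float]) -> bool:
    """True when the metric moved further from baseline than the contract allows."""
    change = relative_change(current, baseline_values)
    return abs(change) > contract.max_relative_change
```

A real detector would also need to handle seasonality and the other contract types (freshness, null rate, distribution shift); this sketch covers only a relative-change check.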
The design constraint is simple: every revert is verified before any PR is opened. Cairn does not ask customers to trust an LLM guess. It asks them to trust a reproducible run against their own warehouse clone and their own dbt code.
That constraint shapes the product. The PR body carries the incident ID, the affected metric contract, the attribution path, the suspected commit, the clone ID, the dbt command, the before-and-after metric values, and the check result. The code review becomes a decision about evidence, not a scavenger hunt.
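As an illustration of what that evidence could look like as data, here is a minimal sketch. The `IncidentEvidence` type, its field names, and the rendered layout are hypothetical; only the list of fields comes from the description above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IncidentEvidence:
    """Hypothetical container for the evidence attached to a revert PR."""
    incident_id: str
    metric_contract: str
    attribution_path: list[str]  # broken metric first, then upstream models
    suspected_commit: str
    clone_id: str
    dbt_command: str
    metric_before: float
    metric_after: float
    check_passed: bool


def render_pr_body(ev: IncidentEvidence) -> str:
    """Render the evidence as plain text, one fact per line."""
    verdict = "passed" if ev.check_passed else "failed"
    lines = [
        f"Incident: {ev.incident_id}",
        f"Metric contract: {ev.metric_contract}",
        "Attribution path: " + " -> ".join(ev.attribution_path),
        f"Suspected commit: {ev.suspected_commit}",
        f"Clone: {ev.clone_id}",
        f"dbt command: {ev.dbt_command}",
        f"Metric before revert: {ev.metric_before}",
        f"Metric after revert: {ev.metric_after}",
        f"Recovery check: {verdict}",
    ]
    return "\n".join(lines)
```

Keeping the evidence in one structured record means the same data can feed the PR body, the check result, and the audit trail.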
Cairn is operations for analytics engineers.
Cursor, Conductor, and Claude Code are editors and coding-agent systems for people who write code. They help analytics engineers produce dbt changes faster. They do not own the production responsibility that starts after those changes merge.
Monte Carlo, Anomalo, and dbt Cloud are monitors for data teams. They can detect freshness failures, test failures, quality anomalies, and pipeline problems. They do not close the recovery loop with a verified code change.
Cairn sits between those two categories. It does not compete to be the place where every dbt model is written. It competes to be the system of record for production dbt incidents and the system that proposes the safest recovery path.
| Today | In Cairn |
|---|---|
| Slack alerts and scrolling dashboards | Continuous anomaly detection with metric-contract baselines |
| Bisecting commits, tracing lineage by hand, asking who merged last | Cognition-graph attribution with a top cause, ranked alternates, and sample evidence |
| A manual revert PR and a hope that dev resembles prod | Sandbox-verified revert against a zero-copy clone |
| Dashboard refresh ritual | Recovery proof attached to the PR |
This is a narrow wedge by design. A narrow wedge makes the promise testable. If a dbt metric regresses because a dbt commit changed the warehouse behavior, Cairn should identify it and recover it. If the incident is outside that scope, Cairn should say so and preserve the evidence trail.
Customers will not start with autonomy. They will earn it.
Self-healing only works if the customer can control the blast radius. Cairn's trust model is built around progressive permissioning.
- Observe mode. Cairn detects incidents, attributes causes, and records what it would have done. No PR is opened. This mode creates offline precision and recall data for the customer.
- Assisted mode. Cairn opens verified revert PRs. A human reviews and merges. This is the alpha default.
- Autonomous mode. Cairn auto-merges verified reverts that match per-org policy. This mode starts behind an allowlist.
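The three modes can be read as a small decision function. The sketch below is illustrative only: the `Mode` and `Action` names are hypothetical, and falling back to a reviewable PR when an Autonomous-mode revert is not on the allowlist is an assumption, not documented behavior.

```python
from enum import Enum


class Mode(Enum):
    OBSERVE = "observe"
    ASSISTED = "assisted"
    AUTONOMOUS = "autonomous"


class Action(Enum):
    RECORD_ONLY = "record_only"  # log what would have been done
    OPEN_PR = "open_pr"          # verified revert PR, human merges
    AUTO_MERGE = "auto_merge"    # verified revert merged by policy


def next_action(mode: Mode, revert_verified: bool, on_allowlist: bool) -> Action:
    """Pick the most permissive action the current mode and evidence allow."""
    # No PR is ever opened for an unverified revert, whatever the mode.
    if mode is Mode.OBSERVE or not revert_verified:
        return Action.RECORD_ONLY
    if mode is Mode.AUTONOMOUS and on_allowlist:
        return Action.AUTO_MERGE
    # Assumption: off-allowlist reverts in Autonomous mode still get a PR.
    return Action.OPEN_PR
```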
Verification runs on a warehouse clone, not on a toy fixture. In BigQuery, a table clone can be taken as of a point in time with syntax such as CREATE TABLE ... CLONE ... FOR SYSTEM_TIME AS OF. Google documents table clones as lightweight, writable copies whose storage is billed only for data that differs from the base table [BigQuery table clones].
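A minimal sketch of how a point-in-time clone statement could be assembled. The function name and the table names in the usage example are hypothetical; the statement shape follows the BigQuery clone syntax mentioned above, and the function only builds the SQL string without running it.

```python
from datetime import datetime, timezone


def clone_statement(source_table: str, clone_table: str, as_of: datetime) -> str:
    """Build a BigQuery CREATE TABLE ... CLONE statement for a point in time."""
    if as_of.tzinfo is None:
        raise ValueError("as_of must be timezone-aware")
    for name in (source_table, clone_table):
        # Names are wrapped in backticks below, so reject embedded backticks.
        if "`" in name:
            raise ValueError(f"unexpected backtick in table name: {name!r}")
    ts = as_of.astimezone(timezone.utc).strftime("%Y-%m-%d %H:%M:%S+00:00")
    return (
        f"CREATE TABLE `{clone_table}`\n"
        f"CLONE `{source_table}`\n"
        f"FOR SYSTEM_TIME AS OF TIMESTAMP '{ts}'"
    )


if __name__ == "__main__":
    # Hypothetical project, dataset, and table names.
    print(clone_statement(
        "example-project.analytics.fct_revenue",
        "example-project.cairn_sandbox.fct_revenue_clone",
        datetime(2024, 5, 1, 12, 0, tzinfo=timezone.utc),
    ))
```

Cloning as of a moment before the suspected commit was deployed gives the verification run a known starting state.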
Each incident carries an audit trail: lineage path, suspected commit, ranked alternates, sample-row deltas, clone target, dbt invocation, metric before value, metric after value, and PR state. That matters for compliance and for trust. The record shows what Cairn saw, what Cairn changed, and why the change was safe.
Every organization also gets a kill switch. If confidence drops, a high-priority model is under migration, or the team wants a freeze window, Cairn can be forced back to Observe mode for the whole org or for a specific dbt selector.
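The kill switch can be pictured as an override applied before any other policy. In this sketch the function and parameter names are hypothetical, and glob patterns over model names stand in for real dbt selectors, which are richer (tags, paths, graph operators).

```python
from fnmatch import fnmatchcase


def effective_mode(configured_mode: str, model_name: str,
                   org_frozen: bool, frozen_patterns: list[str]) -> str:
    """Return "observe" when a freeze applies, else the configured mode."""
    if org_frozen:
        return "observe"  # org-wide kill switch
    if any(fnmatchcase(model_name, pattern) for pattern in frozen_patterns):
        return "observe"  # freeze scoped to matching models
    return configured_mode
```

For example, `effective_mode("autonomous", "fct_revenue", False, ["fct_*"])` returns `"observe"`, while a model that matches no frozen pattern keeps its configured mode.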
The required pieces are finally cheap enough and good enough.
Cairn would have been harder to build five years ago. The product depends on four shifts that have now landed.
Zero-copy warehouse clones are cheap.
BigQuery table clones make incident-scoped verification practical because Cairn does not need to copy terabytes for each candidate revert. When the clone and the dbt selector are tightly scoped, the marginal cost of a verification run can be a matter of cents.
Sandbox compute makes dbt builds isolated.
Modal, Fly Machines, and similar systems make it straightforward to run short-lived dbt jobs in isolated sandboxes. Cairn can fetch the repo, apply a candidate revert, run the dbt selector, collect artifacts, and tear the environment down.
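The steps a sandbox job runs can be sketched as a list of commands. This is an illustration under assumptions: `plan_verification` and its parameters are hypothetical, the suspect commit is assumed not to be a merge commit, and a dbt profile for the named target is assumed to exist in the sandbox. The function only builds the command lines; it does not execute them.

```python
def plan_verification(repo_url: str, workdir: str, suspect_sha: str,
                      selector: str, target: str) -> list[list[str]]:
    """Return the argv lists for one clone-scoped verification run."""
    return [
        # Fetch the customer's dbt project.
        ["git", "clone", repo_url, workdir],
        # Apply the candidate revert to the working tree without committing.
        ["git", "-C", workdir, "revert", "--no-commit", suspect_sha],
        # Build only the selected models against the sandbox target.
        ["dbt", "build", "--select", selector,
         "--target", target, "--project-dir", workdir],
    ]
```

Inside the sandbox, each entry could be passed to `subprocess.run(cmd, check=True)` so that any failing step aborts the run before artifacts are collected.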
LLMs are good enough at attribution with the right context.
The model is not asked to guess from a vague alert. Cairn gives it lineage, diffs, compiled SQL, run artifacts, sample deltas, and metric definitions. That framing turns root cause analysis into a bounded ranking problem. Cairn benchmarks attribution against known regressions before a customer grants higher permissions.
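One way to see why the problem is bounded: the candidate set is limited to commits that touched a model upstream of the broken metric. The sketch below builds that set with a breadth-first walk. Ranking by lineage distance is a simple stand-in heuristic for illustration, not the LLM-based ranking described above, and all names are hypothetical.

```python
from collections import deque


def upstream_distances(parents: dict[str, list[str]], start: str) -> dict[str, int]:
    """Breadth-first walk from `start` to every upstream model, with hop counts."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for parent in parents.get(node, []):
            if parent not in dist:
                dist[parent] = dist[node] + 1
                queue.append(parent)
    return dist


def candidate_commits(parents: dict[str, list[str]], broken_model: str,
                      commits: list[tuple[str, set[str]]]) -> list[tuple[str, int]]:
    """Keep commits touching upstream models; nearest to the metric first.

    `commits` pairs a commit SHA with the set of models it changed.
    """
    dist = upstream_distances(parents, broken_model)
    candidates = []
    for sha, models in commits:
        hops = [dist[m] for m in models if m in dist]
        if hops:
            candidates.append((sha, min(hops)))
    # sorted() is stable, so ties keep their original (e.g. recency) order.
    return sorted(candidates, key=lambda c: c[1])
```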
GitHub Apps and the Checks API keep the merge loop inside the repo workflow.
Cairn does not need to replace the repo workflow. GitHub Apps can open PRs with scoped permissions, Checks can show recovery proof, and branch protection can decide what is allowed to merge [GitHub Checks API].
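A sketch of what a recovery-proof check could carry. The dictionary follows the request body of GitHub's "create a check run" endpoint (`POST /repos/{owner}/{repo}/check-runs`); the check name, the function, and the summary wording are hypothetical, and no request is sent here.

```python
def recovery_check_run(head_sha: str, metric_id: str,
                       before: float, after: float, recovered: bool) -> dict:
    """Build the JSON body for a completed check run reporting recovery proof."""
    return {
        "name": "cairn/recovery-proof",  # hypothetical check name
        "head_sha": head_sha,
        "status": "completed",
        # A completed check run must carry a conclusion.
        "conclusion": "success" if recovered else "failure",
        "output": {
            "title": f"Recovery proof for {metric_id}",
            "summary": (
                f"Metric value on clone: {before} before revert, "
                f"{after} after revert."
            ),
        },
    }
```

Because the result is an ordinary check, branch protection rules can require it before merge like any other status.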
The wedge sharpens because Cairn says no.
- Not an IDE. Cairn does not try to be the place where analytics engineers write every model. Cairn reverts the broken model after production evidence proves it caused a regression.
- Not a BI tool. Cairn does not dashboard every metric. Cairn reacts when a metric regresses against contracts the team defines.
- Not a generic AIOps platform. Cairn is precisely scoped to dbt warehouses. The first integrations are dbt, BigQuery, GitHub, and warehouse-native clones.
This scope is a product advantage. The narrower the system, the stronger the evidence can be. Cairn can understand dbt manifests, compiled SQL, selectors, exposures, tests, warehouse tables, Git commits, and PR checks as one operational graph.
The roadmap follows trust, then coverage.
- Now. BigQuery, revert PRs, Observe mode, Assisted mode, metric contracts, attribution evidence, and sandbox verification before PR creation.
- Next. Snowflake and Databricks, Autonomous mode behind a per-org allowlist, escalation PRs for non-revert fixes, and richer policy controls by dbt selector.
- Then. Multi-region warehouse support, on-prem mode, longer audit retention, enterprise Git providers, and compliance export for incident evidence.
The important milestone is not the number of integrations. It is verified recovery rate. Cairn should be judged by how often it detects a real regression, attributes it to the right commit, proves the revert, and has that revert accepted by the customer.
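One possible way to compute that rate, as a sketch. The definition used here (the share of real regressions that clear all four steps) and the `IncidentOutcome` type are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IncidentOutcome:
    """Hypothetical per-incident record for one real regression."""
    detected: bool
    correct_commit: bool
    revert_verified: bool
    accepted: bool


def verified_recovery_rate(outcomes: list[IncidentOutcome]) -> float:
    """Share of real regressions that were detected, attributed, proven, and accepted."""
    if not outcomes:
        raise ValueError("no incidents to score")
    recovered = sum(
        1 for o in outcomes
        if o.detected and o.correct_commit and o.revert_verified and o.accepted
    )
    return recovered / len(outcomes)
```

Observe mode can populate the first three fields before any PR is opened, which gives a team a rate to review before granting higher permissions.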
If your dbt warehouse has production incidents, Cairn should observe the next one.
Cairn is taking a small number of alpha teams on BigQuery first. The best fit is a team with dbt in production, metric contracts or a willingness to define them, and at least one incident in the last 90 days where a broken metric required manual root cause analysis.
Start in Observe mode. Compare Cairn's attribution to your team's incident review. Move to Assisted mode only when the evidence is good enough.