
Show HN: Rocky – Rust SQL engine with branches, replay, column lineage

Rocky


The trust system for your data. A Rust-based control plane for warehouse pipelines: branches, replay, column-level lineage, compile-time safety, per-model cost attribution. Keep Databricks or Snowflake. Bring Rocky for the DAG.

Rocky quickstart — create a project, compile, and run 3 models in under 15s

# macOS / Linux
curl -fsSL https://raw.githubusercontent.com/rocky-data/rocky/main/engine/install.sh | bash

# Windows (PowerShell)
irm https://raw.githubusercontent.com/rocky-data/rocky/main/engine/install.ps1 | iex

# Create, compile, test, and run a playground project
rocky playground my-first-project
cd my-first-project
rocky compile && rocky test && rocky run

No credentials needed — the playground runs end-to-end on local DuckDB.

Each demo below is a self-contained POC in examples/playground/pocs/: cd in, run ./run.sh, and reproduce locally.

Detects schema drift the moment it happens

A source column type changes upstream. On the next run, Rocky diffs source vs. target, drops the target, and recreates it. No silent data corruption, no dbt-style quiet divergence.

rocky run detects source type change and recreates the target

POC — 02-performance/06-schema-drift-recover
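To make the diff-then-recreate idea above concrete, here is a minimal sketch that compares two column-to-type mappings and decides whether the target must be rebuilt. It is not Rocky's implementation: the names `diff_schemas` and `needs_recreate`, the dict-based schema representation, and the sample columns are all assumptions made for illustration.

```python
def diff_schemas(source: dict, target: dict) -> dict:
    """Compare two column -> type mappings and report how they differ."""
    added = sorted(c for c in source if c not in target)
    removed = sorted(c for c in target if c not in source)
    # Each entry is (column, type in target, type in source).
    type_changed = sorted(
        (c, target[c], source[c])
        for c in source
        if c in target and source[c] != target[c]
    )
    return {"added": added, "removed": removed, "type_changed": type_changed}


def needs_recreate(drift: dict) -> bool:
    # Hypothetical policy: any type change forces drop-and-recreate.
    return bool(drift["type_changed"])


# Hypothetical schemas: the upstream `amount` column changed type.
source = {"id": "BIGINT", "amount": "VARCHAR"}
target = {"id": "BIGINT", "amount": "DECIMAL(10,2)"}
drift = diff_schemas(source, target)
recreate = needs_recreate(drift)
```

A real engine would read both schemas from the warehouse catalog rather than from literals, and might handle added or removed columns differently from type changes.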

Enforces data contracts at compile time

Missing required columns, protected columns being removed, or unsafe type changes surface as diagnostic codes (E010, E013) before a single row is written.

rocky compile flags E010 and E013 contract violations on broken_metrics

POC — 01-quality/01-data-contracts-strict
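The three contract checks named above (missing required columns, removed protected columns, unsafe type changes) can be sketched as a pure function over schemas that runs before any data is touched. This is an illustrative sketch, not Rocky's contract engine: the `Contract` class, the `SAFE_WIDENINGS` allow-list, and the message strings are hypothetical, and the source does not say which check maps to E010 or E013, so no codes are assigned here.

```python
from dataclasses import dataclass, field


@dataclass
class Contract:
    required: set = field(default_factory=set)   # columns that must exist
    protected: set = field(default_factory=set)  # columns that may not be removed


# Hypothetical allow-list of (old type, new type) pairs treated as safe.
SAFE_WIDENINGS = {("INTEGER", "BIGINT"), ("FLOAT", "DOUBLE")}


def check_contract(contract: Contract, previous: dict, proposed: dict) -> list:
    """Return human-readable violations; an empty list means the model passes."""
    violations = []
    for col in sorted(contract.required):
        if col not in proposed:
            violations.append(f"missing required column: {col}")
    for col in sorted(contract.protected):
        if col in previous and col not in proposed:
            violations.append(f"protected column removed: {col}")
    for col, old_type in previous.items():
        new_type = proposed.get(col)
        if new_type is None or new_type == old_type:
            continue
        if (old_type, new_type) not in SAFE_WIDENINGS:
            violations.append(f"unsafe type change: {col} {old_type} -> {new_type}")
    return violations


# Hypothetical model schemas before and after an edit.
contract = Contract(required={"order_id", "total"}, protected={"customer_id"})
previous = {"order_id": "BIGINT", "total": "DOUBLE", "customer_id": "BIGINT"}
proposed = {"order_id": "INTEGER"}
violations = check_contract(contract, previous, proposed)
```

Because the check only needs schemas, it can run at compile time and fail the build before a single row is written.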

Named branches for risk-free experiments

Create a branch, run against it in an isolated schema, inspect, then drop or promote. Column-level lineage shows the downstream blast radius before you ship.

rocky branch create, run on branch, and trace column lineage downstream

POC — 00-foundations/06-branches-replay-lineage

Column-level lineage, not table-level

Trace a single column from a downstream fact back through its aggregations, all the way to the seed. Blast-radius analysis without reading every model.

rocky lineage --column traces fct_revenue.total back to seeds.orders.amount

POC — 06-developer-experience/01-lineage-column-level
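Column-level lineage can be pictured as a graph whose nodes are columns rather than tables. The sketch below walks such a graph in both directions: upstream to find where a column comes from, and downstream to find its blast radius. It assumes an acyclic graph stored as a plain dict; the function names and the intermediate `stg_orders.amount` column are hypothetical, and only the two endpoints come from the demo above.

```python
def trace_upstream(column: str, parents: dict) -> list:
    """Return every path from `column` back to columns that have no parents."""
    ups = parents.get(column, [])
    if not ups:
        return [[column]]
    paths = []
    for up in ups:
        for tail in trace_upstream(up, parents):
            paths.append([column] + tail)
    return paths


def blast_radius(column: str, parents: dict) -> set:
    """Return all columns that depend, directly or transitively, on `column`."""
    # Invert the parent map so edges point downstream.
    children = {}
    for child, ups in parents.items():
        for up in ups:
            children.setdefault(up, []).append(child)
    seen, stack = set(), [column]
    while stack:
        for child in children.get(stack.pop(), []):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


# Downstream column -> upstream columns it is derived from.
parents = {
    "fct_revenue.total": ["stg_orders.amount"],       # hypothetical middle hop
    "stg_orders.amount": ["seeds.orders.amount"],
}
paths = trace_upstream("fct_revenue.total", parents)
impacted = blast_radius("seeds.orders.amount", parents)
```

Tracking edges per column is what lets the downstream walk report only the columns a change can reach, instead of every column of every dependent table.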

AI model generation with a compile-validate loop

Describe what you want in plain English. Rocky generates a Rocky DSL model, compiles it, and retries on parse failure. In the demo, the Attempts: 2 line shows that the loop caught a first-pass error and recovered without user intervention.

rocky ai generates a .rocky model from natural language intent, Attempts: 2

POC — 03-ai/01-model-generation
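The generate, compile, retry pattern described above can be sketched as a small loop that feeds the compiler error back into the next generation attempt. This is a generic illustration under stated assumptions, not how `rocky ai` is implemented: `generate_with_validation`, its parameters, and the stub generator and compiler are all hypothetical stand-ins.

```python
def generate_with_validation(intent, generate, compile_model, max_attempts=3):
    """Call `generate` until `compile_model` accepts the result.

    `generate(intent, feedback)` returns model source text.
    `compile_model(source)` returns None on success or an error message.
    Returns (source, attempts_used).
    """
    feedback = None
    for attempt in range(1, max_attempts + 1):
        source = generate(intent, feedback)
        error = compile_model(source)
        if error is None:
            return source, attempt
        feedback = error  # give the generator the error to correct
    raise RuntimeError(f"no valid model after {max_attempts} attempts: {feedback}")


# Stubs standing in for a language model and a compiler (both hypothetical).
def fake_generate(intent, feedback):
    # First pass contains a typo; the retry, given feedback, fixes it.
    return "select 1" if feedback else "selec 1"


def fake_compile(source):
    return None if source.startswith("select") else "parse error near 'selec'"


model, attempts = generate_with_validation(
    "revenue per day", fake_generate, fake_compile
)
```

Capping the number of attempts keeps a persistently failing prompt from looping forever, and returning the attempt count is what makes a line like Attempts: 2 possible to report.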

| Path | Artifact | Language | Description |
| --- | --- | --- | --- |
| engine/ | rocky CLI binary | Rust | Core SQL transformation engine, a 20-crate Cargo workspace |
| integrations/dagster/ | dagster-rocky PyPI wheel | Python | Dagster resource and component wrapping the Rocky CLI |
| editors/vscode/ | Rocky VSIX | TypeScript | VS Code extension: LSP client + commands for AI features |
| examples/playground/ | (config only) | TOML / SQL | Self-contained DuckDB sample pipeline used for smoke tests and benchmarks |

Each subproject has its own README with detailed usage. The engine/README.md is the canonical product reference for the Rocky CLI.

git clone https://github.com/rocky-data/rocky.git
cd rocky
just build       # builds engine + dagster wheel + vscode extension
just test        # runs all test suites
just lint        # cargo clippy/fmt + ruff + eslint

just is optional — you can also build each subproject directly. See CONTRIBUTING.md for per-subproject build commands.

Each artifact is released independently using a tag-namespaced scheme:

  • engine-v* → Rocky CLI binary (cross-compiled, on GitHub Releases)
  • dagster-v* → dagster-rocky wheel
  • vscode-v* → Rocky VSIX
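
As an illustration of how a tag-namespaced scheme lets one repository release several artifacts, the sketch below routes a Git tag to its artifact by prefix. The prefixes and artifact names come from the list above; the function name and the sample version numbers are hypothetical, and the project's actual release automation is described in CONTRIBUTING.md, not here.

```python
# Tag prefix -> artifact released by tags with that prefix.
TAG_PREFIXES = {
    "engine-v": "Rocky CLI binary",
    "dagster-v": "dagster-rocky wheel",
    "vscode-v": "Rocky VSIX",
}


def artifact_for_tag(tag: str):
    """Return (artifact, version) for a namespaced tag, or None if unrecognized."""
    for prefix, artifact in TAG_PREFIXES.items():
        if tag.startswith(prefix):
            return artifact, tag[len(prefix):]
    return None


# Sample tags with made-up version numbers.
engine_release = artifact_for_tag("engine-v1.2.3")
unknown_release = artifact_for_tag("v1.0.0")
```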

See CONTRIBUTING.md for the full release flow.

Full documentation: rocky-data.dev — concepts, guides, CLI reference, Dagster integration, adapter SDK.

See CONTRIBUTING.md. Before opening a PR, please read the cross-project change guidance — schema and DSL changes must update consumers atomically.

Rocky is free and open source. If it saves your team time, consider sponsoring the project so development can continue.

Apache 2.0