Release Notes

Release notes and changelog for DEC Bench.

v0.1.0

Released on March 11, 2026.

DEC Bench 0.1 is ready for internal use. This release makes the CLI the default path for installing, running, inspecting, and auditing evals locally.

What's Included

  • 36 scenarios on the Foo Bar synthetic domain
  • 3 harness configurations: base-rt, classic-de, and olap-for-swe
  • built-in runners for Claude Code, Codex, and Cursor
  • installable CLI binaries behind https://decbench.ai/install.sh
  • localhost audit flow through the existing web UI
  • gated scoring across Functional, Correct, Robust, Performant, and Production

    curl -fsSL https://decbench.ai/install.sh | sh
    dec-bench list
    dec-bench build --scenario foo-bar-csv-ingest
    dec-bench run --scenario foo-bar-csv-ingest
    dec-bench results --latest --scenario foo-bar-csv-ingest

What v0.1 Focuses On

This release is intentionally narrow. The goal is to make the core eval loop solid:

  • install the CLI
  • run real data engineering scenarios locally
  • inspect outputs and artifacts from the terminal
  • open the audit UI for a specific run

The v0.1 release stays on the synthetic Foo Bar domain so we can validate the workflow, scoring, and infrastructure before expanding into production-realistic business domains.
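
The build, run, and inspect steps of the loop can be wrapped in a small shell function so that one scenario name drives all three. This is a minimal sketch, not part of the release: the function name run_scenario and the DEC_BENCH override variable are hypothetical, and the only dec-bench subcommands and flags used are the ones shown in these notes.

```shell
#!/bin/sh
# Hypothetical wrapper around the documented build/run/results sequence.
# DEC_BENCH is an illustrative override; it defaults to dec-bench on PATH.
DEC_BENCH="${DEC_BENCH:-dec-bench}"

run_scenario() {
    scenario="${1:-}"
    if [ -z "$scenario" ]; then
        echo "usage: run_scenario <scenario-name>" >&2
        return 2
    fi
    # Stop at the first failing step so results are never read from a stale run.
    "$DEC_BENCH" build --scenario "$scenario" || return 1
    "$DEC_BENCH" run --scenario "$scenario" || return 1
    "$DEC_BENCH" results --latest --scenario "$scenario"
}
```

After sourcing the file, calling run_scenario foo-bar-csv-ingest runs the same three commands listed under What's Included, in order. Opening the audit UI is left out because these notes do not show a command for it.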

Known Limits

  • Docker is still required at runtime.
  • v0.1 ships on the Foo Bar synthetic domain only.
  • A public leaderboard and broader domain coverage will follow this internal release.
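
Because Docker is required at runtime, a pre-flight check before running a scenario can save a failed run. The sketch below is an assumption about how you might guard your own scripts, not something the CLI ships: docker_ready is a hypothetical helper name, and it relies only on the standard docker info command.

```shell
#!/bin/sh
# Hypothetical pre-flight helper: confirm the Docker CLI exists and the
# daemon answers before starting a scenario.
docker_ready() {
    if ! command -v docker >/dev/null 2>&1; then
        echo "docker CLI not found on PATH" >&2
        return 1
    fi
    # On current Docker releases, `docker info` exits non-zero when it
    # cannot reach the daemon.
    if ! docker info >/dev/null 2>&1; then
        echo "docker daemon is not reachable" >&2
        return 1
    fi
    return 0
}
```

A typical use is docker_ready && dec-bench run --scenario foo-bar-csv-ingest, which skips the run when Docker is missing or stopped.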