Harnesses
Each harness defines the tooling environment a scenario runs against.
Each scenario runs inside a Docker container with pre-installed tools. The harness controls which tools are available to the agent, so you can compare how agents perform with different toolkits against the same problem. v0.1 ships with three harnesses.
A standard agent is a model (Claude, GPT, Gemini) paired with an agent CLI (Claude Code, Cursor, Codex) that gives it bash, file I/O, and tool use. Standard agents can be extended with skills, MCPs, and hooks.
A harness layers skills, MCPs, and pre-installed tools (dbt, Airflow, MooseStack) on top, turning a general-purpose agent into a specialized one, such as a data engineering agent.
Base RT
The control group. No extra tooling beyond the base infrastructure. The agent has Python, Node.js, and CLI access to the three foundational services: ClickHouse (warehouse), Redpanda (streaming), and Postgres (transactional).
Infrastructure
Base infrastructure only: ClickHouse, Redpanda, Postgres, Python 3, Node.js 22
Install script
# Uses base image defaults only
Classic Data Engineering
The standard toolkit. Airflow handles orchestration, Spark handles distributed processing, and dbt handles warehouse transformations. The agent wires together independent best-in-class tools on top of the base infrastructure.
Installed tools
- Apache Airflow 2.10.5
- PySpark 3.5.5
- dbt-core 1.10.19
- dbt-postgres 1.10.0
- dbt-clickhouse 1.10.0
- OpenJDK 17
Install script
apt-get update &&
apt-get install -y --no-install-recommends openjdk-17-jre-headless &&
rm -rf /var/lib/apt/lists/* &&
pip3 install --no-cache-dir --break-system-packages dbt-core==1.10.19 dbt-postgres==1.10.0 dbt-clickhouse==1.10.0 apache-airflow==2.10.5 pyspark==3.5.5
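After an install script like the one above runs, a quick check that the expected commands landed on `PATH` can catch a broken image early. The sketch below is not part of the harness: `missing_tools` is a hypothetical helper, and the command names (`java`, `airflow`, `dbt`, `spark-submit`) are assumed to be the entry points these packages expose.

```shell
#!/bin/sh
# Hypothetical smoke check, not part of the shipped harness.
# Prints the name of every command that is not found on PATH, one per line.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Assumed entry points for the Classic Data Engineering harness.
missing=$(missing_tools java airflow dbt spark-submit)
if [ -n "$missing" ]; then
  printf 'missing tools:\n%s\n' "$missing" >&2
else
  echo "all tools found"
fi
```

The check only reports what is absent. It does not confirm that the installed versions match the pinned ones.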
OLAP for Software Engineers
The unified framework. A software-engineering-first approach to data engineering on top of the base infrastructure. MooseStack provides typed schemas, automated migrations, a built-in MCP server, and a hot-reloading dev environment. The agent works within a unified framework instead of wiring together independent tools.
Installed tools
- moose-cli 0.6.424
- moose-lib 0.6.424
Install script
npm install -g @514labs/moose-cli@0.6.424 &&
mkdir -p /opt/dec-bench/moose &&
cd /opt/dec-bench/moose &&
npm init -y &&
npm install @514labs/moose-lib@0.6.424
At a Glance
| | Base RT | Classic Data Engineering | OLAP for Software Engineers |
|---|---|---|---|
| Orchestration | — | Apache Airflow 2.10.5 | — |
| Transformation | — | dbt-core 1.10.19, dbt-postgres 1.10.0, dbt-clickhouse 1.10.0 | — |
| Processing | — | PySpark 3.5.5 | — |
| Framework | — | — | moose-cli 0.6.424, moose-lib 0.6.424 |
| Runtime | — | OpenJDK 17 | — |
| Network | open | open | open |
| Scenarios | 48 | 48 | 48 |
Contribute a harness
Write a shell script, pin versions, and open a PR.
Getting started guide
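One way to keep the "pin versions" rule honest is to check package specs before they reach the install command. This is a minimal sketch, assuming pip-style `name==version` and npm-style `name@version` specs like those in the install scripts above. `is_pinned` is a hypothetical helper, not something the project provides.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if a package spec names an explicit version.
is_pinned() {
  case "$1" in
    *==[0-9]*) return 0 ;;  # pip style, e.g. dbt-core==1.10.19
    ?*@[0-9]*) return 0 ;;  # npm style; the leading ? skips a scope's own "@"
    *) return 1 ;;
  esac
}

# Report each spec as pinned or unpinned.
for spec in "dbt-core==1.10.19" "@514labs/moose-cli@0.6.424" "pyspark"; do
  if is_pinned "$spec"; then
    echo "pinned:   $spec"
  else
    echo "unpinned: $spec"
  fi
done
```

The check is deliberately coarse. It only confirms that a version number follows `==` or `@`, not that it is a full `x.y.z` release.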