Overview

Evaluation domains in DEC Bench.

Domains define the business contexts used to evaluate agent performance on realistic data engineering work.

v0.1: Foo Bar Domain

The v0.1 release ships with a single domain: the synthetic Foo Bar domain. It uses dummy data with no production business semantics, which keeps the focus on data engineering competency rather than domain knowledge.

  • Foo Bar (Dummy): 36 scenarios (13 tier-1, 18 tier-2, 5 tier-3) covering ingestion, schema design, query optimization, debugging, transformation, migration, reliability, data quality, and end-to-end pipelines.
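
The tier breakdown above can be sanity-checked with a short sketch. The `DomainSummary` class, its field names, and the `tier_share` helper are hypothetical and exist only for illustration; they are not part of any DEC Bench API. Only the counts come from this page.

```python
from dataclasses import dataclass


@dataclass
class DomainSummary:
    """Hypothetical record of a domain's scenario counts (not a DEC Bench API)."""

    name: str
    scenarios_by_tier: dict[int, int]  # tier number -> number of scenarios

    @property
    def total_scenarios(self) -> int:
        # Sum the per-tier counts to get the domain total.
        return sum(self.scenarios_by_tier.values())


def tier_share(domain: DomainSummary, tier: int) -> float:
    """Fraction of the domain's scenarios that belong to the given tier."""
    return domain.scenarios_by_tier[tier] / domain.total_scenarios


# Counts taken from the Foo Bar bullet: 13 tier-1, 18 tier-2, 5 tier-3.
FOO_BAR = DomainSummary(
    name="Foo Bar (Dummy)",
    scenarios_by_tier={1: 13, 2: 18, 3: 5},
)

if __name__ == "__main__":
    print(FOO_BAR.total_scenarios)
    print(tier_share(FOO_BAR, 2))
```

The per-tier counts sum to the stated 36 scenarios, and tier-2 makes up half of them.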

Coming Soon

Future releases will add production-realistic domains:

Domain                   Focus                                                             Status
B2B SaaS                 Product usage events, subscription lifecycle, feature adoption    Planned for v0.2
B2C SaaS                 User activity streams, content interactions, session data         Planned for v0.2
UGC                      Posts, comments, reactions, moderation signals                    Planned for v0.2
E-commerce               Orders, inventory, catalog, customer behavior                     Planned
Advertising              Impressions, clicks, conversions, bid data                        Planned
Consumption-Based Infra  API calls, compute usage, storage metering                        Planned
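
If you want to track the roadmap programmatically, the rows above could be held as plain records and filtered by status. This is a minimal sketch: `PlannedDomain`, `ROADMAP`, and `domains_with_status` are hypothetical names, not something DEC Bench provides. The row values are copied from the table.

```python
from typing import NamedTuple


class PlannedDomain(NamedTuple):
    """Hypothetical row type mirroring the roadmap table's three columns."""

    name: str
    focus: str
    status: str


# Rows copied from the roadmap table on this page.
ROADMAP = [
    PlannedDomain("B2B SaaS", "Product usage events, subscription lifecycle, feature adoption", "Planned for v0.2"),
    PlannedDomain("B2C SaaS", "User activity streams, content interactions, session data", "Planned for v0.2"),
    PlannedDomain("UGC", "Posts, comments, reactions, moderation signals", "Planned for v0.2"),
    PlannedDomain("E-commerce", "Orders, inventory, catalog, customer behavior", "Planned"),
    PlannedDomain("Advertising", "Impressions, clicks, conversions, bid data", "Planned"),
    PlannedDomain("Consumption-Based Infra", "API calls, compute usage, storage metering", "Planned"),
]


def domains_with_status(status: str) -> list[str]:
    """Return the names of domains whose status matches exactly, in table order."""
    return [d.name for d in ROADMAP if d.status == status]


if __name__ == "__main__":
    print(domains_with_status("Planned for v0.2"))
```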

How To Use These Pages

Start with the domain that best matches your workload profile, then map its constraints to competency evaluations under Competencies.