Registry

Browse, search, and filter benchmark scenarios and harnesses.

Use this registry to discover evaluation scenarios by domain, competency, feature, tier, and starting-state model.

Each scenario has a unique ID and maps to a realistic data engineering workflow documented in the eval criteria pages.
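As a rough illustration of filtering by tier, starting state, and harness, the sketch below models a few entries in memory. The `Scenario` record shape, the `filter_scenarios` helper, and the sample IDs are all hypothetical; the registry's actual schema and any programmatic interface are not shown on this page.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple


@dataclass(frozen=True)
class Scenario:
    # Hypothetical record shape; field names are assumptions, not the registry's schema.
    id: str
    tier: int
    starting_state: str  # "Broken / Incomplete" or "Clean / Greenfield"
    harnesses: Tuple[str, ...]
    tasks: int = 1


def filter_scenarios(
    scenarios: Iterable[Scenario],
    tier: Optional[int] = None,
    starting_state: Optional[str] = None,
    harness: Optional[str] = None,
) -> List[Scenario]:
    """Return scenarios matching every filter that is not None."""
    matches = []
    for s in scenarios:
        if tier is not None and s.tier != tier:
            continue
        if starting_state is not None and s.starting_state != starting_state:
            continue
        if harness is not None and harness not in s.harnesses:
            continue
        matches.append(s)
    return matches


ALL_HARNESSES = ("base-rt", "classic-de", "olap-for-swe")

# Illustrative entries only; the IDs are made up for this sketch.
SAMPLE = [
    Scenario("example-ttl-table", 1, "Clean / Greenfield", ALL_HARNESSES),
    Scenario("example-cdc-pipeline", 2, "Clean / Greenfield", ALL_HARNESSES),
    Scenario("example-reconciliation", 3, "Broken / Incomplete", ALL_HARNESSES),
]

if __name__ == "__main__":
    for s in filter_scenarios(SAMPLE, starting_state="Clean / Greenfield"):
        print(s.id, "tier", s.tier)
```

Filters combine with AND semantics, so passing both `tier` and `starting_state` narrows the result rather than widening it.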

36 scenarios
Fix an analytics HTTP API that fails or returns inconsistent results under concurrent requests. The API uses a single shared ClickHouse connection that is not thread-safe.
Foo Bar (Dummy) | Tier 2 | Broken / Incomplete | 1 task
Harnesses: base-rt, classic-de, olap-for-swe
Diagnose and fix a misconfigured Postgres setup: wrong connection string, partial init script, and hardcoded credentials in a Python script.
Foo Bar (Dummy) | Tier 1 | Broken / Incomplete | 1 task
Harnesses: base-rt, classic-de, olap-for-swe
Build a change-data-capture pipeline from Postgres to Redpanda: capture table changes and stream them to a Kafka topic for downstream consumers.
Foo Bar (Dummy) | Tier 2 | Clean / Greenfield | 1 task
Harnesses: base-rt, classic-de, olap-for-swe
Fix a broken historical backfill pipeline. The target table has incorrect partition or order keys that cause slow inserts and failed backfills for historical date ranges.
Foo Bar (Dummy) | Tier 2 | Broken / Incomplete | 1 task
Harnesses: base-rt, classic-de, olap-for-swe
Add a new column to a live ClickHouse table with existing data. The migration must not block reads/writes and must backfill the new column for existing rows.
Foo Bar (Dummy) | Tier 2 | Broken / Incomplete | 1 task
Harnesses: base-rt, classic-de, olap-for-swe
Fix a broken ClickHouse materialized view that fails to refresh. The MV definition has incorrect dependencies and does not match the source table schema.
Foo Bar (Dummy) | Tier 2 | Broken / Incomplete | 1 task
Harnesses: base-rt, classic-de, olap-for-swe
A ClickHouse table has a suboptimal ORDER BY that causes full scans for common query patterns. Redesign the ORDER BY key to match filter columns (region, event_ts) and improve query latency.
Foo Bar (Dummy) | Tier 1 | Broken / Incomplete | 1 task
Harnesses: base-rt, classic-de, olap-for-swe
Design a ClickHouse table with TTL so that raw events older than 90 days are automatically dropped. Create the table and verify the TTL expression is correct.
Foo Bar (Dummy) | Tier 1 | Clean / Greenfield | 1 task
Harnesses: base-rt, classic-de, olap-for-swe
Fix inconsistent metric definitions across ClickHouse and Postgres. The same metric (e.g. daily_active_users) is computed differently in each store, causing reporting mismatches.
Foo Bar (Dummy) | Tier 2 | Broken / Incomplete | 1 task
Harnesses: base-rt, classic-de, olap-for-swe
Diagnose and fix a Kafka consumer that cannot keep up with producer throughput. Consumer lag is growing; optimize batch size, parallelism, or fetch config so the sink table catches up.
Foo Bar (Dummy) | Tier 1 | Broken / Incomplete | 1 task
Harnesses: base-rt, classic-de, olap-for-swe
Fix broken reconciliation between Postgres, Redpanda, and ClickHouse. Build a reconciliation job that detects and reports drift (missing rows, count mismatches).
Foo Bar (Dummy) | Tier 3 | Broken / Incomplete | 1 task
Harnesses: base-rt, classic-de, olap-for-swe
Load five messy CSV files into clean ClickHouse tables. Handle inconsistent dates, mixed null representations, duplicate headers, and trailing delimiters.
Foo Bar (Dummy) | Tier 1 | Clean / Greenfield | 1 task
Harnesses: base-rt, classic-de, olap-for-swe
Create a semantic layer with derived and composite metrics. Build materialized views or a metrics table that computes conversion_rate, avg_order_value, and daily_active_users from raw events.
Foo Bar (Dummy) | Tier 1 | Clean / Greenfield | 1 task
Harnesses: base-rt, classic-de, olap-for-swe
Configure a dead-letter queue (DLQ) for a Kafka consumer that fails on malformed messages. The consumer currently drops failed messages; add a DLQ topic and wire the consumer to route failures there.
Foo Bar (Dummy) | Tier 1 | Broken / Incomplete | 1 task
Harnesses: base-rt, classic-de, olap-for-swe
Fix a broken event-sourcing setup: the event log in Redpanda is out of sync with a materialized view in Postgres. Replay events to restore consistency.
Foo Bar (Dummy) | Tier 3 | Broken / Incomplete | 1 task
Harnesses: base-rt, classic-de, olap-for-swe
Diagnose and fix a broken end-to-end pipeline: Postgres CDC to Redpanda to ClickHouse. There are multiple failure points, including a misconfigured connector, a wrong topic, and a missing ClickHouse table.
Foo Bar (Dummy) | Tier 3 | Broken / Incomplete | 1 task
Harnesses: base-rt, classic-de, olap-for-swe
Build a pipeline from Postgres to ClickHouse that produces identical results whether run once or three times. Handle updates to existing rows by primary key.
Foo Bar (Dummy) | Tier 1 | Clean / Greenfield | 1 task
Harnesses: base-rt, classic-de, olap-for-swe
Build a complete pipeline: ingest raw product-interaction events from Postgres, transform and load into ClickHouse, and expose a lightweight HTTP API that serves pre-aggregated insights (top products, conversion funnel, hourly trends).
Foo Bar (Dummy) | Tier 2 | Clean / Greenfield | 3 tasks
Harnesses: base-rt, classic-de, olap-for-swe
Fix a failed backfill of a new metric definition. The backfill job partially ran, left orphaned data, and the metric table has gaps or duplicates. Recover and complete the backfill idempotently.
Foo Bar (Dummy) | Tier 2 | Broken / Incomplete | 1 task
Harnesses: base-rt, classic-de, olap-for-swe
Build a query builder API that supports filtering and grouping by multiple dimensions (date, region, product). Choose appropriate storage (ClickHouse vs Postgres) per data type and expose a unified query interface.
Foo Bar (Dummy) | Tier 2 | Clean / Greenfield | 1 task
Harnesses: base-rt, classic-de, olap-for-swe
Fix a broken migration from Postgres (OLTP) to ClickHouse (OLAP). The sync job fails, the schema is mismatched, and the incremental load is not working. Restore the migration pipeline.
Foo Bar (Dummy) | Tier 3 | Broken / Incomplete | 1 task
Harnesses: base-rt, classic-de, olap-for-swe
Build a paginated analytics API that serves large result sets from ClickHouse with cursor-based or offset pagination. Support page size and consistent ordering.
Foo Bar (Dummy) | Tier 2 | Clean / Greenfield | 1 task
Harnesses: base-rt, classic-de, olap-for-swe
Add indexes to a Postgres table with 500k rows. Queries filtering by user_id and created_at are slow; create appropriate indexes to meet latency targets.
Foo Bar (Dummy) | Tier 1 | Broken / Incomplete | 1 task
Harnesses: base-rt, classic-de, olap-for-swe
Fix a startup race: the app connects to Postgres immediately and fails when the database is not yet ready. Add retry/backoff logic so the app waits for Postgres to be ready.
Foo Bar (Dummy) | Tier 1 | Broken / Incomplete | 1 task
Harnesses: base-rt, classic-de, olap-for-swe
Fix a large unpartitioned Postgres table that is slow for date-range queries. Partition by date and migrate existing data.
Foo Bar (Dummy) | Tier 2 | Broken / Incomplete | 1 task
Harnesses: base-rt, classic-de, olap-for-swe
Add data quality checks to a working pipeline. A chaos script injects nulls, duplicates, schema drift, and stale timestamps. The agent must detect all four problems without false positives.
Foo Bar (Dummy) | Tier 1 | Broken / Incomplete | 1 task
Harnesses: base-rt, classic-de, olap-for-swe
Fix a broken realtime metrics pipeline: Redpanda streams metrics, but the ClickHouse materialized view is stale or failing. Restore sub-second aggregation from stream to OLAP.
Foo Bar (Dummy) | Tier 3 | Broken / Incomplete | 1 task
Harnesses: base-rt, classic-de, olap-for-swe
Add a column to a synthetic dimension table, create the corresponding ClickHouse target, build a cross-database view, and verify old queries still work.
Foo Bar (Dummy) | Tier 1 | Clean / Greenfield | 1 task
Harnesses: base-rt, classic-de, olap-for-swe
An HTTP API serves analytical queries from ClickHouse but responses take 2+ seconds. Optimize the queries, add materialized views, or pre-aggregate so /api/metrics and /api/breakdown return in under 200ms.
Foo Bar (Dummy) | Tier 1 | Broken / Incomplete | 1 task
Harnesses: base-rt, classic-de, olap-for-swe
Rewrite five slow analytical queries against a 10M-row ClickHouse table to run under latency thresholds without changing result sets.
Foo Bar (Dummy) | Tier 1 | Broken / Incomplete | 1 task
Harnesses: base-rt, classic-de, olap-for-swe
Fix a broken Redpanda topic schema: add a new optional field to the event schema while maintaining backward compatibility for existing consumers.
Foo Bar (Dummy) | Tier 2 | Broken / Incomplete | 1 task
Harnesses: base-rt, classic-de, olap-for-swe
Consume events from a Redpanda topic and load them into ClickHouse for analytical queries. Build a streaming-to-OLAP pipeline with appropriate table design.
Foo Bar (Dummy) | Tier 2 | Clean / Greenfield | 1 task
Harnesses: base-rt, classic-de, olap-for-swe
Redesign a naive ClickHouse table with 5M rows to serve three representative query patterns. Set partition keys, order keys, and compression for target latencies.
Foo Bar (Dummy) | Tier 1 | Clean / Greenfield | 1 task
Harnesses: base-rt, classic-de, olap-for-swe
Fix broken time-grain rollup tables (hourly, daily, monthly) that have incorrect aggregations or missing grains. The rollups should support dashboard queries at different time resolutions.
Foo Bar (Dummy) | Tier 2 | Broken / Incomplete | 1 task
Harnesses: base-rt, classic-de, olap-for-swe
Fix a misconfigured Kafka topic that causes consumer rebalance storms. The topic has a single partition with high throughput; consumers cannot parallelize and lag grows. Correct the partition count and replication factor.
Foo Bar (Dummy) | Tier 1 | Broken / Incomplete | 1 task
Harnesses: base-rt, classic-de, olap-for-swe
Build a three-layer transformation from raw JSON events in Postgres to a sessionized, aggregated daily mart in ClickHouse.
Foo Bar (Dummy) | Tier 1 | Clean / Greenfield | 1 task
Harnesses: base-rt, classic-de, olap-for-swe

Contribute New Scenarios

Scaffold a scenario with dec-bench create, register it with dec-bench registry add, then open a PR with dec-bench registry publish.