Overview

This page describes the evaluation competencies used in DEC Bench.

Competencies define the skill levels and proficiencies that agents are measured against.

DEC Bench competencies are grounded in:

  • the data engineering lifecycle and its undercurrents, from Fundamentals of Data Engineering (Reis and Housley)
  • the reliability, scalability, maintainability, and distributed-systems trade-offs, from Designing Data-Intensive Applications (Kleppmann)

Use these competencies to evaluate how well an agent reasons about design trade-offs, not just whether it can produce syntactically correct code.

Competency Map

How to Use These Competencies

  • Select the competency most relevant to the task.
  • Evaluate agent output against that competency's strong and weak signals.
  • Use prompts that force explicit trade-offs and constraints.
  • Score both technical correctness and quality of reasoning.
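
The steps above can be sketched as a small scoring helper. This is a minimal illustration, not DEC Bench's scoring method: the `Competency` fields, the `score_output` function, the example signal strings, and the equal weighting of correctness and reasoning are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Competency:
    # Hypothetical rubric entry; field names are illustrative, not DEC Bench's schema.
    name: str
    strong_signals: list[str] = field(default_factory=list)
    weak_signals: list[str] = field(default_factory=list)


def score_output(
    competency: Competency,
    observed: set[str],
    correctness: float,
    reasoning_weight: float = 0.5,  # assumed weighting, not prescribed by DEC Bench
) -> dict[str, float]:
    """Combine a correctness score in [0, 1] with a signal-based reasoning score."""
    # Count which rubric signals a reviewer marked as present in the agent output.
    strong_hits = sum(s in observed for s in competency.strong_signals)
    weak_hits = sum(s in observed for s in competency.weak_signals)

    # Normalize by rubric size; max(..., 1) avoids division by zero on empty lists.
    strong_share = strong_hits / max(len(competency.strong_signals), 1)
    weak_share = weak_hits / max(len(competency.weak_signals), 1)

    # Weak signals subtract from strong ones; the result is floored at 0.
    reasoning = max(strong_share - weak_share, 0.0)
    combined = (1 - reasoning_weight) * correctness + reasoning_weight * reasoning
    return {"correctness": correctness, "reasoning": reasoning, "combined": combined}


# Illustrative signals only; real signals come from the selected competency.
competency = Competency(
    name="Example competency",
    strong_signals=["states delivery guarantee", "justifies partitioning choice"],
    weak_signals=["ignores late data", "no failure handling"],
)
result = score_output(
    competency,
    observed={"states delivery guarantee"},
    correctness=1.0,
)
print(result)
```

Keeping `correctness` and `reasoning` as separate fields alongside the combined value makes it visible when an agent produces working code but gives little evidence of having weighed the trade-offs.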