Environment Setup
Evaluate an agent's ability to configure infrastructure, dependencies, and toolchains from scratch.
What this competency is
Bootstrapping a working data engineering environment from an incomplete or broken state. This includes installing dependencies, configuring services, wiring connection strings, and verifying that the stack is healthy before writing any application logic.
Why it matters
Production data engineering starts long before the first SQL query. Agents that cannot stand up infrastructure, resolve dependency conflicts, or debug misconfigured services will stall on real-world tasks. Environment setup is the gate that determines whether any downstream work is even possible.
What to evaluate in agents
- Ability to read error messages and diagnose missing or misconfigured dependencies.
- Strategy for installing and pinning package versions across Python, Node.js, and system packages.
- Configuration of database connections, broker endpoints, and environment variables.
- Health-checking services before proceeding (e.g., pg_isready, HTTP pings, broker connectivity).
- Awareness of filesystem layout, permissions, and working directory conventions.
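The health-checking item above can be illustrated with a minimal sketch that polls a TCP port until something accepts a connection. The function name `wait_for_port` and its parameters are hypothetical and chosen for illustration. A successful connect only shows that a listener exists, so a service-specific probe such as pg_isready remains the stronger check.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0,
                  interval: float = 0.5, connect_timeout: float = 1.0) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect proves something is listening,
            # not that the service is ready to serve requests.
            with socket.create_connection((host, port), timeout=connect_timeout):
                return True
        except OSError:
            # Connection refused, unreachable, or timed out: retry until the deadline.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

An agent might call `wait_for_port("localhost", 5432)` before running migrations, assuming Postgres listens on its default port, and stop to read the service logs if the call returns False.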
Strong signals
- Reads existing configuration files before overwriting them.
- Pins dependency versions to avoid non-deterministic installs.
- Validates service health before moving to the next step.
- Uses environment variables for connection strings instead of hardcoding.
- Fixes failures one step at a time rather than rerunning the entire setup.
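As a sketch of the environment-variable signal above, the helper below builds a Postgres connection URL from the environment instead of hardcoding it. It assumes the `PG*` variable names that libpq recognizes; a given project may use different names, and `postgres_dsn` itself is a hypothetical helper.

```python
import os
from urllib.parse import quote


def postgres_dsn(env=None) -> str:
    """Build a postgresql:// URL from PG* environment variables."""
    env = os.environ if env is None else env
    required = ("PGUSER", "PGPASSWORD", "PGDATABASE")
    missing = [key for key in required if not env.get(key)]
    if missing:
        # Fail early with a clear message instead of connecting with bad defaults.
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    host = env.get("PGHOST", "localhost")
    port = env.get("PGPORT", "5432")
    # Percent-encode credentials so characters like "@" or "/" do not break the URL.
    user = quote(env["PGUSER"], safe="")
    password = quote(env["PGPASSWORD"], safe="")
    return f"postgresql://{user}:{password}@{host}:{port}/{env['PGDATABASE']}"
```

Passing a plain dict as `env` keeps the function easy to test without touching the real process environment.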
Weak signals
- Installs packages without checking what is already available.
- Ignores service health checks and proceeds blindly.
- Hardcodes hostnames, ports, or credentials in application code.
- Treats setup failures as application bugs rather than infrastructure issues.
- Skips reading logs or error output when a service fails to start.
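The first weak signal, installing without checking what is already available, has a simple counterpart. The sketch below uses the standard-library `importlib.metadata` to report which distributions are already installed in the current Python environment. The helper name `installed_versions` is hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # Not installed in this environment.
            found[name] = None
    return found
```

Calling `installed_versions(["dbt-core", "dbt-postgres"])` before running an install shows whether a package is missing or merely at an unexpected version.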
Example evaluation prompts
- "The Postgres container is running but the agent's connection is refused. Diagnose and fix."
- "Set up a Python virtualenv with dbt-core, dbt-postgres, and apache-airflow pinned to compatible versions."
- "Configure supervisord to start Postgres, ClickHouse, and Redpanda, then verify all three are healthy."
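For the virtualenv prompt, one way to check the "pinned" requirement is to scan a requirements file for lines without an exact `==` pin. This is a rough sketch: the helper name is hypothetical, the regular expression covers only simple `name==version` lines, and exact pins alone do not prove that the chosen versions are compatible with each other.

```python
import re

# Matches "name==version" with optional extras, e.g. "package[extra]==1.0.0".
PINNED = re.compile(r"^[A-Za-z0-9][A-Za-z0-9._-]*(\[[^\]]+\])?\s*==\s*[^\s;]+")


def unpinned_requirements(text: str):
    """Return requirement lines that lack an exact == pin."""
    loose = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and surrounding whitespace
        if not line or line.startswith("-"):  # skip blanks and pip options such as -r or -e
            continue
        if not PINNED.match(line):
            loose.append(line)
    return loose
```

An empty result means every requirement line carries an exact pin. The agent still needs to install the set and confirm that the resolver accepts it.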