Writing Gate Assertions

This page explains the difference between core and scenario-specific assertions, what belongs in each gate, and patterns drawn from existing scenarios.

Core vs scenario assertions

Every scenario is evaluated by two layers of assertions:

Core assertions are built into the framework and run for every scenario automatically. You do not write these.

Gate | Core assertions
Functional | process_exits_clean (exit code 0), no_unhandled_errors (no tracebacks or panics in session log)
Robust | idempotent_rerun (scenario can be rerun without errors)
Production | uses_env_vars, no_secrets_in_code, no_dead_code_markers, no_debug_artifacts, zero_compiler_errors, zero_lint_errors, has_type_safety, functions_are_focused, no_deep_nesting, and others

The Correct and Performant gates have no core assertions. Every check in these two gates is scenario-specific.

Scenario assertions are what you write in assertions/. They test whether the agent solved your specific problem. Each gate file exports named async functions that the framework discovers and runs.
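To make "named async functions that the framework discovers and runs" concrete, here is a minimal sketch of what discovery could look like. The AssertionContext and AssertionResult shapes, the gateExports object, and the runGate function are all assumptions made for illustration; the framework's real types and runner may differ.

```typescript
// Hypothetical shapes; the framework's real types may differ.
type AssertionResult = { passed: boolean; message: string };
type AssertionContext = { scenario: string };
type Assertion = (ctx: AssertionContext) => Promise<AssertionResult>;

// Stand-in for the named exports of one gate file.
const gateExports: Record<string, Assertion> = {
  async output_is_present(ctx) {
    return { passed: true, message: `${ctx.scenario}: output present` };
  },
  async output_is_valid(ctx) {
    return { passed: false, message: `${ctx.scenario}: output failed validation` };
  },
};

// Runs every function in the gate and keys each result by its function name.
async function runGate(
  gate: Record<string, Assertion>,
  ctx: AssertionContext,
): Promise<Record<string, AssertionResult>> {
  const results: Record<string, AssertionResult> = {};
  for (const [name, fn] of Object.entries(gate)) {
    results[name] = await fn(ctx);
  }
  return results;
}
```

Because results are keyed by function name in this sketch, the name doubles as the label in a report, which is one reason to choose descriptive names.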

Guidelines

These apply across all gates:

  • One check per function. A function named table_has_rows should not also verify schema.
  • Make failure messages actionable. "Expected 15 rows, got 12" is useful. "Assertion failed" is not.
  • Avoid side effects. Assertions read state; they do not modify it. The robust gate's idempotency check is the exception.
  • Keep assertions fast. Long-running assertions slow down the feedback loop.
  • Use the shared helpers in scenarios/_shared/assertion-helpers.ts for common checks such as scanning for hardcoded connection strings and SELECT * queries.
  • Browse existing scenario assertions for more patterns.
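The first two guidelines can be sketched as a pair of functions: one checks row count, the other checks schema, and each reports what it expected and what it found. The TableSnapshot type and both function signatures are hypothetical; real assertions take the framework's AssertionContext instead.

```typescript
// Hypothetical shapes for illustration; not the framework's real types.
type AssertionResult = { passed: boolean; message: string };
type TableSnapshot = { rowCount: number; columns: string[] };

// Checks row count only. Schema is left to a separate function.
async function table_has_rows(t: TableSnapshot, expected: number): Promise<AssertionResult> {
  const passed = t.rowCount === expected;
  return {
    passed,
    message: passed ? `Found ${expected} rows` : `Expected ${expected} rows, got ${t.rowCount}`,
  };
}

// Checks schema only, and names the missing columns on failure.
async function table_has_required_columns(
  t: TableSnapshot,
  required: string[],
): Promise<AssertionResult> {
  const missing = required.filter((c) => !t.columns.includes(c));
  const passed = missing.length === 0;
  return {
    passed,
    message: passed ? "All required columns present" : `Missing columns: ${missing.join(", ")}`,
  };
}
```

Splitting the checks this way means a failing report points at one cause, not at a function that could have failed for either reason.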

Gate 1: Functional

Check that the agent produced the expected artifacts. These are typically existence checks.

  • Target table exists in the database
  • Expected files are present
  • Basic row count is non-zero

Example from the foo-bar-csv-ingest scenario:
export async function target_table_exists(ctx: AssertionContext): Promise<AssertionResult> {
  // SELECT count() FROM system.tables WHERE database = 'analytics' AND name = 'events'
}

View full assertion file from the foo-bar-csv-ingest scenario
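The excerpt above shows only the query. A fuller version might look like the sketch below, which assumes the context exposes a query method returning rows as plain objects. That method, the Row type, and the AssertionResult fields are assumptions for illustration, not the framework's confirmed API.

```typescript
type AssertionResult = { passed: boolean; message: string };
type Row = Record<string, unknown>;
// Hypothetical context: assumes the framework exposes some way to run SQL.
type AssertionContext = { query: (sql: string) => Promise<Row[]> };

export async function target_table_exists(ctx: AssertionContext): Promise<AssertionResult> {
  const rows = await ctx.query(
    "SELECT count() AS n FROM system.tables WHERE database = 'analytics' AND name = 'events'",
  );
  // Treat an empty result the same as a count of zero.
  const n = Number(rows[0]?.n ?? 0);
  return n === 1
    ? { passed: true, message: "Table analytics.events exists" }
    : { passed: false, message: `Expected table analytics.events, found ${n} matching tables` };
}
```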

Gate 2: Correct

Validate that the data is accurate. These checks use exact counts, checksums, or value comparisons.

  • Exact row counts match expected totals
  • Checksums match between source and target
  • No null values in required fields
  • Date ranges are valid
  • Cross-system counts agree

Example from the foo-bar-csv-ingest scenario:
export async function all_fifteen_events_loaded(ctx: AssertionContext): Promise<AssertionResult> {
  // SELECT count() FROM analytics.events -- expects exactly 15
}

Example from the ecommerce-pipeline-recovery scenario:
export async function revenue_checksum(ctx: AssertionContext): Promise<AssertionResult> {
  // Compare SUM(amount) between Postgres source and ClickHouse target
  // Pass when Math.abs(sourceSum - targetSum) < 0.001
}

Example assertions: foo-bar-csv-ingest scenario | ecommerce-pipeline-recovery scenario
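The checksum comparison can be written out as the sketch below. The ChecksumContext type and its two callbacks are hypothetical stand-ins for however the scenario reaches its source and target databases; only the 0.001 tolerance comes from the excerpt above.

```typescript
type AssertionResult = { passed: boolean; message: string };
// Hypothetical context: two callbacks standing in for source and target clients.
type ChecksumContext = {
  sourceSum: () => Promise<number>; // e.g. SUM(amount) in the Postgres source
  targetSum: () => Promise<number>; // e.g. SUM(amount) in the ClickHouse target
};

// Absolute tolerance for floating-point differences between the two systems.
const TOLERANCE = 0.001;

export async function revenue_checksum(ctx: ChecksumContext): Promise<AssertionResult> {
  const [source, target] = await Promise.all([ctx.sourceSum(), ctx.targetSum()]);
  const diff = Math.abs(source - target);
  const passed = diff < TOLERANCE;
  return {
    passed,
    message: passed
      ? `Revenue sums agree within ${TOLERANCE}`
      : `Revenue mismatch: source ${source}, target ${target}, diff ${diff}`,
  };
}
```

A tolerance is used here because sums of decimal amounts computed by two different engines can differ in the last few floating-point digits even when every row matches.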

Gate 3: Robust

Test that the solution is resilient. These checks look for common data problems and verify structural quality.

  • No duplicate header rows in loaded data
  • No data loss under volume
  • Schema has primary key constraints
  • Required columns are present

Many scenarios also use shared helpers from scenarios/_shared/assertion-helpers.ts:

  • scanWorkspaceForHardcodedConnections(): scans agent-written code for hardcoded connection strings
  • avoidsSelectStarQueries(): scans for SELECT * patterns

Example: shared helper usage
import { scanWorkspaceForHardcodedConnections } from "../../_shared/assertion-helpers";

export async function no_hardcoded_connection_strings(): Promise<AssertionResult> {
  return scanWorkspaceForHardcodedConnections();
}

Example assertions: foo-bar-csv-ingest scenario

See also: shared assertion helpers
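The "no duplicate header rows" check from the list above could be sketched as follows. It assumes the loaded rows have already been fetched as string-valued objects keyed by column name; the Row type, the isHeaderRow helper, and the function signature are illustrative assumptions, not taken from an existing scenario.

```typescript
type AssertionResult = { passed: boolean; message: string };
// Hypothetical row shape: every column read back as a string.
type Row = Record<string, string>;

// A header row that slipped into the data has every value equal to its column name.
function isHeaderRow(row: Row): boolean {
  const keys = Object.keys(row);
  return keys.length > 0 && keys.every((k) => row[k] === k);
}

export async function no_duplicate_header_rows(rows: Row[]): Promise<AssertionResult> {
  const count = rows.filter(isHeaderRow).length;
  return count === 0
    ? { passed: true, message: "No header rows found in loaded data" }
    : { passed: false, message: `Found ${count} header row(s) loaded as data` };
}
```

This situation typically arises when several CSV files are concatenated and only the first header is skipped during ingest.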

Gate 4: Performant

Measure query latency and pipeline throughput against thresholds. Thresholds vary by scenario.

  • Query execution stays under the scenario's threshold, typically 100 to 200 ms
  • No SELECT * in production queries

Example from the foo-bar-slow-queries scenario:
export async function scan_query_under_100ms(ctx: AssertionContext): Promise<AssertionResult> {
  // Time the query execution, check elapsed < 100
}

Example assertions: foo-bar-slow-queries scenario
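A latency check reduces to timing a call and comparing the elapsed time to a threshold. The sketch below assumes a hypothetical runScanQuery callback on the context in place of the scenario's real query, and the runs_under helper is likewise invented for illustration.

```typescript
type AssertionResult = { passed: boolean; message: string };

// Times one call of `run` and compares elapsed wall-clock milliseconds to a threshold.
async function runs_under(
  run: () => Promise<unknown>,
  thresholdMs: number,
): Promise<AssertionResult> {
  const start = Date.now();
  await run();
  const elapsed = Date.now() - start;
  const passed = elapsed < thresholdMs;
  return {
    passed,
    message: passed
      ? `Query took ${elapsed}ms (limit ${thresholdMs}ms)`
      : `Query took ${elapsed}ms, expected under ${thresholdMs}ms`,
  };
}

// Hypothetical context: runScanQuery stands in for executing the scenario's query.
export async function scan_query_under_100ms(
  ctx: { runScanQuery: () => Promise<unknown> },
): Promise<AssertionResult> {
  return runs_under(ctx.runScanQuery, 100);
}
```

A single measurement is sensitive to noise such as cold caches. Timing several runs and comparing the median is one way to make the check steadier, at the cost of a slower assertion.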

Gate 5: Production

Check production-readiness beyond correctness. Most of these are handled by core assertions, but scenarios can add their own.

  • Connection environment variables are available
  • No temporary or test tables left behind
  • Documentation exists
  • Data grain is enforced (e.g. daily grain)

Example from the ecommerce-pipeline-recovery scenario:
export async function no_temporary_tables(ctx: AssertionContext): Promise<AssertionResult> {
  // Count tables whose name matches LIKE '%tmp%' -- expects 0
}

Example assertions: ecommerce-pipeline-recovery scenario
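The temporary-table check can be sketched without committing to a particular database client. The listTables callback below is a hypothetical stand-in for a query against the database's table catalog, and the AssertionResult fields are assumed.

```typescript
type AssertionResult = { passed: boolean; message: string };
// Hypothetical context: assumes table names can be listed.
type AssertionContext = { listTables: () => Promise<string[]> };

export async function no_temporary_tables(ctx: AssertionContext): Promise<AssertionResult> {
  const tables = await ctx.listTables();
  // Same idea as LIKE '%tmp%': flag any table whose name contains "tmp".
  const leftovers = tables.filter((name) => name.includes("tmp"));
  return leftovers.length === 0
    ? { passed: true, message: "No temporary tables left behind" }
    : { passed: false, message: `Temporary tables left behind: ${leftovers.join(", ")}` };
}
```

Listing the offending table names in the failure message tells the reader exactly what to clean up.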