# Runtime Closed-Loop E2E Standard

This standard owns the first closed-loop validation model for the AI engineering
runtime. It complements `ai/standards/local-dev-runtime.md`,
`ai/standards/operator-questions.md`, and
`ai/standards/trusted-operator-daemon.md`; do not duplicate their detailed
runtime policy here.

## Test Levels

- `fixture-only`: deterministic `/tmp` state, simulated Telegram decisions,
  fixture git repos, fixture supervisor status, and no live Telegram or trusted
  tmux requirement. These tests must run in ordinary CI/sandbox contexts.
- `trusted-runtime`: canonical tmux sessions, trusted daemon, runtime
  supervisor, managed dev servers, and scorecard truth from the trusted runtime.
- `live-telegram`: real bridge transport tests. These are optional/manual unless
  bridge configuration is present and healthy.
- `browser-screenshot`: managed server URL plus canonical Playwright path with
  screenshots written to `/tmp`.
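
As a rough illustration of how a runner might decide which levels apply in a given environment, the sketch below always emits `fixture-only` and gates the other levels on environment signals. The variable names `TRUSTED_RUNTIME`, `BRIDGE_HEALTHY`, and `MANAGED_SERVER_URL`, and the function `runnable_levels`, are assumptions made for this example; this standard does not define them.

```sh
#!/bin/sh
# Hypothetical inputs (assumptions for this sketch, not defined by this standard):
#   TRUSTED_RUNTIME=1    canonical tmux, trusted daemon, and supervisor are available
#   BRIDGE_HEALTHY=1     bridge configuration is present and healthy
#   MANAGED_SERVER_URL   URL of a managed dev server for screenshot tests
runnable_levels() {
  # fixture-only must run everywhere, including ordinary CI/sandbox contexts.
  echo "fixture-only"
  [ "${TRUSTED_RUNTIME:-0}" = "1" ] && echo "trusted-runtime"
  [ "${BRIDGE_HEALTHY:-0}" = "1" ] && echo "live-telegram"
  [ -n "${MANAGED_SERVER_URL:-}" ] && echo "browser-screenshot"
  return 0
}
```

In a plain CI sandbox with none of these variables set, the function prints only `fixture-only`, which matches the requirement that fixture-only tests never depend on live Telegram or trusted tmux.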

## Continue/Stop Rule

Continue the orchestration only when all of these are true:

- scorecard has no unresolved stale/unconsumed approval state.
- required fixture-only E2E tests pass.
- acceptance criteria and QA review are green.
- missing-action registry has no open P0/P1 action that blocks the current run.
- trusted-runtime status is healthy or any degraded area is explicitly
  non-blocking for the current scope.

Stop and write a handoff when any of these are true:

- an ambiguity needs resolving or a strategy decision is required.
- safety/security boundary is unclear.
- a repeated failure or failed E2E cannot be fixed within scope.
- a required registered action is missing.
- the runtime supervisor or trusted daemon is needed but unavailable.
- stale/unconsumed operator question or Telegram decision state remains.
- approved-but-unexecuted or stale approved-action state remains without an
  explicit resume/fresh-approval decision according to
  `ai/standards/operator-questions.md`.
- final summary cannot be generated in the canonical
  `Details -> Good -> Bad -> Ugly -> Validation -> Next` shape.

The stop reason must be recorded in the chunk handoff and, at run boundaries,
in the Telegram/local run summary.
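
The continue conditions above can be sketched as a single gate that reports the first failing condition as the stop reason. This is a minimal sketch, not an existing tool: the function name `decide_continue`, its argument order, and the `ok` convention are assumptions for illustration. The caller is assumed to pass `ok` for the trusted-runtime slot when the runtime is healthy or when every degraded area is explicitly non-blocking for the current scope.

```sh
#!/bin/sh
# Hypothetical gate. Each argument is "ok" when the matching continue
# condition holds; any other value is treated as a stop reason.
# Argument order: approvals, fixture_e2e, acceptance_qa, missing_actions,
# trusted_runtime.
decide_continue() {
  for pair in \
    "approvals:$1" \
    "fixture_e2e:$2" \
    "acceptance_qa:$3" \
    "missing_actions:$4" \
    "trusted_runtime:$5"
  do
    name=${pair%%:*}   # text before the first colon
    state=${pair#*:}   # text after the first colon
    if [ "$state" != "ok" ]; then
      # The first failing condition becomes the stop reason to record
      # in the chunk handoff.
      printf 'stop: %s=%s\n' "$name" "$state"
      return 1
    fi
  done
  printf 'continue\n'
}
```

For example, `decide_continue ok failed ok ok ok` prints `stop: fixture_e2e=failed` and returns a non-zero status, so a calling script can write the handoff instead of starting the next chunk.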

## Timeline

Closed-loop tests may write a JSONL timeline as compact evidence. The
canonical runtime timeline is rendered by
`ai/tools/action-timeline/list.sh`; do not create parallel timeline renderers.
Fixture timelines must live under `/tmp` during tests. Event types should be
stable and human-readable, for example:

- `test`
- `tool_command`
- `question`
- `approval`
- `daemon_action`
- `supervisor`
- `failure_path`
- `recovery`
- `summary`

Use `ai/tools/runtime-e2e/timeline.sh` for fixture timelines. Do not build a UI
or telemetry system around this until the file-based loop proves insufficient.
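
To make the file-based loop concrete, here is a minimal sketch of appending one JSONL event per line to a fixture timeline under `/tmp`. It does not describe the interface of `ai/tools/runtime-e2e/timeline.sh`: the function `timeline_append`, the default path, and the `ts`/`type`/`msg` field names are assumptions for illustration only.

```sh
#!/bin/sh
# Hypothetical fixture timeline path; the standard only requires that
# fixture timelines live under /tmp during tests.
TIMELINE_FILE="${TIMELINE_FILE:-/tmp/runtime-e2e-fixture/timeline.jsonl}"

# Append one event. Field names (ts, type, msg) are assumed for this sketch.
# Messages are expected to be single-line text.
timeline_append() {
  type=$1
  msg=$2
  case "$type" in
    test|tool_command|question|approval|daemon_action|supervisor|failure_path|recovery|summary) ;;
    *) echo "unknown event type: $type" >&2; return 1 ;;
  esac
  mkdir -p "$(dirname "$TIMELINE_FILE")"
  # Escape backslashes and double quotes so the message stays valid JSON.
  escaped=$(printf '%s' "$msg" | sed 's/\\/\\\\/g; s/"/\\"/g')
  printf '{"ts":"%s","type":"%s","msg":"%s"}\n' \
    "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$type" "$escaped" >> "$TIMELINE_FILE"
}
```

Rejecting unknown event types at write time is one way to keep the type list stable; a test could call `timeline_append approval 'simulated decision'` and then assert on the resulting line.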

Operator-facing timeline output should use:

```sh
ai/tools/action-timeline/list.sh --human
ai/tools/action-timeline/list.sh --json
ai/tools/action-timeline/list.sh --telegram
```

Default human and Telegram views suppress noisy low-value events. Use `--all`
only for debugging.

Timeline retention is owned by:

```sh
ai/tools/action-timeline/archive.sh --dry-run
ai/tools/action-timeline/archive.sh
```

The active timeline should stay small and fast to review. Rotation archives
older events into date-based files under `.tmp/action-timeline/archive/` without
discarding history. Default views read the active timeline; filtered or `--all`
views may include archive history for debugging.
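
The rotation idea can be illustrated with a small sketch that moves events older than a cutoff date into per-day archive files and keeps the rest in the active file. This is not the implementation of `ai/tools/action-timeline/archive.sh`: the function `rotate_timeline`, its arguments, and the assumption that every event carries a `"ts":"YYYY-MM-DD..."` field are hypothetical.

```sh
#!/bin/sh
# Hypothetical rotation sketch; not the behavior of archive.sh.
# Assumes each JSONL event carries a "ts":"YYYY-MM-DD..." field.
rotate_timeline() {
  active=$1       # active timeline file
  archive_dir=$2  # destination directory for date-based archive files
  cutoff=$3       # events dated before this YYYY-MM-DD are archived
  mkdir -p "$archive_dir"
  keep=$(mktemp)
  awk -v dir="$archive_dir" -v cutoff="$cutoff" -v keep="$keep" '
    {
      day = ""
      if (match($0, /"ts":"[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]/))
        day = substr($0, RSTART + 6, 10)
      if (day != "" && day < cutoff) {
        # Older events are appended to a per-day archive file, never dropped.
        out = dir "/" day ".jsonl"
        print >> out
        close(out)
      } else {
        # Recent or undated events stay in the active timeline.
        print > keep
      }
    }
  ' "$active"
  mv "$keep" "$active"
}
```

Because archived lines are appended rather than rewritten, running the sketch repeatedly keeps history intact while the active file shrinks to recent events.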
