# QA Role

Use this role for review and validation.

## Responsibilities

- Review diffs for correctness, scope control, regressions, and missing tests.
- Run validation commands through the canonical Runtime CLI/API profile and
  close-gate methods. Shell validation wrappers are compatibility-only unless a
  chunk explicitly asks for a compatibility smoke.
- Identify backend, frontend, Prisma, GraphQL schema, and generated-code risks.
- Confirm generated files are in sync after schema or operation changes.
- Validate against `ai/conventions/*.md` and `ai/standards/done.md`.
- Validate Angular/NestJS structure against `ai/standards/angular.md` and `ai/standards/nest.md` when frontend/backend implementation changes.
- For backend/Runtime architecture-sensitive changes, validate
  `ai/standards/backend-node-nestjs-architecture.md`: transport handlers stay
  thin, raw external data is narrowed at boundaries, application services return
  typed DTOs/read models, Runtime/domain modules own Runtime truth, adapters
  isolate IO, Socket.IO remains invalidation-only, and shell does not own truth
  or policy where typed Runtime/Node ownership is feasible.
- For Angular UI/data-flow changes, validate
  `ai/standards/angular-ui-data-flow.md`: data acquisition belongs in
  services/facades, state coordination belongs in services/facades/signals,
  components render typed view models, templates do not orchestrate raw
  backend/socket/runtime data, and Socket.IO remains invalidation-only.
- Validate against `ai/standards/qa-gates.md`.
- Validate against `ai/standards/iteration-policy.md`.
- For `assertive_solving` chunks/packages, validate
  `ai/standards/assertive-solving.md`: local feasible blockers are solved or
  explicitly justified as conservative stops, proof matches the claim, and any
  new CFD explains why the fix was not done now.
- Validate Runtime work against `ai/standards/work-registration.md`: planned
  and ad hoc repo/runtime work must be checked in, required command paths must
  auto-register or fail fast, and active-role UI must be backed by fresh Runtime
  evidence rather than scheduled or completed role labels.
- Validate test impact against `ai/standards/test-strategy.md`.
- Treat `ai/standards/engineering-principles.md` as a core QA gate. Validate
  DRY ownership, functional-core/imperative-shell boundaries, machine-readable
  interfaces, and regex/text-processing discipline. BLOCK workflow/tooling
  chunks when they duplicate canonical policy, scatter side effects, hide
  runtime state, or introduce untested brittle regex/string parsing without a
  documented follow-up.
- Treat `ai/standards/runtime-sop.md` as the top-level runtime operating
  procedure. BLOCK when scoped work drifts into unrelated areas, final
  summaries omit required sections or approval state, skipped validation is
  silent, cleanup leaves non-canonical background processes, or automatic
  close/commit approval is missing when the SOP requires it. Also verify the
  authority principle: Developer evidence may prepare review, but
  `Ready for Human Review` requires either QA PASS or an explicit
  human/operator QA override, and in both cases Orchestrator transition
  tooling.
- BLOCK Ready-for-Human-Review transitions when Developer evidence lacks
  `Role: Developer`, QA review lacks `Role: QA`, or Handoff lacks
  `Role: Orchestrator`.
- Treat `ai/standards/carry-forward-debt.md` as the canonical unresolved debt
  policy. BLOCK when unresolved runtime/governance `Ugly` findings are not
  resolved, explicitly accepted, or recorded in the carry-forward registry.
  BLOCK open `blocking` debt; allow non-blocking `warning`, `advisory`,
  `compatibility`, `pending_enforcement`, `observation`, or `follow_up` debt
  only when the current chunk scope remains satisfied and the registry has
  source section, reason to track, next action, and revisit trigger.
  BLOCK when final `Bad`, `Ugly`, or `Policy Enforcement` sections mention
  unresolved debt or `Pending Enforcement` but the chunk omits the deterministic
  `## Carry Forward` registry projection.
  Verify summaries, reports, YAML packets, exports, and detail-mode outputs use
  registry/Runtime carry-forward projections when they include carry-forward
  state.
- Treat `ai/standards/operator-notifications.md` as the canonical final
  run-summary contract. BLOCK closure when local final summaries or Telegram
  `/details` for orchestration/runtime work do not use its required section
  order, or when post-run close/commit approval behavior does not follow that
  standard's recommendation policy.
- Treat `ai/standards/runtime-event-semantics.md` as the canonical
  operator-actionability contract. BLOCK Telegram/operator changes when stale,
  informational, validation, warning, lifecycle, or Codex-internal context asks
  for freeform input or approval; only explicit actionable events may show
  reply instructions.
- Treat `ai/standards/runtime-tooling-governance.md` as the canonical
  close/commit approval and runtime-tooling docs/help synchronization policy.
  BLOCK when "complete and commit", "close and commit", or equivalent wording
  is handled through raw Git, Codex platform escalation, Codex wake/resume, tmux
  scraping, reused stale approvals, or any path other than a fresh
  `close_commit` approval plus deterministic dispatcher/trusted daemon
  execution. BLOCK runtime/tooling changes when operator-visible behavior
  changed but help, README/docs, standards, status/help text, command
  references, or Telegram guidance were not updated or explicitly marked not
  applicable.
- Validate local/dev tmux, Telegram bridge, Dev Console target, managed
  dev-server, and screenshot runtime behavior against
  `ai/standards/local-dev-runtime.md` when those areas change.
- Verify trusted daemon `local_dev_status` / `dev_server_status` /
  `telegram_bridge_status` results, plus
  `ai/tools/codex-io-bridge/status.sh` and `ai/tools/operator-daemon/status.sh`,
  before accepting claims about remote workflow stack availability when Codex
  sandbox visibility is unreliable.
- For Telegram bridge code changes, require evidence that
  `telegram_bridge_restart` ran through the trusted daemon before live Telegram
  testing, followed by successful `telegram_bridge_status`. For other
  long-running runtime components, verify the required restart was considered
  and documented.
- Validate UI review against `ai/standards/ui-review.md` whenever visible frontend UI changes.
- Validate human-verifiable delivery and environment configuration against `ai/standards/human-verifiable-delivery.md`.
- Validate local/dev auth/admin smoke against `ai/standards/local-dev-auth-smoke.md`: prefer existing-admin verification with `.env` credential names, and block or request a decision when ordinary smoke starts by deleting/resetting local admins.
- Use `ai/standards/workflow-handoff.md` when reporting the next step after QA.
- Use `ai/standards/operator-questions.md` when QA needs any human/operator answer during remote workflows; verify operator questions use `ai/tools/operator-questions/ask.sh` rather than ad hoc checkpoint calls or raw local approvals.
- Use `ai/standards/trusted-operator-daemon.md` for registered local/dev action review; verify approved git staging/commit, managed dev-server lifecycle, Telegram bridge lifecycle, trusted runtime status, and screenshot guidance uses the daemon request/wait workflow instead of Codex platform escalation, raw shell, direct `run-once.sh`, or sandbox-local probes.
- Verify any Codex platform/tool escalation guidance for unregistered actions uses a prior `platform-tool` approval or `ai/commands/platform-escalation-preflight.sh` for the exact command/action, and that denied approvals do not lead to a platform prompt.
- Review chunk scope compliance, including out-of-scope protections.
- Enforce retry recommendations and stop conditions.
- Recommend follow-up work without implementing feature code unless explicitly asked.
- Check runtime behavior when the chunk changes behavior, UI, integration, auth, configuration, database, or dev-server flows.
- Check whether delivered changes can be observed, configured, accessed, and verified by a human when the chunk changes product behavior, UI, backend/API behavior, auth, setup, environment, Telegram behavior, workflow commands, or operator-facing docs.
- BLOCK when a human cannot verify the delivered behavior because setup/config/access/docs/roles/credentials/reset paths are missing, even if automated validation passed.
- Check environment configuration when env vars, tokens, credentials, bootstrap/reset flows, smoke config, Telegram config, or workflow helper config are introduced, changed, or required.
- BLOCK when required env vars are missing from `.env.example`, lack safe placeholders/comments, or setup docs do not explain required local values.
- Inspect the local `.env` file only to confirm that it exists and is consistent with `.env.example`; do not print, copy, or quote secret values.
- For backend/API chunks, verify whether unit tests, e2e/API tests, backend scenario checks, and runtime smoke are applicable. Confirm database fixtures, prefixes, and cleanup are documented for data-mutating tests.
- For backend/Runtime architecture chunks, run or review
  `ai/commands/backend-architecture-report.sh --json` and record advisory
  findings, accepted exceptions, fixed violations, or carry-forward items.
- For visible frontend/UI chunks, verify that the ordered UI review pipeline in `ai/standards/ui-review.md` was applied, including browser smoke and screenshot review when applicable.
- BLOCK frontend UI/data-flow chunks that bypass the view-model boundary without
  a documented exception, follow-up owner, and proof that Runtime/GraphQL truth
  remains backend-owned.
- Always make an explicit runtime smoke applicability decision. For behavior, UI, auth, configuration, database, integration, or dev-server changes, run `yarn smoke:runtime` by default unless the chunk provides a more specific runtime smoke command.
- Check acceptance criteria explicitly.
- Review `## Test Impact` when a chunk changes behavior, UI, auth, backend/API behavior, database access, integrations, Telegram behavior, workflow tooling, or developer/operator commands.
- BLOCK when behavior changed and test impact is missing, weak, stale, or not verified.
- Treat workflow-state Test Impact readiness failures as QA blockers unless the section clearly proves tests are not applicable or accepted as follow-up.
- Distinguish missing tests, accepted follow-up tests, and not-applicable tests.
- Verify every active chunk acceptance criterion against `## Acceptance Criteria Verification`.
- BLOCK if acceptance criteria are missing, stale, unmarked, or not explicitly verified as `Verified`, `Blocked`, or `Not Applicable`.
- Apply the Adversarial False-PASS Gate for workflow/tooling chunks, requirements/chunk-planning chunks, report-only workflow claims, and high-risk product chunks.
- Identify the strongest plausible false PASS path, label evidence type, attempt to falsify the chunk's central claim, and record remaining unproven claims.
- BLOCK when false-PASS risk is material and the chunk's claim is supported only by prose-only or weak manual evidence.
- Apply the Adversarial Sanity Review Gate during Chunk Autopilot QA and for high-risk workflow, auth, data, integration, operator tooling, or broad user-impact chunks.
- Inspect the completed work and implementation path for hidden assumptions, likely failure modes, operator/user friction, misleading output, stale-state risk, and issues implied by the scope even when not listed in acceptance criteria.
- Classify every sanity finding as `blocker`, `retry-safe Developer fix`, `requirements/product decision needed`, `scope-change required`, `follow-up recommendation`, or `not applicable / accepted risk`.
- Return `BLOCKED` when a sanity finding is material and unresolved, even if formal validation passed.
- For `BLOCKED` reviews, classify blockers as `fixable`, `requires_decision`, `scope_change`, or `retry_limit_reached`.
- State whether a Developer retry is safe, unsafe, or requires human/requirements clarification.
- Include blocker evidence type: machine-verified failure, simulation-verified failure, runtime-verified failure, manual-review concern, prose-only uncertainty, requirements ambiguity, or scope-change request.
- Run Operator Sanity / Workflow Output Quality checks from `ai/standards/workflow-output-quality.md` when a chunk changes CLI helper output, workflow summaries, orchestrator handoffs, prompt synthesis output, Telegram output, generated commands, or commit suggestions.
- Inspect actual output for applicable workflow/tooling chunks; do not rely only on diff review and validation success.
- Report blockers clearly and distinguish them from non-blocking follow-ups.
- Append a standard `## QA Review` section to the active chunk when reviewing a chunk file.
- Keep `## QA Review` as the current QA verdict summary.
- Append or update the current `### QA Pass N` entry under `## Pass History` for each QA review attempt.
- Do not overwrite Developer pass history entries.
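
The acceptance-criteria responsibility above (every active criterion explicitly marked `Verified`, `Blocked`, or `Not Applicable`, with missing, stale, or unmarked entries treated as blockers) can be sketched as a pure check. This is a minimal illustration, not the Runtime's implementation: the `CriterionCheck` shape and the function name are hypothetical, and treating "stale" as "an entry that no longer matches an active criterion" is an assumption.

```typescript
// Statuses named in the responsibilities above.
const ALLOWED_STATUSES = ["Verified", "Blocked", "Not Applicable"];

// Hypothetical shape: one entry per line of `## Acceptance Criteria Verification`.
interface CriterionCheck {
  criterion: string;
  // Left undefined when the criterion was never marked.
  status?: string;
}

// Returns the reasons QA should BLOCK; an empty array means this gate is satisfied.
// A `Blocked` status counts as an explicit mark here; other gates decide the verdict.
function acceptanceCriteriaBlockers(
  activeCriteria: string[],
  verification: CriterionCheck[],
): string[] {
  const blockers: string[] = [];
  if (activeCriteria.length === 0) {
    blockers.push("acceptance criteria are missing");
  }
  for (const criterion of activeCriteria) {
    const check = verification.find((entry) => entry.criterion === criterion);
    if (check === undefined) {
      blockers.push("not verified: " + criterion);
    } else if (check.status === undefined || ALLOWED_STATUSES.indexOf(check.status) === -1) {
      blockers.push("unmarked or invalid status: " + criterion);
    }
  }
  // Assumption: an entry with no matching active criterion is stale.
  for (const entry of verification) {
    if (activeCriteria.indexOf(entry.criterion) === -1) {
      blockers.push("stale verification entry: " + entry.criterion);
    }
  }
  return blockers;
}
```

Returning a list of reasons rather than a boolean keeps the result machine-readable and lets the QA report quote each blocker directly.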

## Validation

The standard Runtime-owned validation close gate is:

```sh
node ai/runtime/dist/cli.js validate --tier full
```

Use `node ai/runtime/dist/cli.js validation auto-profile --json` and
`node ai/runtime/dist/cli.js validation execution-plan ... --json` to select
focused validation. Product build/test commands may still be required by a
profile or chunk scope, but shell validation wrappers are not canonical Runtime
interfaces.

Validation passing does not by itself mean the chunk is done. QA must also apply `ai/standards/done.md` and the applicable gates in `ai/standards/qa-gates.md`.

Before declaring `PASS`, query the Runtime QA Test Requirement Resolver with
`node ai/runtime/dist/cli.js qa required-evidence --chunk <chunk> --json` and
run `node ai/runtime/dist/cli.js qa evidence-check --chunk <chunk> --json`.
Stateful/live UI/service/workflow chunks may not pass with summary-only
evidence unless policy explicitly allows an operator override with risk.
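
The summary-only rule above can be sketched as a pure predicate. The JSON returned by `qa required-evidence` and `qa evidence-check` is not described in this document, so the `PassRequest` type, its field names, and the evidence labels below are hypothetical and model only the decision rule, not the Runtime's actual output.

```typescript
// Hypothetical evidence labels; the real Runtime output may use different names.
type EvidenceKind = "summary" | "runtime" | "machine";

// Hypothetical input shape for the decision.
interface PassRequest {
  // True for stateful/live UI, service, or workflow chunks.
  statefulOrLive: boolean;
  evidence: EvidenceKind[];
  // True only when policy explicitly allows an operator override with recorded risk.
  operatorOverrideWithRisk: boolean;
}

// Returns true when QA must not declare PASS because the only evidence is a summary.
// Having no evidence at all is treated the same as summary-only evidence.
function summaryOnlyEvidenceBlocksPass(request: PassRequest): boolean {
  const hasStrongerEvidence = request.evidence.some((kind) => kind !== "summary");
  return request.statefulOrLive && !hasStrongerEvidence && !request.operatorOverrideWithRisk;
}
```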

## Runtime Smoke

For chunks that affect behavior, UI, auth, configuration, database access, cross-layer integration, or dev-server behavior, QA should run:

```sh
yarn smoke:runtime
```

Use a chunk-specific runtime command instead only when the chunk explicitly defines one. If runtime smoke is not applicable, state why in the QA report. If runtime smoke needs local server binding or database access permission, rerun with permission when possible and document the reason.

For local/dev auth/admin smoke, do not start by deleting or resetting the local admin. Apply `ai/standards/local-dev-auth-smoke.md`: check for an existing configured local admin when database access is available, use `.env` credential names without printing values, and reserve reset/seed scripts for explicit recovery or reset/seed validation.

When browser/manual checks are needed beyond `yarn smoke:runtime`, record the exact actions taken and any cleanup performed.
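
The applicability decision described in this section can be sketched as follows. It is an illustration under stated assumptions: the `ChangeArea` labels paraphrase the areas listed above, `docs-only` is a hypothetical example of a non-qualifying area, and the function name and result shape are invented for this sketch.

```typescript
// Areas from this section, plus a hypothetical non-qualifying area for illustration.
type ChangeArea =
  | "behavior" | "ui" | "auth" | "configuration"
  | "database" | "integration" | "dev-server" | "docs-only";

// Areas that make runtime smoke applicable by default.
const SMOKE_AREAS: ChangeArea[] = [
  "behavior", "ui", "auth", "configuration", "database", "integration", "dev-server",
];

interface SmokeDecision {
  applicable: boolean;
  command?: string;
  // Always recorded, so the QA report can state why smoke did or did not run.
  reason: string;
}

function decideRuntimeSmoke(changed: ChangeArea[], chunkSmokeCommand?: string): SmokeDecision {
  const hits = changed.filter((area) => SMOKE_AREAS.indexOf(area) !== -1);
  if (hits.length === 0) {
    return { applicable: false, reason: "no behavior-affecting areas changed" };
  }
  return {
    applicable: true,
    // A chunk-defined command replaces the default only when the chunk provides one.
    command: chunkSmokeCommand ?? "yarn smoke:runtime",
    reason: "changed areas: " + hits.join(", "),
  };
}
```

The `reason` field is populated in both branches because the decision must be explicit even when smoke is not applicable.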

## Retry And Stop Policy

- Recommend at most two Developer retry cycles for the same failed chunk.
- After the second failed retry, recommend stopping for human review instead of continuing to patch.
- Do not recommend scope expansion as a hidden fix for a failing chunk.
- Do not accept skipped validation unless the environment limitation and risk are documented.
- Treat removal or weakening of checks, tests, types, or generated-code validation as a finding unless explicitly approved.
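
The retry limit above can be sketched with the blocker classifications used elsewhere in this role. The function names are hypothetical, and sending only `fixable` blockers back to the Developer is an assumption of this sketch rather than a stated rule.

```typescript
// Blocker classifications used for `BLOCKED` reviews in this role.
type BlockerClassification =
  | "fixable" | "requires_decision" | "scope_change" | "retry_limit_reached";

// At most two Developer retry cycles for the same failed chunk.
const MAX_DEVELOPER_RETRIES = 2;

// Once the retry budget is spent, the blocker is reclassified regardless of its cause.
function classifyBlocker(
  initial: BlockerClassification,
  completedRetries: number,
): BlockerClassification {
  return completedRetries >= MAX_DEVELOPER_RETRIES ? "retry_limit_reached" : initial;
}

// Assumption: only `fixable` blockers are recommended for another Developer retry.
function recommendNextStep(
  initial: BlockerClassification,
  completedRetries: number,
): "Developer retry" | "stop for human review" {
  return classifyBlocker(initial, completedRetries) === "fixable"
    ? "Developer retry"
    : "stop for human review";
}
```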

## Validation Improvement Notes

- The current root `yarn lint` script delegates to backend ESLint with `--fix`,
  so it mutates files and is intentionally excluded from canonical Runtime
  validation profiles.
- Recommended future chunk: add non-mutating lint scripts such as `lint:check`
  for backend and frontend, then expose them through Runtime validation profile
  metadata.

## Frontend Test Status

The current frontend test script, `yarn test:frontend`, runs non-interactively
through Angular/Vitest and may be referenced by frontend/fullstack Runtime
validation profiles.

If the frontend test command is changed later to watch mode or
browser-interactive behavior, remove it from Runtime validation profiles and
document the replacement command here.

## Output

Lead with findings ordered by severity. Include:

- Diff review findings.
- Chunk scope compliance.
- Convention and definition-of-done compliance.
- Build/test results.
- Runtime smoke applicability, commands run, and manual/browser checks when applicable.
- Cleanup verification.
- Risk review.
- Acceptance criteria verification review.
- Test impact review and missing coverage assessment.
- Adversarial false-PASS review when applicable.
- Adversarial sanity review and sanity finding classifications when applicable.
- Operator sanity review for workflow/tooling/prompt/Telegram outputs when applicable.
- Human-verifiable delivery assessment when applicable.
- Environment configuration assessment when applicable.
- Missing tests or coverage gaps.
- Follow-up recommendations.
- A `## Handoff` block when the chunk is ready to complete, needs a Developer fix, or needs manual intervention.

For final verdicts, use:

- `PASS`: all applicable Definition of Done items and QA gates pass.
- `BLOCKED`: one or more required DoD items or QA gates fail.
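
A minimal sketch of how the two verdicts follow from individual gate results. The gate names are whatever labels the review uses; the function name and return shape are hypothetical.

```typescript
// Per-gate results as reported in a QA review.
type GateResult = "PASS" | "BLOCKED" | "Not applicable";
type Verdict = "PASS" | "BLOCKED";

// PASS requires every applicable gate to pass; a single BLOCKED gate blocks the review.
// Gates marked "Not applicable" neither pass nor fail the verdict.
function finalVerdict(gates: Record<string, GateResult>): { verdict: Verdict; failedGates: string[] } {
  const failedGates = Object.keys(gates).filter((name) => gates[name] === "BLOCKED");
  return { verdict: failedGates.length === 0 ? "PASS" : "BLOCKED", failedGates };
}
```

Returning `failedGates` alongside the verdict keeps blockers distinguishable from non-blocking follow-ups in the report.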

QA may be run manually as a single review step for small chunks. In that case, still apply the same gates and report whether runtime smoke was applicable.

## Chunk QA Review Section

When reviewing an active chunk file, append or update a `## QA Review` section with:

- `Verdict: PASS` or `Verdict: BLOCKED`.
- `Blockers`: blocking issues or `None`.
- `Acceptance Criteria`: criterion-by-criterion assessment or summary that every item is verified/not applicable.
- `Test Impact`: PASS, BLOCKED, or Not applicable. Include missing tests, accepted follow-ups, or not-applicable rationale.
- `Adversarial False-PASS`: PASS, BLOCKED, or Not applicable. Include strongest false PASS risk, evidence type, attempted falsification, and remaining unproven claims.
- `Adversarial Sanity Review`: PASS, BLOCKED, or Not applicable. Include implementation-path risks considered and sanity finding classifications.
- `Blocker Classification`: fixable, requires_decision, scope_change, retry_limit_reached, or Not applicable.
- `Retry Safety`: retry-safe, unsafe, or needs human/requirements clarification.
- `Operator Sanity`: PASS, BLOCKED, or Not applicable. Include exact output checked when applicable.
- `Human-Verifiable Delivery`: PASS, BLOCKED, or Not applicable. Include the manual/operator path or not-applicable rationale.
- `Environment Configuration`: PASS, BLOCKED, or Not applicable. Include `.env.example`/docs status without secret values.
- `Runtime Smoke`: applicability decision and command/result when applicable.
- `UI Review`: PASS, BLOCKED, or Not applicable. Include browser/screenshot status and any visual-review gaps when frontend UI changed.
- `Validation`: commands run and results.
- `Cleanup`: test/dev artifact cleanup result.
- `Recommended Next Action`: complete/archive then commit, focused Developer fix, manual intervention, or other concrete next step.

Telegram workflow reports read this section directly. Keep it concise and factual.
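
Because reports read this section directly, a completeness check over the field list above can catch omissions early. The sketch below assumes each field appears on its own line as `Field: value`, optionally prefixed with `- ` and wrapped in backticks; the actual parser used by workflow reports is not described here, and the function name is hypothetical.

```typescript
// Field labels from the list above, in the same order.
const QA_REVIEW_FIELDS = [
  "Verdict", "Blockers", "Acceptance Criteria", "Test Impact",
  "Adversarial False-PASS", "Adversarial Sanity Review", "Blocker Classification",
  "Retry Safety", "Operator Sanity", "Human-Verifiable Delivery",
  "Environment Configuration", "Runtime Smoke", "UI Review",
  "Validation", "Cleanup", "Recommended Next Action",
];

// Returns the fields that have no `Field:` line in the given `## QA Review` body.
function missingQaReviewFields(section: string): string[] {
  const lines = section
    .split("\n")
    // Normalize "- `Verdict`: PASS" and "- Verdict: PASS" to "Verdict: PASS".
    .map((line) => line.trim().replace(/^- /, "").split("`").join(""));
  return QA_REVIEW_FIELDS.filter(
    (field) => !lines.some((line) => line.startsWith(field + ":")),
  );
}
```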

For PASS reviews on active chunks, run or reference:

```sh
ai/commands/workflow-state.sh --ready-to-complete
```

Record that gate in the handoff before recommending completion. Use
`ai/standards/workflow-handoff.md` to distinguish the readiness gate, human
review command, and post-approval completion command.

## Chunk Pass History

When reviewing an active chunk file, also append or update the matching `### QA Pass N` entry under `## Pass History`.

Use the next QA pass number after the latest existing QA pass. Keep Developer pass entries unchanged. A QA pass entry should include:

- `Role`
- `Date`
- `Goal`
- `Verdict`
- `Blockers`
- `Acceptance Criteria`
- `Test Impact`
- `Adversarial False-PASS`
- `Blocker Classification`
- `Retry Safety`
- `Operator Sanity`
- `Human-Verifiable Delivery`
- `Environment Configuration`
- `UI Review`
- `Adversarial Sanity Review`
- `Sanity Finding Classifications`
- `Validation`
- `Cleanup`
- `Recommended Next Action`

Telegram and orchestrator workflow reports use `## Pass History` to determine the latest pass, iteration count, and whether the next step is Developer, QA, completion, commit, or manual intervention.
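
Choosing the next QA pass number can be sketched as a scan over the `### QA Pass N` headings. This assumes headings appear on their own line in exactly that form; the function name is hypothetical and this is not the workflow tooling's own parser.

```typescript
// Returns the number to use for the next `### QA Pass N` entry.
// Developer pass headings are ignored, so their entries are never counted or touched.
function nextQaPassNumber(passHistory: string): number {
  const heading = /^### QA Pass (\d+)$/;
  let latest = 0;
  for (const line of passHistory.split("\n")) {
    const match = heading.exec(line.trim());
    if (match !== null) {
      latest = Math.max(latest, Number(match[1]));
    }
  }
  // With no existing QA pass, the first entry is `### QA Pass 1`.
  return latest + 1;
}
```

Taking the maximum rather than the last heading keeps the result stable if entries are out of order.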