# Runtime Health Performance Profile

Generated: 2026-05-17
Source chunk: `chunk-000208-runtime-health-performance-profile-nplusone-decision`

## Summary

Runtime Action Center load/refresh latency is currently dominated by Runtime
health-snapshot generation and by the backend proxy call into that Runtime
CLI/API path, at roughly six seconds per call. The measured path does not show
a GraphQL database N+1 problem, so DataLoader/request batching is not
appropriate for this chunk.

## Measurements

All measurements were taken from the repository root with the current local
Runtime and backend build artifacts.

| Surface                                                 | Samples |       Min |    Median |       Max | JSON bytes | Gzip bytes |
| ------------------------------------------------------- | ------: | --------: | --------: | --------: | ---------: | ---------: |
| Runtime CLI `runtime health-snapshot --json`            |       5 | 5940.4 ms | 6252.2 ms | 7103.1 ms |      57336 |      12828 |
| Backend `AdminRuntimeHealthService.snapshot()`          |       5 | 5697.0 ms | 5993.7 ms | 6475.0 ms |      31566 |       7927 |
| Backend `AdminRuntimeHealthService.exportBundle()`      |       5 | 6024.9 ms | 6141.2 ms | 7487.6 ms |      50755 |       9167 |
| Backend `AdminRuntimeHealthService.workflowArtifacts()` |       5 | 5637.2 ms | 5853.3 ms | 6744.1 ms |      33645 |       3785 |
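
The timing columns above could be produced by a small harness like the following. This is a minimal sketch, not the script that generated the table: `summarize` and `profile` are hypothetical names, samples are assumed to run sequentially, and the JSON/gzip byte columns are not covered.

```typescript
// Hypothetical timing helper; not the tooling used for the table above.
interface TimingSummary {
  samples: number;
  minMs: number;
  medianMs: number;
  maxMs: number;
}

// Reduce raw durations to the Min / Median / Max columns.
function summarize(durationsMs: number[]): TimingSummary {
  if (durationsMs.length === 0) {
    throw new Error("at least one sample is required");
  }
  const sorted = [...durationsMs].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  // Odd count: middle element. Even count: mean of the two middle elements.
  const medianMs =
    sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
  return {
    samples: sorted.length,
    minMs: sorted[0],
    medianMs,
    maxMs: sorted[sorted.length - 1],
  };
}

// Run the operation `samples` times, one after another, and summarize.
async function profile(
  run: () => Promise<unknown>,
  samples: number = 5,
): Promise<TimingSummary> {
  const durations: number[] = [];
  for (let i = 0; i < samples; i++) {
    const start = Date.now();
    await run(); // sequential on purpose, so samples do not overlap
    durations.push(Date.now() - start);
  }
  return summarize(durations);
}
```

Under these assumptions, `await profile(() => service.snapshot())` would yield one table row, where `service` stands in for an `AdminRuntimeHealthService` instance.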

## Bottleneck Classification

- Runtime snapshot generation / Runtime CLI/API roundtrip: primary bottleneck.
- Backend resolver overhead: mostly mirrors Runtime snapshot cost.
- GraphQL payload size: moderate (about 32 to 51 kB of JSON on the backend
  surfaces), but gzip reduces the transfer by roughly 75% to 89%.
- DB N+1: not observed in this path. The Admin Runtime Health service does not
  issue Prisma/DB queries for these measured surfaces.
- Frontend render: not measured. It is unlikely to be the primary bottleneck
  because the backend-side data fetch already takes roughly six seconds.
- Export/handoff query: same bottleneck as Runtime snapshot plus lightweight
  artifact/report assembly.

## DataLoader Decision

Do not add DataLoader for Runtime Action Center health loading in this package.
The measured bottleneck is not repeated GraphQL resolver database access.

## Recommended Optimization Targets

1. Reduce Runtime health-snapshot generation cost.
2. Avoid running the full Runtime snapshot separately for every related export
   request in one UI refresh cycle. Chunk 000208 added a short backend reuse
   window for this.
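3. Consider a short-lived backend/request-level snapshot reuse window for
   `adminRuntimeHealth`, `adminRuntimeHealthExport`, and workflow artifact
   exports, if freshness constraints allow it. The 1500 ms reuse window
   described under "Implemented Low-Risk Optimization" applies this at the
   backend service level.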
4. Split heavy export/artifact data from the initial health query if frontend
   refresh still feels slow after Runtime snapshot cost is reduced.

## Constraints

- Runtime/GraphQL Health snapshot remains canonical dashboard truth.
- Socket.IO remains invalidation-only.
- No dashboard state should be rendered from socket payloads.
- Caching must not hide stale Runtime/service degradation.
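
The invalidation-only constraint can be illustrated with a small client-side sketch. Everything here is an assumption made for illustration: the `wireHealthInvalidation` helper, the `runtimeHealthInvalidated` event name, and the callback names are hypothetical, and `InvalidationSource` is just the minimal shape of any object with an `on(event, listener)` method.

```typescript
// Hypothetical wiring; event and function names are illustrative only.
type InvalidationListener = (payload: unknown) => void;

// Minimal shape of an event source; anything with a compatible `on` fits.
interface InvalidationSource {
  on(event: string, listener: InvalidationListener): void;
}

function wireHealthInvalidation<T>(
  source: InvalidationSource,
  refetchHealth: () => Promise<T>, // assumed to re-run the GraphQL health query
  render: (snapshot: T) => void,
  onError: (error: unknown) => void,
  eventName: string = "runtimeHealthInvalidated", // assumed event name
): void {
  source.on(eventName, (_payload: unknown) => {
    // The payload is deliberately ignored: the event only means "refetch".
    // Dashboard state always comes from the canonical health snapshot.
    refetchHealth().then(render).catch(onError);
  });
}
```

Because `render` only ever receives the result of `refetchHealth`, nothing carried on the socket event can reach the dashboard state.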

## Implemented Low-Risk Optimization

`AdminRuntimeHealthService` now reuses a Runtime snapshot for up to 1500 ms and
coalesces concurrent in-flight snapshot requests. This is intended to cover one
Action Center refresh cycle where the frontend asks for health, handoff export,
and workflow artifact data at nearly the same time.
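
The reuse-plus-coalescing behavior described above can be sketched as follows. This is not the actual `AdminRuntimeHealthService` code: `SnapshotReuseWindow`, `fetchSnapshot`, and the injectable `now` clock are hypothetical names, and only the 1500 ms default comes from this report.

```typescript
// Hypothetical sketch; not the real AdminRuntimeHealthService implementation.
class SnapshotReuseWindow<T> {
  private readonly fetchSnapshot: () => Promise<T>;
  private readonly ttlMs: number;
  private readonly now: () => number;
  private cached: { value: T; fetchedAt: number } | null = null;
  private inFlight: Promise<T> | null = null;

  constructor(
    fetchSnapshot: () => Promise<T>,
    ttlMs: number = 1500,
    now: () => number = () => Date.now(),
  ) {
    this.fetchSnapshot = fetchSnapshot;
    this.ttlMs = ttlMs;
    this.now = now;
  }

  get(): Promise<T> {
    // 1. Reuse a snapshot that is still inside the window.
    const cached = this.cached;
    if (cached !== null && this.now() - cached.fetchedAt < this.ttlMs) {
      return Promise.resolve(cached.value);
    }
    // 2. Coalesce: join a generation that is already running.
    if (this.inFlight !== null) {
      return this.inFlight;
    }
    // 3. Otherwise start exactly one new generation.
    const pending = this.fetchSnapshot().then(
      (value) => {
        this.cached = { value, fetchedAt: this.now() };
        this.inFlight = null;
        return value;
      },
      (error) => {
        // Failures are never cached, so the next call retries.
        this.inFlight = null;
        throw error;
      },
    );
    this.inFlight = pending;
    return pending;
  }
}
```

With this shape, `Promise.all([w.get(), w.get(), w.get()])` on a fresh instance `w` triggers a single `fetchSnapshot` call. A healthy snapshot can be served for at most `ttlMs` after it was fetched, which bounds how long a degradation could go unseen.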

Post-change smoke test:

| Surface | Elapsed |
| --- | ---: |
| Parallel backend `snapshot()` + `exportBundle()` + `workflowArtifacts()` | 6180.4 ms |

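At 6180.4 ms, the post-change parallel cost is approximately one Runtime
snapshot generation (the standalone backend medians above range from 5853.3 ms
to 6141.2 ms), not three separate sequential Runtime snapshot generations,
whose medians would sum to about 18 s.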
