v1.7.0 inbound load test
This is the operator runbook for tools/loadtest/inbound_v1_7.js,
the k6 script that exercises the five Pipe B inbound endpoints
(sales_orders, items, customers, vendors, purchase_orders)
with realistic source payloads under concurrent load.
The script is operator-run, not part of CI. Run it from a workstation against a staging stack pre-merge to confirm pre-merge gate 25 ("v1.7 inbound endpoints sustain a realistic burst without 5xx leakage or chain-fork regressions") still holds. The chain serialization fix (#271) and the body-cap boot guard (#273) cover correctness; this runbook covers throughput and tail latency.
Prerequisites
- k6 installed on the runner machine. The Linux / macOS install:
# macOS
brew install k6
# Debian / Ubuntu
sudo gpg -k && \
sudo gpg --no-default-keyring \
--keyring /usr/share/keyrings/k6-archive-keyring.gpg \
--keyserver hkp://keyserver.ubuntu.com:80 \
--recv-keys C5AD17C747E3415A3642D57D77C6C491D6AC1D69 && \
echo "deb [signed-by=/usr/share/keyrings/k6-archive-keyring.gpg] \
https://dl.k6.io/deb stable main" | sudo tee /etc/apt/sources.list.d/k6.list && \
sudo apt-get update && sudo apt-get install k6
-
Sentry stack reachable at
SENTRY_BASE_URL(defaulthttp://localhost:5000). The stack must have the v1.7 schema loaded (mig 047 + mig 048) and a valid mapping doc on disk forSENTRY_SOURCE_SYSTEM. Boot fails loudly otherwise. -
A WMS token issued with:
inbound_resourcescovering the five resources (or a subset matching what the script targets; the default script targets all five).source_systemmatchingSENTRY_SOURCE_SYSTEM.mapping_overrideleft off (v1.7.0 rejects requests withmapping_overridesregardless; #269).
Issue via POST /api/admin/tokens from the admin panel; copy the
plaintext returned in the response into SENTRY_WMS_TOKEN.
- Allowlist row for
SENTRY_SOURCE_SYSTEMininbound_source_systems_allowlist(the boot guard fails loud otherwise; see #267).
Running
Default profile (5 VUs warmup, 20 VUs realistic peak, 30s cooldown):
SENTRY_BASE_URL=http://localhost:5000 \
SENTRY_WMS_TOKEN=<plaintext> \
SENTRY_SOURCE_SYSTEM=acme-erp \
k6 run tools/loadtest/inbound_v1_7.js
Stress profile (10 VUs warmup, 50 VUs peak, 30s cooldown) for regression hunting:
LOADTEST_PROFILE=stress \
SENTRY_BASE_URL=http://localhost:5000 \
SENTRY_WMS_TOKEN=<plaintext> \
SENTRY_SOURCE_SYSTEM=acme-erp \
k6 run tools/loadtest/inbound_v1_7.js
The script writes loadtest-summary.json alongside the standard
stdout summary so trend tracking across runs is straightforward.
Thresholds
Default profile:
| Metric | Threshold |
|---|---|
http_req_failed{kind:5xx} rate |
< 0.1 % |
http_req_duration{endpoint:*} p95 |
< 300 ms |
http_req_duration aggregate p99 |
< 1000 ms |
Stress profile relaxes to p95 < 800 ms and p99 < 2000 ms because the 50-VU shape contends for the gunicorn worker pool.
Investigating threshold trips
A failed run does not by itself indicate a regression -- a slow laptop, network jitter, or a stack mid-deploy can all push p95. Triage in this order:
- 5xx rate > 0. Check the stack's audit_log + application log
for the request that failed. A 500 inside the chain trigger is
the highest-priority signal -- mig 047 should serialize the
inserts; a single 500 here suggests a fork escape and warrants
a
verify_audit_log_chain()run. - p95 trip on a single endpoint. Likely a mapping-doc shape issue (a derived expression scanning a large array, a cross-system lookup hitting a cold index). Profile the handler against the same payload outside k6.
- p99 trip aggregate, p95 fine. Tail latency in the connection
pool or the audit_log lock. Confirm with
pg_stat_activityduring a re-run and look for trigger waits onaudit_log_chain_head.
Expected baselines (apartment-lab fixtures, default profile)
These are reference numbers from the v1.7.0 pre-merge gate; treat them as "if this run matches, gate 25 holds" rather than as hard contracts.
| Metric | Apartment-lab baseline |
|---|---|
| Total iterations across the 120s run | 6,000-8,000 |
http_req_duration p95 aggregate |
80-150 ms |
http_req_duration p99 aggregate |
200-400 ms |
http_req_failed{kind:5xx} rate |
0 % |
Stress profile typically lands p95 around 250-400 ms with no 5xx on a 4-worker gunicorn stack. A 5xx rate above zero in the stress profile is also a regression worth investigating.
When to run
- Before every v1.7.x merge to
main-- gate 25 is mandatory. - After any change to:
services.inbound_service(handler hot path)services.mapping_loader(apply()is on the hot path)db/migrations/047_audit_log_chain_serialization.sql(or any audit_log trigger change)services.token_cache(auth hot path)
- After any Postgres upgrade or change to gunicorn worker count / connection pool size.
Why this is operator-run, not in CI
CI runs in shared GitHub Actions infrastructure; load test results from a shared runner with unpredictable neighbor noise are unstable enough to be net-misleading. A dedicated workstation against a staging stack with consistent neighbor profile produces a number that's actually comparable run-over-run.