Sandboxes and test environments
TL;DR
A robust sandbox means complete isolation, synthetic or anonymized data, realistic simulators of external systems, predictable seeds and time travel, built-in idempotency and webhooks, and transparent limits and metrics. Prod is out of reach, keys are kept separate, and promotion happens only via checklists.
1) Map of environments and their roles
Rule: sandbox ≠ prod. Any connection goes through one-way simulators with no access to real funds, games, or personal data.
2) Data: synthetics, anonymization, seeding
Synthetic data by default. Generators for passport/card data, valid but non-financial PANs (test BINs), realistic patterns of bets and balances.
Anonymization for stage: tokenization of identifiers, differential privacy for aggregates, removal of rare combinations.
Seeds and determinism: one command, one state.
bash
make db-reset && make db-seed ENV=sandbox SEED=2025_11_03
Time travel: a global environment "clock" for deadline/expiration tests.
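The seeding and time-travel ideas above can be sketched in a few lines of Python. This is a minimal illustration, not the project's actual tooling: the function names, the `SandboxClock` class, and the `400000` prefix are hypothetical placeholders (use whatever test BINs your PSP documents). The same seed always yields the same Luhn-valid PANs, and the clock is moved explicitly instead of reading wall time.

```python
import random
from datetime import datetime, timedelta, timezone


def luhn_check_digit(partial: str) -> str:
    """Return the Luhn check digit for a digit string that lacks its last digit."""
    total = 0
    # Walk right to left; the digit adjacent to the future check digit is doubled.
    for i, ch in enumerate(reversed(partial)):
        d = int(ch)
        if i % 2 == 0:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return str((10 - total % 10) % 10)


def luhn_valid(pan: str) -> bool:
    """True if the last digit of the PAN matches its Luhn check digit."""
    return luhn_check_digit(pan[:-1]) == pan[-1]


def synthetic_pans(seed: str, count: int, test_bin: str = "400000") -> list[str]:
    """Generate 16-digit, Luhn-valid PANs; test_bin is a placeholder prefix."""
    rng = random.Random(seed)  # same seed -> same sequence
    pans = []
    for _ in range(count):
        body = test_bin + "".join(str(rng.randrange(10)) for _ in range(9))
        pans.append(body + luhn_check_digit(body))
    return pans


class SandboxClock:
    """Environment-wide clock that tests advance explicitly (time travel)."""

    def __init__(self, start: datetime):
        self._now = start

    def now(self) -> datetime:
        return self._now

    def advance(self, **delta) -> None:
        # Accepts timedelta keyword arguments, e.g. advance(hours=2).
        self._now += timedelta(**delta)
```

A test for token expiration would create the token at `clock.now()`, call `clock.advance(hours=2)`, and then assert that the token is rejected, with no real waiting.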
3) Simulators and stubs
Payments/Banks/PSP
Auth/Capture/Refund/Payout with scenarios: `approved`, `declined_insufficient`, `3ds_required`, `timeout`, `duplicate`.
PSP webhooks: HMAC-signed, with retries, delays, and a "dirty internet" (unreliable network conditions).
KYC/AML/Sanctions
Responses: `clear`, `pep_match`, `sanction_hit`, `doc_mismatch`, `manual_review`.
Support idempotency and rate limits as in prod.
Game Providers/Catalog
Lobby, features, RTP/rounds: pseudo-random generation, with controlled "payouts/failures" for UX cases.
Option: a simulator "severity" switch (happy path vs. chaos).
4) Webhooks in the sandbox
HMAC signatures (v1), headers `X-Event-Id` and `X-Timestamp`, timestamp window ≤ 5 minutes.
Retries with exponential backoff, a DLQ, and replay.
A "resend" console and logs of delivery attempts.
pseudo
POST /psp/webhooks
Headers: X-Signature, X-Timestamp, X-Event-Id
Body: { event_id, type, data, attempt }
5) Idempotency and determinism
All mutations accept `Idempotency-Key`.
Simulators store the result by key (TTL 24-72 h).
"Seed determinism": the same input always produces the same outcome (for repeatable tests).
6) Security and access
Network isolation/VPC, separate secrets and domains (`sandbox.example.com`).
RBAC/ABAC: roles "partner," "qa," "dev"; token scopes are kept minimal.
Rate limits and quotas: fair share per tenant/key, with clear `429`/`Retry-After` responses.
Secrets only in KMS/Vault; regular rotation.
Prohibition of real payments at the code/config level (feature-flag hard block).
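To make the `429`/`Retry-After` behavior concrete, here is a minimal per-key fixed-window limiter. It is a sketch under stated assumptions: the class name and the `(status, headers)` return shape are hypothetical, and a production gateway would more likely use a shared store and a sliding window or token bucket.

```python
import math


class FixedWindowLimiter:
    """Per-key request counter that resets at fixed window boundaries."""

    def __init__(self, limit: int, window_seconds: int):
        self.limit = limit
        self.window = window_seconds
        self._counters = {}  # api_key -> (window_start, count)

    def check(self, api_key: str, now: float) -> tuple[int, dict]:
        """Return (status, headers): 200 if allowed, 429 with Retry-After if not."""
        window_start = now - (now % self.window)
        start, count = self._counters.get(api_key, (window_start, 0))
        if start != window_start:
            # A new window has begun: reset the counter for this key.
            start, count = window_start, 0
        if count >= self.limit:
            # Tell the client how long until the current window ends.
            retry_after = max(1, math.ceil(start + self.window - now))
            return 429, {"Retry-After": str(retry_after)}
        self._counters[api_key] = (start, count + 1)
        return 200, {}
```

Counting per key rather than globally is what gives each tenant its fair share: one noisy partner exhausts only its own quota.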
7) API Gateway and observability in sandbox
The same policies: OAuth2/OIDC/JWT, CORS, WAF, DDoS profile.
Metrics: p50/p95/p99 latency, 4xx/5xx rates, rate-limit hits, webhook latency, idempotency hits.
Logs/traces: no PII; correlation by `trace_id`.
Dashboard "Sandbox Health": uptime, webhook queues, simulator errors.
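For reference, the p50/p95/p99 figures above can be computed from raw latency samples as in this sketch. It uses the nearest-rank definition of a percentile, which is one of several common definitions; metrics backends often interpolate or use histograms instead, so dashboard values may differ slightly. The function names are illustrative.

```python
def percentile(samples: list[float], p: int) -> float:
    """Nearest-rank percentile: smallest sample with at least p% of values <= it."""
    if not samples:
        raise ValueError("no samples")
    if not 0 < p <= 100:
        raise ValueError("p must be in 1..100")
    ordered = sorted(samples)
    # Integer ceiling of p * n / 100, avoiding float rounding surprises.
    rank = (p * len(ordered) + 99) // 100
    return ordered[rank - 1]


def latency_summary(samples_ms: list[float]) -> dict:
    """Summarize latencies the way the dashboard metrics are named."""
    return {f"p{p}": percentile(samples_ms, p) for p in (50, 95, 99)}
```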
8) Feature flags, versions and compatibility
Features are enabled in order: sandbox → stage → prod.
SemVer for the API; Deprecation/Sunset banners in the sandbox Swagger/Redoc.
Persisted queries for GraphQL storefronts (if any).
9) CI/CD and promotion
1. Build/Unit →
2. Contract/Mock tests (OpenAPI/Protobuf/GraphQL SDL) →
3. Integration vs. simulators →
4. Stage regression (anonymized snapshots) →
5. Canary in prod.
Promotion gate checklist: see § 12.
10) UAT scenarios for partners (in the sandbox)
Payments: auth/capture/refund/payout with webhooks and PSP errors.
KYC/AML: all statuses + manual escalation.
Idempotency: a repeated `Idempotency-Key` → the same result.
Rate limit: correct handling of `429`.
Time windows: expiration of tokens, 'Retry-After', time-travel cases.
Webhooks: signatures/retries/DLQ, manual replay and dedup.
11) Data policy and privacy
Never store real PANs or KYC documents in sandbox/stage.
Anonymization: masking, removal of direct identifiers, synthetic correlation.
Retention TTL for logs and webhook bodies ≤ the limit set by policy.
12) Checklists
12.1 Launching a new sandbox
- Isolated network/database/cache/object storage
- Secrets created in KMS/Vault, access by role
- PSP/KYC/game simulators are configured and versioned
- Swagger/Redoc + Postman collection (sandbox endpoints)
- Webhooks: HMAC, retry, DLQ, replay console
- Rate/Quota profiles, Deprecation/Sunset banners (if any)
- Dashboards and alerts (latency, 5xx, 429, DLQ)
12.2 Promoting a release (stage → prod)
- Contract diff checks (no breaking changes)
- p95/p99 under load are within norms on stage
- Webhooks passed UAT; idempotency OK
- Feature flags are prepared; there is a rollback plan
- Changelog, migration guide, and notification to partners
13) Antipatterns
A sandbox that "secretly" touches prod services/databases.
Real card/passport data in stage/sandbox.
Simulators without webhooks/retries: "happy path" only.
No idempotency → duplicate payments/bets.
One common HMAC secret for all partners.
No limits and no transparent 429/Retry-After.
14) Mini snippets
.env.sandbox (example)
dotenv
API_BASE=https://sandbox.api.example.com
OAUTH_ISS=https://sandbox.idp.example.com
PSP_SIM_URL=https://sandbox.psp-sim.example.com
KYC_SIM_URL=https://sandbox.kyc-sim.example.com
WEBHOOK_SECRET_ROTATION_DAYS=90
FEATURE_FORCE_SANDBOX_PAYMENTS=1
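A startup guard can turn this configuration into the "hard block" on real payments described in § 6. The sketch below is an assumption-heavy illustration: it treats `FEATURE_FORCE_SANDBOX_PAYMENTS=1` as "refuse to start unless every external endpoint is a sandbox host" and uses a `sandbox.` hostname prefix as the test. Neither rule is stated in the source, and the function names are hypothetical.

```python
from urllib.parse import urlparse

# Keys from the example .env whose hosts must point at sandbox services.
GUARDED_KEYS = ("API_BASE", "PSP_SIM_URL", "KYC_SIM_URL")


def parse_dotenv(text: str) -> dict:
    """Parse simple KEY=VALUE lines; ignores blanks and # comments."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip()
    return env


def assert_sandbox_only(env: dict) -> None:
    """Raise RuntimeError unless the flag is set and all hosts are sandbox hosts."""
    if env.get("FEATURE_FORCE_SANDBOX_PAYMENTS") != "1":
        raise RuntimeError("sandbox payment block is not enabled")
    for key in GUARDED_KEYS:
        host = urlparse(env.get(key, "")).hostname or ""
        # Assumed convention: sandbox services live under a 'sandbox.' prefix.
        if not host.startswith("sandbox."):
            raise RuntimeError(f"{key} does not point at a sandbox host: {host!r}")
```

Running this check at process start, rather than per request, means a misconfigured deployment fails immediately instead of on the first payment.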
OpenAPI fragment (sandbox server)
yaml
servers:
  - url: https://sandbox.api.example.com/v1
    description: Public Sandbox
Idempotency pseudocode
pseudo
if store.exists(idem_key):
    return store.get(idem_key)
res = do_business()
store.set(idem_key, res, ttl=72h)
return res
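A runnable version of the pseudocode above, as a minimal in-memory sketch. The class and method names are illustrative, and the clock is injectable so TTL expiry can be tested without waiting. It is not safe for concurrent requests with the same key: a real service would need an atomic "reserve key" step in a shared store such as a database or cache.

```python
import time


class IdempotencyStore:
    """Remembers the result of an operation per idempotency key until the TTL expires."""

    def __init__(self, ttl_seconds: float = 72 * 3600, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # key -> (expires_at, result)

    def run(self, key: str, operation):
        """Return the stored result for key, or execute operation and store it."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # replay: the business logic is not executed again
        result = operation()
        self._entries[key] = (now + self._ttl, result)
        return result
```

Usage: `store.run(request_headers["Idempotency-Key"], lambda: do_business())`, where `do_business` stands for the mutation from the pseudocode.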
PSP Simulator Triggers
json
{ "scenario": "payout", "case": "declined_insufficient", "payout_id": "p_123" }
15) Sandbox observability and SLO
Sandbox API uptime ≥ 99.5% (the integration showcase must not go down).
Webhooks p95 ≤ 3 s to 2xx at normal load.
Gateway 5xx error budget ≤ 0.1%.
The documentation portal is available and in sync with the contract.
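The numeric targets above can be checked mechanically, for example in a scheduled job. The sketch below assumes uptime is measured as the share of successful health checks and the 5xx budget as the share of gateway responses; both measurement choices, and all names, are assumptions for illustration.

```python
def slo_report(total_checks: int, failed_checks: int, total_requests: int,
               responses_5xx: int, webhook_p95_seconds: float) -> dict:
    """Compare measured values against the three numeric SLO targets."""
    if total_checks <= 0 or total_requests <= 0:
        raise ValueError("need at least one health check and one request")
    uptime_pct = 100.0 * (total_checks - failed_checks) / total_checks
    rate_5xx_pct = 100.0 * responses_5xx / total_requests
    return {
        "uptime_ok": uptime_pct >= 99.5,              # uptime >= 99.5%
        "webhook_p95_ok": webhook_p95_seconds <= 3.0,  # p95 <= 3 s to 2xx
        "error_budget_ok": rate_5xx_pct <= 0.1,        # 5xx <= 0.1%
    }
```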
16) Governance
Environment owner (SRE/Platform) and API steward (contracts).
RFC process for breaking changes, Deprecation/Sunset calendar.
Separate limits/quotas and "fair-use" pricing for the public sandbox.
Summary
The sandbox is a product for developers, not a "copy of the database." Provide strict isolation, synthetic data, full-fledged simulators with webhooks and retries, determinism through seeds and time travel, feature flags, and transparent limits. Tie everything together with contracts, observability, and governance, and your integrations will become fast, secure, and predictable, and your releases painless.