Test environments and staging
1) Purpose and scope
Test environments reduce release risk by providing fast feedback under near-production conditions, without touching real players or real money. For iGaming this is critical because of payments (PSPs), KYC/AML, responsible gambling (RG), and seasonal traffic peaks.
2) Environment taxonomy
Dev (local/sandbox): fast developer iteration, minimal dependencies, feature flags.
CI/Test (integration): builds, unit/integration and contract tests, e2e against mocks.
Staging (pre-prod): maximum parity with prod (versions, configs, topology); the "release rehearsal" environment.
Perf/Load: an isolated environment for load/stress testing, so it does not interfere with functional checks.
Sec/Compliance sandboxes: security checks, RG/PII policies, SoD.
DR/Failover lab: disaster scenarios and cross-region failover.
Each environment gets its own namespaces, keyed by `tenant/region/environment`.
3) Parity with prod (staging-first)
Configuration: GitOps, the same schemas and validators; differences live only in values (keys/limits/endpoints), as sketched below.
Topology: the same service versions, network policies, load balancers, and cache/database types.
Data: synthetic or obfuscated; no raw PII.
Telemetry: identical dashboards/alerts (only thresholds and rate limits differ).
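A minimal sketch of the "values only" rule, assuming a hypothetical GitOps layout in which one shared schema validates every environment (file names and keys are illustrative):

```yaml
# schema.yaml: one validator applied to every environment
---
psp:
  endpoint: { type: string, required: true }
  dailyLimitEUR: { type: number, required: true }
# values/staging.yaml: same keys as prod, different values
---
psp:
  endpoint: "https://psp.sandbox.example.com"
  dailyLimitEUR: 1000
# values/prod.yaml
---
psp:
  endpoint: "https://psp.example.com"
  dailyLimitEUR: 100000
```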
4) Data: strategies and hygiene
Synthetic generators: realistic distributions for deposits, bets, and odds; pseudo-BINs; fake documents.
Obfuscated copies: one-way hashing of identifiers, masking of sensitive fields.
Seeding: "scenario sets" (registration→deposit→bet→settlement→withdrawal) with deterministic IDs, as sketched below.
TTL and cleanup policies: auto-purge of old data, volume limits.
Traffic replay (shadow): read-only, with no writes or side effects.
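As an illustration, such a seed scenario could be declared like this; the `SeedScenario` kind and its fields are hypothetical, styled after the platform manifests in section 14 and matching the `dataSeed` reference in 14.1:

```yaml
apiVersion: test.platform/v1    # assumed API group, as in 14.2
kind: SeedScenario
metadata:
  id: "deposit-bet-withdraw"
spec:
  seed: 4217                    # same seed => same IDs on every run
  steps:
    - action: register          # user IDs derived from the seed, not random
    - action: deposit
      amountEUR: 50.00
    - action: bet
      amountEUR: 10.00
      odds: 2.5
    - action: settle
      outcome: win
    - action: withdraw
      amountEUR: 60.00
  ttl: "72h"                    # cleaned up by the TTL policy above
```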
5) Service virtualization and external providers
PSP/KYC/CDN/WAF are emulated with contract-based mocks and variable responses (success, soft/hard decline, timeouts).
Contract tests (consumer-driven): pin interfaces and example payloads.
Test doubles are switched by a flag: `real` / `sandbox` / `virtualized` (see the sketch below).
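A hedged sketch of that switch, assuming a per-provider `mode` key; `virtualizedRef` would point at a mock definition like the one in 14.2:

```yaml
providers:
  psp:
    mode: virtualized                  # real | sandbox | virtualized
    virtualizedRef: "psp.sandbox.v2"   # a ProviderMock, see 14.2
  kyc:
    mode: sandbox
    sandboxEndpoint: "https://kyc-sandbox.example.com"  # illustrative URL
```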
6) Isolation and multi-tenancy
Namespaces per tenant/region in k8s/config stores.
CPU/IO/network quotas and limits, so one test cannot take down the whole environment (see the quota sketch below).
Ephemeral environments per PR/feature branch: spin up in minutes, live for hours or days, then get torn down.
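For the quotas above, a standard Kubernetes `ResourceQuota` per tenant/region/environment namespace is one concrete mechanism (namespace and values are illustrative):

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: test-quota
  namespace: brandA-eu-staging   # tenant/region/environment naming
spec:
  hard:
    requests.cpu: "8"            # one tenant's tests cannot starve the cluster
    requests.memory: 16Gi
    limits.cpu: "16"
    limits.memory: 32Gi
    pods: "50"
```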
7) CI/CD pipeline and gates
Flow: `build → unit → contract → integration → e2e (virtualized) → security scan → staging → canary → prod`.
Gates for promotion to staging (a policy sketch follows this list):
- green unit/contract tests, schema and config linters;
- risk class of the change (policy-as-code), freeze windows;
- staging SLO gates (no red SLIs);
- a successful "release rehearsal" (migrations, configs, feature flags, alerts);
- post-release monitoring checklist;
- four-eyes sign-off on high-risk changes (PSP routing, RG limits, PII export).
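Expressed as policy-as-code, the staging gate might look like the sketch below; the schema is illustrative and not tied to any specific CI product:

```yaml
gate: promote-to-staging
require:
  - check: unit-tests            # green unit/contract tests
    status: passed
  - check: contract-tests
    status: passed
  - check: config-lint           # schema and config linters
    status: passed
  - policy: risk-class
    max: medium                  # above "medium" requires four-eyes sign-off
    override: four-eyes
  - policy: freeze-window
    denyDuring: ["match-peak", "regulatory-freeze"]
  - slo: staging
    redSlis: 0                   # block promotion while any SLI is red
```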
8) Release rehearsals (staging drills)
DB/schema migrations: dry run + reversibility (down migrations), timing estimates.
Config rollout: canary steps, auto-rollback on SLI breach (see the sketch below).
Feature flags: enable for 5-25% of the audience, guardrail checks.
Status page/comms templates: rehearse the messaging (drafts only, nothing published externally).
Incident bot: bot commands that trigger runbook actions as a training alert.
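A sketch of a canary rollout with SLI guardrails and auto-rollback; the field names are assumptions, and the `T+5m`/`T+20m` checkpoints mirror the rollback plan in 14.3:

```yaml
rollout: payments-config
canary:
  steps: [5, 25, 50, 100]        # percent of audience per step
  holdPerStep: "10m"
guardrails:
  - sli: payment_success_rate
    min: 0.995
  - sli: p99_latency_ms
    max: 1500
onBreach:
  action: rollback               # automatic, no human in the loop
  verifyAt: ["T+5m", "T+20m"]    # post-rollback metric checkpoints
```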
9) Non-functional checks
Load/stress/endurance: profiles of real peaks (matches, tournaments), p95/p99 targets, protection against queue overload (see the profile sketch below).
Fault tolerance (chaos): network failures, dropped replicas, provider timeouts, partial failover.
Security: DAST/SAST/IAST, secret scanning, SoD checks, authorization/audit regressions.
Compliance: KYC/AML/RG scenarios, regulatory report exports, data geo-boundaries.
Finance: ledger correctness in fractional/edge cases, idempotency of payments and settlements.
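A match-day load profile could be declared as follows; this is an illustrative format rather than the syntax of any particular load tool:

```yaml
profile: "match-day-peak"
stages:
  - name: pre-match-ramp
    duration: "30m"
    rps: { start: 200, end: 2000 }
  - name: kickoff-spike
    duration: "5m"
    rps: { start: 2000, end: 6000 }  # bet bursts at kickoff
  - name: in-play
    duration: "2h"
    rps: { start: 3000, end: 3000 }
targets:
  p95: "300ms"
  p99: "800ms"
  queueDepthMax: 10000               # guard against queue overload
```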
10) Observability of environments
The same SLI/SLO dashboards and alerts (with softer thresholds).
Synthetics replay user journeys: login, deposit, bet, withdrawal (see the sketch below).
Exemplars/traces are available for RCA; logs contain no PII.
Drift detector: Git ↔ runtime (versions, configs, feature flags).
Cost metrics: $/environment-hour, $/test, dashboards of the "heavy" tests.
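A synthetic journey for staging might be declared like this; the endpoints and the schema are illustrative:

```yaml
synthetic: "deposit-journey"
environment: staging
interval: "5m"
steps:
  - request: "POST /api/login"
    expect: { status: 200 }
  - request: "POST /api/wallet/deposit"
    body: { amountEUR: 10.00, method: "card" }   # synthetic card, pseudo-BIN
    expect: { status: 200, maxLatency: "800ms" }
  - request: "POST /api/bets"
    expect: { status: 201 }
alert:
  availabilitySLO: 99.0            # softer threshold than prod
```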
11) Access, SoD and Security
RBAC/ABAC: access by role/tenant/region; production secrets are never available.
JIT privileges for admin operations, with mandatory audit.
Data policy: no PII, obfuscation, geo-residency.
Network isolation: staging cannot write to external production systems (see the policy sketch below).
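That isolation rule can be enforced with a standard Kubernetes `NetworkPolicy`: default-deny egress, then allow only in-cluster traffic and approved sandbox ranges (namespace and CIDR are illustrative):

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: staging-egress-lockdown
  namespace: brandA-eu-staging
spec:
  podSelector: {}                # applies to every pod in the namespace
  policyTypes: [Egress]          # everything not listed below is denied
  egress:
    - to:
        - namespaceSelector: {}  # in-cluster traffic stays allowed
    - to:
        - ipBlock: { cidr: 203.0.113.0/24 }  # approved provider sandbox range
```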
12) Performance and cost (FinOps)
Ephemeral environments → auto-teardown; overnight schedulers shut down idle clusters (see the sketch below).
Shared base layer (observability, CI cache), but isolated test load.
A catalog of "expensive" tests; concurrency limits; prioritization by QoS class.
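An illustrative FinOps policy tying TTL, idle shutdown, and QoS concurrency together; all field names here are assumptions:

```yaml
policy: env-cost-control
ephemeral:
  defaultTTL: "72h"
  idleShutdown: "2h"             # tear down after 2h without test traffic
schedules:
  - name: night-scale-down
    cron: "0 22 * * *"           # 22:00 daily
    action: scale-to-zero
    exclude: [staging]           # staging stays up for release rehearsals
qos:
  A: { maxConcurrency: 4 }       # "expensive" perf suites
  B: { maxConcurrency: 16 }      # ordinary functional suites
```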
13) Integrations (operational)
Incident bot: commands such as `/staging promote`, `/staging rollback`, `/drill start`; rehearsal timelines.
Release gates: releases are blocked while staging SLOs are red.
Feature flags: a shared flag service, with its own traffic segment per environment.
Metrics API: the same endpoints and metric catalogs, with an environment badge in responses.
14) Example artifacts
14.1 Ephemeral environment manifest per PR
```yaml
apiVersion: env.platform/v1
kind: EphemeralEnv
metadata:
  pr: 4217
  tenant: brandA
  region: EU
spec:
  services: [api, payments, kyc, games]
  dataSeed: "scenario:deposit-bet-withdraw"
  virtualProviders: [psp, kyc]
  ttl: "72h"
  resources:
    qos: B
    limits: { cpu: "8", memory: "16Gi" }
```
14.2 Provider directory (virtualization)
```yaml
apiVersion: test.platform/v1
kind: ProviderMock
metadata:
  id: "psp.sandbox.v2"
spec:
  scenarios:
    - name: success
      rate: 0.85
    - name: soft_decline
      rate: 0.1
    - name: timeout
      rate: 0.05
  latency:
    p95: "600ms"
    p99: "1.5s"
```
14.3 "Release rehearsal" checklist (digest)
DB migrations: timing, reversibility;
configs/feature flags: diff, canary, SLO gates;
alerts/dashboards: wired up, no flapping;
status drafts: ready;
rollback plan: metrics at `T+5m`, `T+20m`.
15) RACI and processes
Env Owner (SRE/Platform): parity, access, cost, dashboards.
Domain Owners: test scenarios, seeding, contracts, KPIs.
QA/SEC/Compliance: checks, reports, RG control.
Release Manager: gates, calendar, freeze/maintenance.
On-call/IC: participate in rehearsals of P1 scenarios.
16) Environment KPIs/KRIs
Lead Time to Staging: commit→staging, median.
Change Failure Rate (post-staging): share of rollbacks in prod.
Parity Score: version/config/topology match (target ≥95%).
E2E Test Coverage of critical paths: login/deposit/bet/withdrawal.
Cost per Test / per Env Hour.
Drift Incidents: Git↔runtime discrepancies.
Security/Compliance Defects: found before prod.
17) Implementation Roadmap (6-10 weeks)
Weeks 1-2: inventory of environments, GitOps catalog, config schemas, baseline data sets, provider contract tests.
Weeks 3-4: staging parity (versions/topology), ephemeral PR environments, PSP/KYC service virtualization, SLO gates.
Weeks 5-6: release rehearsals (checklists, bot commands), load profiles, chaos suites, environment dashboards.
Weeks 7-8: data policy (obfuscation/TTL), SoD/RBAC, FinOps schedulers, cost reports.
Weeks 9-10: DR/failover lab, compliance scenarios, WORM audit, team training.
18) Antipatterns
Staging ≠ prod: different versions/configs/network rules.
Copying prod PII into test → regulatory risk.
No virtualization of external providers → unstable/expensive tests.
No SLO gates/rehearsals → surprises in prod.
"Eternal" test data without TTL → garbage and spurious effects.
Load and functional tests running on the same environment.
No teardown at night/on weekends → burned budget.
Summary
Test environments and staging are production-grade quality infrastructure: parity with prod, clean data and virtualized providers, strict CI/CD gates, release rehearsals, observability, and FinOps. This framework reduces CFR and MTTR, makes releases more predictable, and protects the iGaming platform's revenue and compliance.