
Key management and rotation

Keys are the platform's "trust roots." A reliable key management system (KMS/HSM + processes + telemetry) turns cryptography from a one-time integration into an everyday operation: keys are regularly updated, their use is transparent, compromises are localized, and clients experience a key change without downtime.

1) Goals and principles

Crypto agility: the ability to change the algorithm/key length without large migrations.
Least exposure: private keys never leave KMS/HSM; signing and decryption are performed as remote operations.
Short-lived artifacts: tokens and session keys live for minutes to hours, not weeks.
Dual-key/Dual-cert windows: fail-safe rotations.
Regional & tenant isolation: keys are divided by region and tenant.
Full auditability: immutable operation log, HSM attestation, access control.

2) Key classification

Root CA/Master Key: used extremely rarely, kept in HSM, used to issue intermediate keys or to wrap data keys.
Operational: JWT/event signing, TLS, webhook signing, config/PII encryption.
Session/ephemeral: DPoP, mTLS binding, ECDH-derived keys for a channel/session.
Integration: partner keys (public) and HMAC secrets.
Data Keys (DEK): used in envelope encryption under a KEK; never stored in plaintext.

3) Key identification and usage policy

Each key has a 'kid', which identifies the key in tokens and headers:
yaml
key:
  kid: "eu-core-es256-2025-10"
  alg: "ES256"         # or EdDSA, RSA-PSS, AES-GCM, XChaCha20-Poly1305
  purpose: ["jwt-sign","webhook-sign"]
  scope: ["tenant:brand_eu","region:EE"]
  status: "active"       # active | next | retiring | revoked
  created_at: "2025-10-15T08:00:00Z"
  valid_to: "2026-01-15T08:00:00Z"

Rules: "one purpose, one key" (minimum sharing), with an explicit scope and validity period for every key.
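The rule can be enforced in code before any signing request reaches KMS. Below is a minimal sketch, assuming the key metadata above is loaded into a record; the `KeyRecord` type and the `may_sign` helper are hypothetical names introduced for illustration, not part of any KMS API.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Tuple


@dataclass(frozen=True)
class KeyRecord:
    """Hypothetical in-memory view of the key metadata shown above."""
    kid: str
    alg: str
    purpose: Tuple[str, ...]
    scope: Tuple[str, ...]
    status: str  # active | next | retiring | revoked
    created_at: datetime
    valid_to: datetime


def may_sign(key: KeyRecord, purpose: str, scope: str, now: datetime) -> bool:
    """Allow signing only with an active, in-date key for a declared purpose and scope."""
    return (
        key.status == "active"
        and purpose in key.purpose
        and scope in key.scope
        and key.created_at <= now < key.valid_to
    )


key = KeyRecord(
    kid="eu-core-es256-2025-10",
    alg="ES256",
    purpose=("jwt-sign", "webhook-sign"),
    scope=("tenant:brand_eu", "region:EE"),
    status="active",
    created_at=datetime(2025, 10, 15, 8, 0, tzinfo=timezone.utc),
    valid_to=datetime(2026, 1, 15, 8, 0, tzinfo=timezone.utc),
)
now = datetime(2025, 11, 1, tzinfo=timezone.utc)
```

A request for an undeclared purpose (for example, encrypting PII with this signing key) is refused by the same check that refuses an expired key.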

4) Key lifecycle (KMS/HSM)

1. Generate: in HSM/KMS, with export policy = denied.
2. Publish: for asymmetric keys, a JWKS/certificate with the 'kid'.
3. Use: remote operations (sign/decrypt) under controlled IAM.
4. Rotate: introduce the 'next' key and enable dual-accept.
5. Retire: move the old key to 'retiring', then to 'revoked'.
6. Destroy: destroy the key material (with a purge record) after the dispute window.
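The lifecycle can be modeled as a small state machine so that a key never skips a step. This is a sketch under assumptions: the status names come from the key metadata above, while the `destroyed` state and the `transition` helper are hypothetical additions for illustration.

```python
# Allowed status transitions. "destroyed" is a hypothetical terminal state
# representing step 6; emergency revocation is allowed from any live state.
ALLOWED = {
    "next": {"active", "revoked"},
    "active": {"retiring", "revoked"},
    "retiring": {"revoked"},
    "revoked": {"destroyed"},
    "destroyed": set(),
}


def transition(current: str, target: str) -> str:
    """Return the new status, or raise if the lifecycle does not permit the move."""
    if target not in ALLOWED.get(current, set()):
        raise ValueError(f"illegal transition {current} -> {target}")
    return target


status = "next"
for step in ("active", "retiring", "revoked", "destroyed"):
    status = transition(status, step)
```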

5) Rotation: Strategies

Scheduled: by calendar (for example, every 1-3 months for JWT signing keys, 6-12 months for TLS certificates).
Rolling: switch consumers gradually (JWKS already contains the new key; the issuer starts signing with it once caches have warmed up).
Forced (security): immediate rotation on compromise; short dual-accept window, aggressive expiry of artifacts.
Staggered per region/tenant: so that a failed rotation does not take down every region at once.

The golden rule: publish the new key first, then start signing with it, and revoke the old key only after the artifacts it signed have expired.
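The ordering in the golden rule can be turned into concrete timestamps. The sketch below assumes three inputs that a deployment would have to supply: the verifiers' JWKS cache TTL, the longest token lifetime, and the dual-accept window. `RotationPlan` and `plan_rotation` are hypothetical names.

```python
from datetime import datetime, timedelta, timezone
from typing import NamedTuple


class RotationPlan(NamedTuple):
    publish_at: datetime
    sign_with_new_at: datetime
    revoke_old_at: datetime


def plan_rotation(publish_at: datetime, cache_ttl: timedelta,
                  max_token_ttl: timedelta, dual_window: timedelta) -> RotationPlan:
    """Publish first, sign with the new key after caches refresh, revoke the old key last."""
    # Verifiers may hold a stale JWKS for up to cache_ttl after publication.
    sign_with_new_at = publish_at + cache_ttl
    # The old key must stay verifiable until the last token it signed has expired.
    earliest_revoke = sign_with_new_at + max_token_ttl
    revoke_old_at = max(publish_at + dual_window, earliest_revoke)
    return RotationPlan(publish_at, sign_with_new_at, revoke_old_at)


plan = plan_rotation(
    publish_at=datetime(2025, 10, 15, 8, 0, tzinfo=timezone.utc),
    cache_ttl=timedelta(seconds=60),
    max_token_ttl=timedelta(hours=1),
    dual_window=timedelta(hours=48),
)
```

Taking the maximum of the two candidates means a short dual window can never cut off tokens that are still valid.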

6) Dual-key window

We publish JWKS with both the old and the new 'kid'.
Verifiers accept both.
After N minutes/hours, the issuer starts signing with the new key.
We monitor the share of verifications on the old and the new 'kid'.
Once the target share is reached, we retire the old key.

yaml
jwks:
  keys:
    - kid: "eu-core-es256-2025-10" # new
      alg: "ES256"
      use: "sig"
      crv: "P-256"
      x: "<...>"
      y: "<...>"
    - kid: "eu-core-es256-2025-07" # old
      alg: "ES256"
      use: "sig"
      ...
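On the verifier side, dual-accept means the key is chosen by the token's 'kid' from whatever the JWKS currently publishes. A minimal sketch of that lookup follows; `select_verification_key` is a hypothetical helper, the coordinates are placeholders, and the actual signature check is left to a JOSE library.

```python
def select_verification_key(jwks: dict, header: dict) -> dict:
    """Return the published key matching the token's kid; unknown kids are rejected."""
    kid = header.get("kid")
    for key in jwks["keys"]:
        if key["kid"] == kid:
            if key.get("use") != "sig":
                raise ValueError("key is not published for signatures")
            return key
    raise KeyError(f"unknown kid: {kid!r}")


# During the dual-key window both the new and the old key are published.
jwks = {"keys": [
    {"kty": "EC", "use": "sig", "crv": "P-256", "kid": "eu-core-es256-2025-10", "x": "...", "y": "..."},
    {"kty": "EC", "use": "sig", "crv": "P-256", "kid": "eu-core-es256-2025-07", "x": "...", "y": "..."},
]}

new_key = select_verification_key(jwks, {"alg": "ES256", "kid": "eu-core-es256-2025-10"})
old_key = select_verification_key(jwks, {"alg": "ES256", "kid": "eu-core-es256-2025-07"})
```

Retiring the old key is then a matter of removing it from the published set; verifiers need no code change.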

7) Signature and validation policies

Default algorithms: signature ES256/EdDSA; RSA-PSS where required.
Reject 'none' and weak algorithms; enforce an algorithm allowlist on the verifier side.
Clock skew: we allow ± 300 s and log deviations.
Key pinning (internal services) and a short JWKS cache TTL (30-60 s).
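These policies translate into checks that run before the signature is verified. The sketch below covers only the allowlist and the skew tolerance; it assumes standard `exp`/`nbf` claims in seconds, and `check_token_metadata` is a hypothetical helper that does not verify the signature itself.

```python
import time
from typing import Optional

ALLOWED_ALGS = frozenset({"ES256", "EdDSA"})  # extend with RSA-PSS only where required
MAX_SKEW_S = 300


def check_token_metadata(header: dict, claims: dict, now: Optional[float] = None) -> None:
    """Raise ValueError if the algorithm is not allowlisted or the token is out of date."""
    now = time.time() if now is None else now
    alg = header.get("alg")
    if alg not in ALLOWED_ALGS:  # also rejects "none" and a missing alg
        raise ValueError(f"algorithm not allowed: {alg!r}")
    if "exp" not in claims:
        raise ValueError("token has no exp claim")
    if now > claims["exp"] + MAX_SKEW_S:
        raise ValueError("token expired")
    nbf = claims.get("nbf")
    if nbf is not None and now < nbf - MAX_SKEW_S:
        raise ValueError("token not yet valid")


# A token that expired 100 s ago is still inside the 300 s tolerance.
check_token_metadata({"alg": "ES256"}, {"exp": 900}, now=1000)
```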

8) Envelope encryption and KDF

Store data like this:

ciphertext = AEAD_Encrypt(DEK, plaintext, AAD=tenant|region|table|row_id)
DEK = KMS.Decrypt(KEK, EncryptedDEK)  // on access
EncryptedDEK = KMS.Encrypt(KEK, DEK)  // on write

KEK (Key Encryption Key) is stored in KMS/HSM, rotated regularly.
A DEK is created per object/batch; when the KEK rotates, we re-wrap the DEKs (fast, with no re-encryption of the data itself).
For streams: ECDH + HKDF to derive short-lived channel keys.
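The HKDF step for stream keys can be written with the standard library alone. The sketch below follows the extract-and-expand construction from RFC 5869 with SHA-256; the shared secret, salt, and `info` labels are placeholder values, and in practice the shared secret would come from the ECDH exchange.

```python
import hashlib
import hmac


def hkdf_sha256(ikm: bytes, salt: bytes, info: bytes, length: int = 32) -> bytes:
    """HKDF (RFC 5869) with HMAC-SHA256: extract a PRK, then expand it to `length` bytes."""
    hash_len = hashlib.sha256().digest_size
    if not 0 < length <= 255 * hash_len:
        raise ValueError("requested length out of range")
    # Extract: an empty salt defaults to a string of hash_len zero bytes.
    prk = hmac.new(salt or b"\x00" * hash_len, ikm, hashlib.sha256).digest()
    # Expand: T(i) = HMAC(PRK, T(i-1) || info || i)
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


shared_secret = b"\x01" * 32  # placeholder for the ECDH output
salt = b"eu-core"             # placeholder, e.g. a per-region constant
send_key = hkdf_sha256(shared_secret, salt, b"channel:client->server")
recv_key = hkdf_sha256(shared_secret, salt, b"channel:server->client")
```

Binding the direction into `info` gives each side of the channel its own key from one shared secret.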

9) Regionality and multi-tenant

Keys and JWKS are regionalized: 'eu-core', 'latam-core' are different sets of keys.
IAM and audit are separated by tenant/region; keys do not cross data-residency boundaries.
The 'kid' encodes the trust domain as a prefix: 'eu-core-es256-2025-10'.
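A verifier can use the prefix to refuse keys from another trust domain before looking anything up. The sketch below assumes the 'kid' always has the shape `<domain>-<alg>-<yyyy>-<mm>`, which is inferred from the single example above; `parse_kid` and `belongs_to` are hypothetical helpers.

```python
from typing import NamedTuple


class KidParts(NamedTuple):
    trust_domain: str
    alg: str
    period: str


def parse_kid(kid: str) -> KidParts:
    """Split a kid of the assumed form <domain>-<alg>-<yyyy>-<mm>; the domain may contain hyphens."""
    parts = kid.rsplit("-", 3)
    if len(parts) != 4 or not all(parts):
        raise ValueError(f"unexpected kid format: {kid!r}")
    domain, alg, year, month = parts
    return KidParts(domain, alg, f"{year}-{month}")


def belongs_to(kid: str, trust_domain: str) -> bool:
    """True if the kid was issued inside the given trust domain."""
    return parse_kid(kid).trust_domain == trust_domain


parts = parse_kid("eu-core-es256-2025-10")
```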

10) Integration secrets (HMAC, API keys)

Store in the KMS-backed Secret Store, issue via short-lived client secrets (rotation policy ≤ 90 days).
Support for two active secrets (dual-secret) during rotation.
For webhooks - timestamp + HMAC body signature; time window ≤ 5 min.
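Webhook verification with a timestamp, a 5-minute window, and two active secrets might look like the sketch below. The `timestamp.body` message layout and the hex encoding are assumptions, not a format stated in this article; both sides must simply agree on one.

```python
import hashlib
import hmac
import time
from typing import Iterable, Optional

MAX_AGE_S = 300  # time window of at most 5 minutes


def sign_webhook(secret: bytes, timestamp: int, body: bytes) -> str:
    """HMAC-SHA256 over '<timestamp>.<body>' (assumed layout), hex-encoded."""
    message = str(timestamp).encode() + b"." + body
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def verify_webhook(secrets: Iterable[bytes], timestamp: int, body: bytes,
                   signature: str, now: Optional[float] = None) -> bool:
    """Accept a fresh request signed with any currently active secret (dual-secret)."""
    now = time.time() if now is None else now
    if abs(now - timestamp) > MAX_AGE_S:
        return False  # stale or future-dated: possible replay
    return any(
        hmac.compare_digest(sign_webhook(secret, timestamp, body), signature)
        for secret in secrets
    )


old_secret, new_secret = b"old-secret", b"new-secret"
body = b'{"event":"payout.created"}'
signature = sign_webhook(old_secret, 1_700_000_000, body)
```

During rotation the receiver holds both secrets, so a sender that has not yet switched keeps working.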

11) Access control and processes

IAM matrix: who can 'generate', 'sign', 'decrypt', 'rotate', 'destroy' (minimum roles).
4-eye principle: sensitive operations require two confirmations.
Change windows: scheduled windows for enabling a new key, tested first in canary regions.
Runbooks: procedure templates for scheduled and forced rotations.
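The IAM matrix and the 4-eye principle can be combined in a single authorization check. This is an illustrative sketch: the role names, the role-to-operation mapping, and the choice of which operations count as sensitive are all assumptions.

```python
from typing import Set

# Hypothetical roles; the operations are the ones listed in the IAM matrix above.
IAM_MATRIX = {
    "signer-service": {"sign"},
    "data-service": {"decrypt"},
    "key-admin": {"generate", "rotate"},
    "security-officer": {"rotate", "destroy"},
}
SENSITIVE = {"generate", "rotate", "destroy"}  # assumed to require two confirmations


def authorize(role: str, operation: str, confirmed_by: Set[str]) -> bool:
    """Allow an operation if the role permits it and, when sensitive, two people confirmed it."""
    if operation not in IAM_MATRIX.get(role, set()):
        return False
    if operation in SENSITIVE:
        return len(confirmed_by) >= 2  # a set holds distinct identities only
    return True


allowed = authorize("security-officer", "destroy", {"alice", "bob"})
```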

12) Observability and audit

Metrics:
  • `sign_p95_ms`, `decrypt_p95_ms`, `jwks_skew_ms`,
  • usage by `kid`, `old_kid_usage_ratio`,
  • `invalid_signature_rate`, `decrypt_failure_rate`.
Logs/Audit:
  • Each signing/decryption operation is logged as 'who/what/when/where/kid/purpose'.
  • History of key status changes and rotation/revocation requests.
  • HSM attestation, access logs for key material.
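The `old_kid_usage_ratio` metric is what drives the decision to retire a key. A minimal sketch of computing it from per-`kid` verification counts follows; the 1% threshold and the minimum sample size are assumed values, not recommendations from this article.

```python
from collections import Counter


def old_kid_usage_ratio(verifications: Counter, old_kid: str) -> float:
    """Share of verifications that still used the old kid (0.0 when there is no traffic)."""
    total = sum(verifications.values())
    return verifications[old_kid] / total if total else 0.0


def ready_to_retire(verifications: Counter, old_kid: str,
                    threshold: float = 0.01, min_samples: int = 1000) -> bool:
    """Retire only when enough traffic was observed and the old kid is nearly unused."""
    total = sum(verifications.values())
    return total >= min_samples and old_kid_usage_ratio(verifications, old_kid) <= threshold


counts = Counter({"eu-core-es256-2025-10": 995, "eu-core-es256-2025-07": 5})
ratio = old_kid_usage_ratio(counts, "eu-core-es256-2025-07")
```

The sample-size guard keeps a quiet period (for example, night traffic in one region) from looking like a completed migration.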

13) Playbooks (incidents)

1. Signature key compromise

Immediately revoke the old 'kid' (or move it to 'retiring' with a minimal window), publish a new JWKS, shorten token TTLs, force logout and invalidate refresh tokens, notify integration owners, run a retrospective audit.

2. Mass 'INVALID_SIGNATURE' after rotation

Check the JWKS cache and clock skew, restore dual-accept, extend the window, notify clients.

3. Increase in KMS/HSM latency

Do not enable a local signing cache; instead, batch/queue at the issuer, autoscale the HSM proxy, and prioritize critical flows.

4. Failure of one region

Activate regional isolation procedures; do not "pull" keys from other regions; degrade the functions that depend on signatures in the failed region.

14) Testing

Contract: JWKS correctness, correct 'kid'/alg/use, client compatibility.
Negative: forged signature, obsolete 'kid', incorrect alg, clock skew.
Chaos: instant rotation, KMS unavailability, time drift.
Load: peak signatures (JWT/webhooks), peak decryptions (PII/payouts).
E2E: the dual-key window end to end: publish, verify, shift traffic, retire the old key.

15) Configuration Example (YAML)

yaml
crypto:
  regions:
    - id: "eu-core"
      jwks_url: "https://sts.eu/.well-known/jwks.json"
      rotation:
        jwt_sign: { interval_days: 30, window_dual: "48h" }
        webhook: { interval_days: 60, window_dual: "72h" }
        kek: { interval_days: 90, action: "rewrap" }
      alg_policy:
        sign: ["ES256","EdDSA"]
        tls: ["TLS1.2+","ECDSA_P256"]
      publish:
        jwks_cache_ttl: "60s"
      audit:
        hsm_attestation_required: true
        two_person_rule: true

16) Example of JWKS and markers in artifacts

JWT header fragment:
json
{ "alg":"ES256", "kid":"eu-core-es256-2025-10", "typ":"JWT" }
JWKS (public part):
json
{ "keys":[
{"kty":"EC","use":"sig","crv":"P-256","kid":"eu-core-es256-2025-10","x":"...","y":"..."},
{"kty":"EC","use":"sig","crv":"P-256","kid":"eu-core-es256-2025-07","x":"...","y":"..."}
]}

17) Anti-patterns

Long-lived keys kept "for years" and shared across all regions.
Rotation "at one moment" without dual-accept.
Exporting private keys from KMS/HSM "for speed."
Mixing purposes: signing JWTs and encrypting data with the same key.
No HSM logs/attestation and no IAM restrictions.
No re-wrap mechanism for DEKs during KEK rotation.
Hand-managed "secrets" in env variables instead of the Secret Store.

18) Pre-production checklist

  • All private keys are in KMS/HSM; the IAM matrix and the 4-eye principle are configured.
  • Algorithm policies, key lengths, and lifetimes are approved.
  • The dual-key process is enabled, with monitoring of the usage share per 'kid'.
  • JWKS is published with a short TTL and cache warming; clients accept ≥ 2 keys.
  • Envelope encryption: the KEK rotates, DEKs are re-wrapped without downtime.
  • Regional isolation and separate key sets per tenant.
  • Playbooks for compromise, rolling, and forced rotation; training runs completed.
  • Metrics (`old_kid_usage_ratio`, `invalid_signature_rate`) and alerts are enabled.
  • The contract/negative/chaos/load/E2E test suite has passed.
  • Documentation for integrations: how to handle a 'kid' change, which windows apply, and which error codes to expect.

Conclusion

Key management is an operational discipline: KMS/HSM as the source of truth, regular and safe rotations with dual-key windows, regional and tenant isolation, envelope encryption, and observability. Follow these rules and you get a cryptographic layer that scales, withstands incidents, and is easy to explain to an auditor, while developers and integrators get through any key change painlessly.
