API Gateway and Routing
1) API Gateway Role in Architecture
An API gateway is a single point of entry into microservices. Is he:- Routes requests (by path/headers/geo/weight/version).
- Protects the perimeter (TLS/mTLS, WAF, DDoS, rate limits, authN/Z).
- Controls traffic (canary/AB, shadow/mirror, circuit breaker, retras, timeouts).
- Standardizes protocols (REST/gRPC/WebSocket), headers, codes.
- Observes (logs, metrics, traces, correlation).
- Transforms and validates (JSON/XML, normalization, schema-validation).
For iGaming, it is also geo-compliance (country/age blocking), smart payment routing and responsible gaming policies on the edge.
2) Routing options
Path-based: `/api/v1/payments/ → payments-svc`.
Host-based: `eu. api. example. com → eu-edge`, `psp. example. com → psp-proxy`.
Header-based: 'X-Client: partner-A' → partner backend; 'Accept: application/grpc'.
Geo-routing: by IP/ASN/country (GDPR/local prohibitions, latency).
Weighted/Canary: '90%' on the old, '10%' on the new version; quick rollback.
Claim-routing: по `JWT. claims. tier/role/region '(for example, high-roller → premium limits).
Failover: asset-asset/asset-liability between data center/cloud and PSP.
3) Perimeter security
TLS everywhere: TLS 1. 2 + on the exterior, mTLS inside (shlyuz↔servisy).
OAuth2/JWT: signature verification, audit 'exp/nbf/aud/scope', JWKS rotation; validation cache with TTL.
HMAC: Body signature for webhooks/payments.
API keys: for system clients; associate with quotas/roles.
WAF: basic rules (injection, protocol anomalies), body size, deny list of countries.
DDoS protection: connection limiting, SYN cookies, rate-limit on IP/key/endpoint.
Zero-Trust: mandatory policies (SPIFFE/SPIRE, service identities), the principle of least rights.
Privacy: PII editing in logs, PAN/IBAN masking, storage policy.
4) Limits, quotas and protection against bursts
Модели: token bucket, leaky bucket, fixed/sliding window.
Borders: per-IP, per-key, per-user, per-route.
- Burst + sustained (e.g. '50 rps burst', '10 rps sustain').
- Retry-Budget and Slow-Loris protection (read timeouts).
- Quota by day/month for partners.
5) Transformations and validation
Normalize headers (trace-id, locale, client-id).
Request/Response-mapping.
Schema validation (OpenAPI/JSON Schema) before proxying - early 4xx failure.
Compression/' Accept-Encoding ', caching (see below).
6) Caching and performance
Edge cache for directories, public metadata, configs (TTL, 'ETag '/' If-None-Match').
Micro-cache 1-5 s for hot GET (reduces peak load).
Negative-cache short (at 404/empty) - careful.
Hedging-requests and competitive requests to replicas at p95> threshold.
7) Timeouts, retreats, resilience
Timeouts: connect/read/write separately; reasonable p95-landmarks.
Retrai: idempotent methods (GET/PUT) with backoff + jitter; retry-budget.
POST idempotency: 'Idempotency-Key' + service/gateway deduplication.
Circuit-Breaker: by errors/latency; half-open trial.
Bulkhead/Pool-isolation by upstream.
8) Versioning and compatibility
Methods:- URI: '/v1/... '(simple, but "noisy" routes).
- Header/Content-Negotiation: `Accept: application/vnd. app. v2+json`.
- Feature-flags/server capability - for minor-change compatibility.
Policy: SemVer, support window (for example, 'v1' = 12-18 months), depriction schedule, compatible responses for extensions (adding fields does not break).
9) Observability and quality control
Correlation: 'traceparent '/' x-request-id' required; we throw it down.
OpenTelemetry: RPS/p50/p95/p99/5xx/4xx, saturation, retry/circuit event metrics.
Logs: structural JSON; disguise PII; levels by code.
Trace sampling: basic 5-10% + target for errors/slow.
SLO/alerts: by routes/clients (uptime, latency, error).
10) Release Traffic Management
Blue-Green DNS/LB switch.
Canary: weight share/segments (region, partner, role).
Shadow/Mirror: copy of traffic to the new version without responding to the client.
Kill-switch: flag to quickly disable the problematic upstream/feature.
11) Smart payment routing (iGaming)
PSP selection rules: geo, currency, amount, risk rate, availability, commission.
Failover PSP: automatic transition at '5xx/timeout'.
Same-method rule: returns/outputs through the original method - check at the edge.
Payment ID: key on 'userId + amount + currency + purpose'.
ETA transparency: the gateway adds statuses and failure causes (not PSP codes).
12) Cross-region policies and compliance
Geo-filters: white/black lists of countries, age restrictions, IP ranges.
Resident data: routing to regional clusters (GDPR/local laws).
Logs and TTL: storage by region, automatic anonymization.
13) Configuration examples
13. 1 NGINX (routing + limit + headers)
nginx http {
map $http_x_request_id $req_id { default $request_id; }
limit_req_zone $binary_remote_addr zone=per_ip:10m rate=20r/s;
server {
listen 443 ssl http2;
server_name api. example. com;
Security add_header Strict-Transport-Security "max-age = 31536000" always;
add_header X-Content-Type-Options nosniff;
Limit on IP location/api/v1/{
limit_req zone=per_ip burst=40 nodelay;
proxy_set_header X-Request-Id $req_id;
proxy_set_header X-Client-Ip $remote_addr;
proxy_read_timeout 5s;
proxy_connect_timeout 1s;
proxy_pass http://payments_v1;
}
Canary traffic by header location/api/v2/{
if ($http_x_canary = "1") { proxy_pass http://payments_v2; }
proxy_pass http://payments_v1;
}
}
}
13. 2 Envoy (JWT, rate limit, retries, outlier)
yaml static_resources:
listeners:
- name: https address: { socket_address: { address: 0. 0. 0. 0, port_value: 443 } }
filter_chains:
- filters:
- name: envoy. filters. network. http_connection_manager typed_config:
"@type": type. googleapis. com/envoy. extensions. filters. network. http_connection_manager. v3. HttpConnectionManager route_config:
name: local_route virtual_hosts:
- name: payments domains: ["api. example. com"]
routes:
- match: { prefix: "/api/v1/payments" }
route:
cluster: payments_v1 timeout: 5s retry_policy:
retry_on: "connect-failure,refused-stream,5xx,retriable-status-codes"
num_retries: 2 per_try_timeout: 2s http_filters:
- name: envoy. filters. http. jwt_authn typed_config:
"@type": type. googleapis. com/envoy. extensions. filters. http. jwt_authn. v3. JwtAuthentication providers:
main:
issuer: "https://auth. example. com/"
remote_jwks: { http_uri: { uri: "https://auth. example. com/.well-known/jwks. json" } }
forward: true rules:
- match: { prefix: "/api/" }
requires: { provider_name: "main" }
- name: envoy. filters. http. ratelimit
- name: envoy. filters. http. router clusters:
- name: payments_v1 connect_timeout: 0. 5s type: STRICT_DNS lb_policy: ROUND_ROBIN load_assignment: { cluster_name: payments_v1, endpoints: [{ lb_endpoints: [{ endpoint: { address: { socket_address: { address: payments, port_value: 8080 }}}}]}] }
outlier_detection: { consecutive_5xx: 5, interval: 5s, base_ejection_time: 30s }
14) Checklists
Before route release
- Authentication scheme (JWT/JWKS, keys, TTL cache).
- Timeouts/Retrays/Idempotency are configured.
- Limits: per-IP, per-key, per-route; partner quotas.
- Validation of the request/response scheme.
- Logs and traces with 'trace-id', PII masks.
- SLO/alerts and dashboard.
- Geo-rules/compliance/age checked.
Transactions and Payments
- PSP Smart Routing: Rules, Priorities, Feilover.
- Same-method is checked at the edge.
- Transparent statuses and error codes for the customer (no raw PSP code).
Releases
- Canary/AB and kill-switch, rollback plan.
- Shadow traffic to new version, comparison of metrics.
- Load testing and p95 targets.
15) Quality metrics (minimum)
Availability/SLO by routes; error rate 5xx/4xx.
Latency p50/p95/p99 (external and internal).
Retry/timeout/circuit events (noise level).
Cache hit-ratio and RPS savings.
Rate-limit hits and dropped requests.
PSP-routing KPIs: successes, TtW, percentage of feilover, commission.
16) Anti-patterns
One total limit "for everything."
"Instant" retreats without jitter (storm intensification).
Trust'X-Forwarded-For 'without normalization and trusted proxy list.
Hard timeouts excluding p95 (false positives).
Hard transformations that break compatibility.
Logs with PII/PAN/secrets.
Mix internal and external API under the same domain/policy.
17) Response patterns and errors (microcopy)
429 Too Many Requests: "Request limit reached. Repeat in N seconds or increase the quota in the partner's office"
401/403: "The token is invalid/expired. Please sign in again"
408/504: "The service responds longer than expected. The request was not accepted"
Idempotency-conflict: "A request with this Idempotency-Key has already been processed (status: success/failure)."
18) Implementation process (steps)
1. Route model: domain/path/region map.
2. Security policies: TLS/mTLS, WAF, authN/Z, keys/JWKS.
3. Reliability: timeouts, retrays, idempotency, circuit-breaker.
4. Observability: logs/metrics/traces, correlation.
5. Cache/perf: edge/micro-cache, compression, connection pools.
6. Payment routing: rules, tests, monitoring.
7. Releases: canary/shadow, kill-switch, rollback plan.
8. Compliance/geo: country filters, data storage, age.
Final cheat sheet
Strict perimeter (TLS/mTLS, WAF, limits) + managed traffic (retrai, circuit, canary).
Validation and transformations at the edge of the → are less than defects "inside."
Observability with trace-id and PII masks is not an option, but a standard.
Smart payment routing and compliance geo are critical for iGaming.
Versioning and deprivation policy - predictability for partners.