API Gateway Architecture and Security
TL;DR
The API gateway is the single point of policy enforcement (authZ, rate limiting, transformation, audit) and the trust boundary between the outside world and your services. Success rests on Zero-Trust (mTLS/JWT), policy-as-code, SLO-oriented traffic management, and orthogonal observability. Build edge gateway → BFF → service mesh; maintain versioning and feature flags; automate the protection of webhooks and keys; test canary releases.
1) Roles and placement patterns
Edge/API Gateway (north-south): the outer boundary. TLS termination, WAF, DDoS protection, authN/Z, rate limits/quotas, CORS, transformations, caching, webhooks.
BFF (Backend-for-Frontend): adaptation layer for specific clients (web/mobile/partners). Schemas, aggregation, limits, response caching.
Internal Gateway (east-west)/Service Mesh Ingress: internal service-to-service authorization, mTLS, policy-based routing.
gRPC/REST/GraphQL gateway: single point for protocol translation and schema validation.
Anti-patterns: "everything through one monolithic gateway with no environment isolation," "business logic hidden in plugins," "manual rule management."
2) Trust model and authentication
TLS 1.2+/1.3 at the perimeter, HSTS on public domains; internally, mTLS between the gateway and services.
OAuth2/OIDC: Authorization Code (PKCE) for clients; client-credentials for server-to-server integrations; JWT with a short TTL and key rotation (JWKS).
HMAC signatures for partner integrations and webhooks (per-client key, SHA-256/512, timestamp verification and anti-replay).
API keys: only as an additional factor or for tracking; restrict scope, IP, and lifetime.
- Separate authN (who) from authZ (what they may do). Use attributes (scopes, roles, tenant, risk flags).
- All tokens carry aud/iss/exp/nbf; clock skew ≤ 60s; kid is mandatory and the JWKS cache lives ≤ 5 min.
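The claim rules above can be sketched as a small check that runs after the token signature has already been verified. This is a minimal illustration, not a replacement for a JWT library: the function name, the error strings, and the issuer/audience values in the usage note are assumptions made for the example.

```python
import time
from typing import Any, List, Mapping, Optional

CLOCK_SKEW_SECONDS = 60  # mirrors the "clock skew ≤ 60s" rule above


def validate_claims(
    claims: Mapping[str, Any],
    expected_iss: str,
    expected_aud: str,
    now: Optional[float] = None,
) -> List[str]:
    """Return a list of violations; an empty list means the claims pass.

    Assumes the signature was verified earlier against a key found by kid.
    """
    now = time.time() if now is None else now
    errors: List[str] = []

    if claims.get("iss") != expected_iss:
        errors.append("iss mismatch")

    # aud may be a single string or a list of strings
    aud = claims.get("aud")
    audiences = aud if isinstance(aud, list) else [aud]
    if expected_aud not in audiences:
        errors.append("aud mismatch")

    exp = claims.get("exp")
    if not isinstance(exp, (int, float)) or now > exp + CLOCK_SKEW_SECONDS:
        errors.append("expired or missing exp")

    nbf = claims.get("nbf")
    if not isinstance(nbf, (int, float)) or now < nbf - CLOCK_SKEW_SECONDS:
        errors.append("not yet valid or missing nbf")

    return errors
```

A gateway filter would reject the request when the list is non-empty, for example `validate_claims(claims, "https://idp.example", "payments-api")`, where both values are hypothetical.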
3) Authorization and policies (Zero-Trust)
ABAC/RBAC on gateway: rules over claims + request context (IP/ASN/geo/tenant).
Policy-as-Code (for example, OPA/Rego): rules stored in Git, validated in CI, rolled out via canary.
Multi-tenancy: isolation by 'X-Tenant-Id', SSO at the tenant boundary; quotas/limits per tenant.
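To make "rules over claims + request context" concrete, here is a default-deny check written in plain Python rather than Rego. Every name in it (the `RequestContext` fields, the route table, the blocked-country set) is a hypothetical stand-in; a real deployment would keep such rules in a policy engine under version control.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Tuple


@dataclass(frozen=True)
class RequestContext:
    header_tenant: str       # value of the X-Tenant-Id header
    token_tenant: str        # tenant claim from the verified token
    scopes: FrozenSet[str]   # scopes from the verified token
    country: str             # geo attribute resolved by the gateway
    method: str
    path: str


# Hypothetical policy data; in practice this lives in Git next to the rules.
BLOCKED_COUNTRIES: FrozenSet[str] = frozenset({"XX"})
REQUIRED_SCOPE: Dict[Tuple[str, str], str] = {
    ("GET", "/payments"): "payments:read",
    ("POST", "/payments"): "payments:write",
}


def is_allowed(ctx: RequestContext) -> bool:
    """Default deny: every rule must pass for the request to go through."""
    if ctx.header_tenant != ctx.token_tenant:
        return False  # tenant isolation: header must match the token
    if ctx.country in BLOCKED_COUNTRIES:
        return False  # context-based rule (geo)
    required = REQUIRED_SCOPE.get((ctx.method, ctx.path))
    if required is None:
        return False  # unknown route: deny
    return required in ctx.scopes
```

Keeping the decision a pure function of claims and context is what makes it testable in CI before any rollout.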
4) Traffic management and reliability
Rate limiting: leaky/token bucket, granularity: key/tenant/route/BIN/country (for payment APIs).
Quotas: day/month, separate for heavy operations (for example, reports).
Burst control and dynamic throttling based on load and SLO.
Circuit breaker: opening on errors/latency; outlier detection by upstream.
Retry with backoff + jitter; idempotency: 'Idempotency-Key' header + TTL window + stored result.
Timeouts: client < gateway < upstream; reasonable p95 reference points (e.g. 1.5s/3s/5s).
Failover/Canary: percentage-based (weighted) routing, optional session affinity, blue/green.
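The token bucket mentioned above can be sketched in a few lines. This is an in-memory, single-process illustration with an injected clock so it stays deterministic; the class names and the (tenant, route) key are assumptions, and a real gateway would normally keep counters in a shared store.

```python
from typing import Dict, Tuple


class TokenBucket:
    """Refills at rate_per_sec up to capacity; each request spends one token."""

    def __init__(self, rate_per_sec: float, capacity: float, now: float = 0.0):
        self.rate = rate_per_sec
        self.capacity = capacity
        self.tokens = capacity      # start full so short bursts are allowed
        self.updated = now

    def allow(self, now: float, cost: float = 1.0) -> bool:
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = max(self.updated, now)
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


class RateLimiter:
    """One bucket per (tenant, route), matching the granularity listed above."""

    def __init__(self, rate_per_sec: float, capacity: float):
        self.rate = rate_per_sec
        self.capacity = capacity
        self._buckets: Dict[Tuple[str, str], TokenBucket] = {}

    def allow(self, tenant: str, route: str, now: float) -> bool:
        key = (tenant, route)
        bucket = self._buckets.get(key)
        if bucket is None:
            bucket = self._buckets[key] = TokenBucket(self.rate, self.capacity, now)
        return bucket.allow(now)
```

The capacity sets the tolerated burst, while the refill rate sets the sustained limit; a request that gets `False` would typically be answered with HTTP 429.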
5) Transformations and validators
Schemas: OpenAPI/JSON Schema for REST; Protobuf for gRPC; SDL for GraphQL. Request/response validation on the gateway.
gRPC↔REST transcoding, GraphQL federation (for BFF).
Header normalization (trace IDs, security headers), response filtering (PII redaction).
CORS: allow-lists, a correct 'Vary' header, no '*' wildcard for requests that carry 'Authorization'.
Compression and response caching (ETag/Cache-Control) for safe GET requests.
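The CORS rule above (allow-list, correct 'Vary', no wildcard with credentials) might look like the following sketch. The origins are hypothetical and the function covers only the origin-related headers, not preflight handling.

```python
from typing import Dict, FrozenSet, Optional

# Hypothetical allow-list; real values come from gateway configuration.
ALLOWED_ORIGINS: FrozenSet[str] = frozenset(
    {"https://app.example.com", "https://partner.example.com"}
)


def cors_headers(origin: Optional[str]) -> Dict[str, str]:
    """Build origin-related CORS response headers for one request."""
    # Vary: Origin is always sent so shared caches never serve a response
    # computed for one origin to a different origin.
    headers = {"Vary": "Origin"}
    if origin in ALLOWED_ORIGINS:
        # Echo the exact origin; "*" cannot be combined with credentials.
        headers["Access-Control-Allow-Origin"] = origin
        headers["Access-Control-Allow-Credentials"] = "true"
    return headers
```

An origin that is not on the list simply receives no `Access-Control-Allow-Origin` header, so the browser blocks the cross-origin read.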
6) Perimeter security
WAF: OWASP Top 10 rules, positive security model for critical routes, virtual patching.
Bot protection: rate-based signatures, device fingerprinting, CAPTCHAs for public endpoints.
DDoS shield: upstream (cloud) + local limits; geo/ASN block lists.
CSP/Referrer-Policy/X-Frame-Options - if the gateway serves static/widgets.
WebSockets/SSE/WebTransport: separate limit and timeout profiles; token-based auth renewal.
7) Webhooks: Security and Delivery
Each recipient has its own secret; signature 'HMAC(secret, timestamp|path|body)'; valid time window (for example, 5 minutes).
Idempotency on the receiving side: dedup by 'event_id'.
Retries: exponential backoff, maximum of N attempts; status endpoint for the handshake.
mTLS/IP allow-list; on-demand replay with restrictions.
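A receiver-side sketch of the scheme above, using only the standard library: HMAC-SHA256 over a canonical string, a 5-minute window, and dedup by event id. The canonical layout (timestamp, method, path, body hash joined by newlines) and the return values are assumptions for illustration; the in-memory set stands in for a persistent dedup store.

```python
import hashlib
import hmac
from typing import Set

TOLERANCE_SECONDS = 300  # the "valid time window" of 5 minutes


def sign(secret: bytes, timestamp: int, method: str, path: str, body: bytes) -> str:
    """Compute the hex signature over an assumed canonical string."""
    body_hash = hashlib.sha256(body).hexdigest()
    message = f"{timestamp}\n{method}\n{path}\n{body_hash}".encode()
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def receive(
    secret: bytes,
    timestamp: int,
    method: str,
    path: str,
    body: bytes,
    signature: str,
    event_id: str,
    seen_event_ids: Set[str],
    now: int,
) -> str:
    """Return "accepted", "duplicate", or "rejected"."""
    if abs(now - timestamp) > TOLERANCE_SECONDS:
        return "rejected"  # stale or future-dated: possible replay
    expected = sign(secret, timestamp, method, path, body)
    if not hmac.compare_digest(expected, signature):
        return "rejected"  # constant-time comparison avoids timing leaks
    if event_id in seen_event_ids:
        return "duplicate"  # already processed; acknowledge without side effects
    seen_event_ids.add(event_id)
    return "accepted"
```

Treating a duplicate as a successful acknowledgement rather than an error keeps sender retries from escalating.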
8) Observability and audit
Logs: never log secrets/PAN/PII; correlate by 'trace_id'/'span_id'; apply masking.
Metrics: RPS, error rate by class, latency p50/p95/p99, open circuits, retry rate, 4xx vs 5xx, saturation.
Traces: W3C Trace Context; propagate 'traceparent'/'tracestate' to the upstream.
Audit: a separate "who called/changed what" stream in immutable storage; policy events (access-denied, quota-hit).
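Propagating 'traceparent' could be sketched as follows: keep the incoming trace id, mint a new span id for the gateway hop, and start a fresh trace when the header is missing or malformed. The function name is hypothetical, only version "00" is accepted here for simplicity, and marking new traces as sampled ("01") is an assumption, not a recommendation.

```python
import re
import secrets
from typing import Optional

# version "00" - 32 hex trace-id - 16 hex parent-id - 2 hex flags
_TRACEPARENT = re.compile(r"00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})")


def next_traceparent(incoming: Optional[str]) -> str:
    """Return the traceparent value to send to the upstream."""
    match = _TRACEPARENT.fullmatch(incoming or "")
    # All-zero trace-id or parent-id is invalid under W3C Trace Context.
    if match and match.group(1) != "0" * 32 and match.group(2) != "0" * 16:
        trace_id, flags = match.group(1), match.group(3)
    else:
        trace_id, flags = secrets.token_hex(16), "01"  # assumed: sample new traces
    span_id = secrets.token_hex(8)  # the gateway's own span becomes the parent
    return f"00-{trace_id}-{span_id}-{flags}"
```

Logging the same trace id on the gateway and upstream sides is what allows the log correlation described above.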
9) Secrets and cryptography
Key storage: KMS/Vault, rotation every 90 days (or more often), separate read roles.
Certificates: automatic issuance/renewal (ACME); pinning for mobile clients (with TOFU/HPKP-like caution).
JWKS rotation: two active keys (old/new), clear rollover windows.
TLS crypto profiles: prefer ECDHE, prohibit vulnerable ciphers/protocols.
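The "two active keys" rotation can be illustrated with a small key cache on the verifier side: keys are looked up by kid, the key set is cached for at most 5 minutes, and an unknown kid triggers one refetch so a newly published key is picked up during the rollover window. The class and the injected `fetch` callable are hypothetical; key material is represented as opaque strings.

```python
from typing import Callable, Dict, Optional

JWKS_CACHE_TTL_SECONDS = 300  # "JWKS cache ≤ 5 min"


class JwksCache:
    def __init__(self, fetch: Callable[[], Dict[str, str]]):
        self._fetch = fetch                    # returns {kid: key material}
        self._keys: Dict[str, str] = {}
        self._fetched_at: Optional[float] = None

    def get_key(self, kid: str, now: float) -> Optional[str]:
        stale = (
            self._fetched_at is None
            or now - self._fetched_at > JWKS_CACHE_TTL_SECONDS
        )
        # Refetch when the cache expired or when a token names a kid we have
        # not seen yet (the new key during rotation).
        if stale or kid not in self._keys:
            self._keys = dict(self._fetch())
            self._fetched_at = now
        return self._keys.get(kid)  # None means: reject the token
```

Because the old key stays published alongside the new one, tokens signed just before the switch keep validating until they expire. A production version would also rate-limit refetches caused by unknown kids.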
10) Compliance and data
PCI DSS: PAN-safe flows, tokenization; never proxy raw PAN through plugins.
GDPR/DSAR: region/tenant routing, data residency, delete/anonymize.
PII exposure limit: filtering fields on the gateway, encrypting sensitive headers.
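Field filtering on the gateway can be as simple as a recursive pass over the decoded JSON body. The field names below are hypothetical examples, and masking with a fixed placeholder is only one option; dropping the field or tokenizing it are alternatives.

```python
from typing import Any, FrozenSet

# Hypothetical list of sensitive field names (compared case-insensitively).
SENSITIVE_FIELDS: FrozenSet[str] = frozenset({"pan", "email", "ssn"})
MASK = "***"


def redact(value: Any) -> Any:
    """Return a copy of a JSON-like value with sensitive fields masked."""
    if isinstance(value, dict):
        return {
            key: MASK if str(key).lower() in SENSITIVE_FIELDS else redact(item)
            for key, item in value.items()
        }
    if isinstance(value, list):
        return [redact(item) for item in value]
    return value  # scalars pass through unchanged
```

The same function can be applied to response bodies and to structured log records before they are written.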
11) Topologies and multi-regionality
Self-managed vs Managed (Envoy/Kong/NGINX vs Cloud API Gateway). For strict control/PCI, self-managed is more common.
Multi-AZ/Multi-Region Active-Active: global DNS/GSLB, health-based and geo-routing, per-region secret-stores.
DR plan: RPO/RTO, cold/warm standby gateway with a policy blue.
12) API versioning and evolution
Strategies: URI vN, header versioning, content negotiation. For public APIs, a clear deprecation policy (≥ 6-12 months).
Backward compatibility: extend schemas by adding optional fields; contracts in Git, OpenAPI linters.
Canary/Shadow: mirror traffic to the new version in "shadow" mode and compare responses.
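One way to split traffic for a canary is deterministic hashing of a routing key, so the same caller always lands on the same version while roughly the configured share goes to the canary. The function and the choice of key (tenant or user id) are assumptions for illustration.

```python
import hashlib


def pick_upstream(routing_key: str, canary_percent: int) -> str:
    """Return "canary" for about canary_percent of keys, else "stable"."""
    digest = hashlib.sha256(routing_key.encode()).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100  # stable bucket 0..99
    return "canary" if bucket < canary_percent else "stable"
```

Raising `canary_percent` only moves additional keys to the canary; keys already routed there stay there, which keeps sessions stable during a rollout.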
13) Performance and cache
Cache on edge for GET/idempotent requests; conditions: correct ETag/Cache-Control.
Connection pooling to upstreams; keep HTTP/2 enabled; gRPC benefits the most.
Payload budgets: limit body sizes; compress with gzip/br.
Pre-compute BFF responses for frequently requested dashboards/directories.
14) Configuration management
GitOps: declarative manifests for routes/policies; review/CI (lint, security scan); CD with canary batches.
Feature flags on the gateway: fast route/rule switching without a deploy.
Templates for repeating policies (OIDC, rate, CORS).
15) Mini snippets (pseudo)
Idempotency (Kong/Envoy-style):
```yaml
plugins:
  - name: idempotency
    config:
      header: Idempotency-Key
      ttl: 24h
      storage: redis
```
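What such an idempotency plugin does behind the configuration can be sketched as a store that replays the saved result while the key is inside its TTL window. This in-memory version is illustrative only: the class is hypothetical, the clock is injected for determinism, and it does not handle concurrent requests with the same key, which a shared store such as Redis would have to address.

```python
from typing import Any, Callable, Dict, Tuple


class IdempotencyStore:
    def __init__(self, ttl_seconds: float = 24 * 3600):
        self._ttl = ttl_seconds
        self._entries: Dict[str, Tuple[float, Any]] = {}  # key -> (expires_at, result)

    def execute(self, key: str, handler: Callable[[], Any], now: float) -> Any:
        """Run handler once per key within the TTL; replay the result otherwise."""
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]              # repeated request: no second side effect
        result = handler()
        self._entries[key] = (now + self._ttl, result)
        return result
```

A client retry that reuses the same `Idempotency-Key` therefore receives the original response instead of triggering a second operation.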
Rate/Quota:
```yaml
- name: rate-limiting
  config: {policy: local, minute: 600, key: consumer_id}
- name: response-ratelimiting
  config: {limits: {"heavy": {minute: 60}}, key: route_id}
```
JWT/OIDC:
```yaml
- name: oauth2-introspection
  config:
    jwks_uri: https://idp/.well-known/jwks.json
    required_scopes: ["payments:write", "payments:read"]
```
WAF (profile):
```yaml
- name: waf
  config:
    mode: block
    ruleset: owasp_crs
    exclusions: ["/health", "/metrics"]
```
Webhook signature:
```text
sig = HMAC_SHA256(secret, timestamp + "\n" + method + "\n" + path + "\n" + sha256(body))
assert now - timestamp < 300s
```
16) NFR and SLO for gateway
Uptime (monthly): ≥ 99.95% (edge), ≥ 99.9% (internal).
Latency p95: the gateway adds ≤ 50-100 ms on top of upstream latency.
Error budget: ≤ 0.05% 5xx originating from the gateway (excluding upstream).
Security policies: 100% of requests over TLS; 0 secret-leak incidents; MTTR for WAF rules that address vulnerabilities ≤ 24h.
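The targets above translate into concrete budgets with simple arithmetic. Assuming a 30-day month, 99.95% uptime leaves about 21.6 minutes of downtime and 99.9% leaves about 43.2 minutes; the helper names below are hypothetical.

```python
def allowed_downtime_minutes(availability_percent: float, days: int = 30) -> float:
    """Downtime budget for the period implied by an availability target."""
    total_minutes = days * 24 * 60
    return total_minutes * (1 - availability_percent / 100)


def allowed_gateway_5xx(total_requests: int, budget_per_10k: int = 5) -> int:
    """Gateway-originated 5xx budget; 5 per 10,000 requests equals 0.05%."""
    return total_requests * budget_per_10k // 10_000
```

For example, at 10 million requests per month a 0.05% budget allows 5,000 gateway-originated 5xx responses before the SLO is breached.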
17) Implementation checklist
- Architecture map: edge → BFF → mesh, list of domains/routes.
- TLS/mTLS, JWKS rotation, secrets in KMS/Vault.
- OAuth2/OIDC, scopes/claims, ABAC/OPA.
- Rate/quotas, circuit-breaker, retry/backoff, idempotency.
- OpenAPI/JSON Schema validators, gRPC/REST/GraphQL transformations.
- WAF/DDoS/bot profile, CORS/CSP.
- Webhook security: HMAC, anti-replay, allow-list.
- Logs/metrics/traces; access/change audit.
- GitOps/policy-as-code; canary rollouts; DR plan.
- PCI/GDPR controls: masking, retention, DSAR procedures.
18) Frequent errors
Storing secrets in the gateway configuration/logs.
Global '*' in CORS / trusting any 'Origin'.
Lack of idempotency and sane timeouts → duplicates and cascading failures.
Mixing authN and business logic in gateway plugins.
No JWKS rotation or kid → "stuck" keys.
Observability without trace correlation → blind RCA.
Summary
The API gateway is not just a reverse proxy but a policy and security platform that supports performance, compliance, and monetization. Build Zero-Trust, pin down contracts with schemas, manage traffic by SLO, and automate configuration through GitOps and policy-as-code. Then the gateway becomes a stable "edge" of your architecture rather than a bottleneck.