API Gateway and Routing
1) API Gateway Role in Architecture
An API gateway is an L7 component at the edge that:
- accepts inbound traffic (HTTP/1.1, HTTP/2, HTTP/3, WebSocket, gRPC);
- routes according to rules (host/path/headers/method/query/geo/weight/health);
- applies cross-cutting policies: authentication/authorization, rate limiting, WAF, CORS, caching;
- performs transformations (header/body normalization, gRPC↔JSON, GraphQL stitching);
- provides resilience (timeouts, retries, circuit breaker, outlier detection);
- provides observability and billing (logs, metrics, traces, quotas);
- isolates the internal topology (service mesh, private services).
It is often deployed as a pair: Edge/API gateway + Ingress/Mesh (Envoy/Istio/Linkerd). The first handles north-south (external) policy, the second east-west traffic.
2) Typical topologies
Single global gateway (CDN/edge POP → L7 gateway → services) - simple, centralized policies.
Regional gateways (per-region) + smart geo/latency routing.
Multi-tenant: dedicated routes/subdomains/keys, quotas and limits per tenant.
Hybrid: on-prem + cloud, private link/peering, private backends behind the API gateway.
3) L7 Routing Rules
Criteria:
- Host/Path: `api.example.com` → `/v1/orders/`.
- Headers: `X-Client`, `X-Region`, `User-Agent`, `Accept`.
- Method/Content-Type: distinguishing JSON/Proto/GraphQL.
- Query parameters: use with care, since they affect cache keys/variants (URL fragments are never sent to the server, so they cannot drive routing).
- Geo/Latency: nearest POP/region, failover under degradation.
- Weighted/Canary: traffic distribution 90/10, 50/50, sticky by cookie.
- Session affinity: hash-based on key/token (careful when scaling).
Example (NGINX Ingress, weighted canary via annotations):
```yaml
nginx.ingress.kubernetes.io/canary: "true"
nginx.ingress.kubernetes.io/canary-weight: "10"  # 10% of traffic to the new backend
```
Example (Envoy, header-based routing):
```yaml
match:
  prefix: "/orders"
  headers:
    - name: "X-Experiment"
      string_match: { exact: "new" }  # replaces the deprecated exact_match field
route: { cluster: orders-v2 }
```
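The weighted/canary and sticky criteria above can be combined by hashing a stable key (a cookie value or user ID) into a weight range, so the same client keeps landing on the same backend. The following is a minimal sketch under that assumption; `pick_backend` and the backend names are hypothetical and do not reflect any particular gateway's implementation.

```python
import hashlib
from typing import Dict


def pick_backend(sticky_key: str, weights: Dict[str, int]) -> str:
    """Deterministically map a sticky key to a backend according to weights."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    # Hash the key into a stable bucket in [0, total).
    digest = hashlib.sha256(sticky_key.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % total
    # Walk the cumulative weight ranges until the bucket falls inside one.
    upper = 0
    for backend, weight in weights.items():
        upper += weight
        if bucket < upper:
            return backend
    raise AssertionError("unreachable: bucket is always below total")
```

For example, `pick_backend(cookie_value, {"orders-v1": 90, "orders-v2": 10})` sends roughly 10% of distinct keys to `orders-v2`. Changing the weights reshuffles some keys, which is the "careful when scaling" caveat noted for session affinity.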
4) Protocols and compatibility
REST/JSON - the default; describe it with OpenAPI for validation and client generation.
gRPC - binary Protobuf over HTTP/2; for external clients, use gRPC-JSON transcoding.
GraphQL - aggregates services; at the perimeter, limit query complexity and depth.
WebSocket/SSE - bidirectional/server push; account for sticky sessions and idle timeouts.
HTTP/2 and HTTP/3 (QUIC) - multiplexing and faster connection setup; verify WAF/proxy compatibility.
5) Security: authentication and authorization
5.1 Transport
TLS 1.2+ on the perimeter, HSTS, OCSP stapling, PFS.
mTLS for B2B/internal API and machine-to-machine.
IP allowlist/denylist, geo-constraints.
5.2 Application layer
OAuth2/OIDC: JWT bearer tokens; verify signature, expiration, and audience.
HMAC/signatures: date + canonicalized string + signature (AWS-like) - protection against tampering and replay (nonce/time window).
API keys: as identifier only; rights - through RBAC/ABAC/scopes.
CORS: explicit allow-origin, pre-flight cache.
WAF: signatures (OWASP API Top 10), anomaly detection, bot protection, limits on deeply nested JSON.
DDoS/Abuse: connection limiting, token bucket/leaky bucket, burst + average rate, dynamic bans.
Example (plugin configuration):
```yaml
plugins:
  - name: oidc
    config: { issuer: "...", client_id: "...", client_secret: "...", scopes: ["orders.read"] }
  - name: rate-limiting
    config: { minute: 600, policy: local }
```
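The request-signature scheme listed in 5.2 (date + canonical string + signature, with a nonce/time window) can be sketched as follows. The canonical string layout, the 300-second window, and all function names are assumptions made for illustration; real schemes such as AWS SigV4 define their own canonical form.

```python
import hashlib
import hmac
import time
from typing import Optional, Set

MAX_SKEW_SECONDS = 300  # assumed replay window, not a standard value


def canonical_string(method: str, path: str, timestamp: int, nonce: str, body: bytes) -> str:
    """Build the string that both client and gateway sign (hypothetical layout)."""
    body_hash = hashlib.sha256(body).hexdigest()
    return "\n".join([method.upper(), path, str(timestamp), nonce, body_hash])


def sign(secret: bytes, method: str, path: str, timestamp: int, nonce: str, body: bytes) -> str:
    message = canonical_string(method, path, timestamp, nonce, body).encode("utf-8")
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def verify(secret: bytes, method: str, path: str, timestamp: int, nonce: str,
           body: bytes, signature: str, seen_nonces: Set[str],
           now: Optional[float] = None) -> bool:
    """Reject stale, replayed, or tampered requests."""
    now = time.time() if now is None else now
    if abs(now - timestamp) > MAX_SKEW_SECONDS:
        return False  # outside the accepted time window
    if nonce in seen_nonces:
        return False  # replay of an already accepted request
    expected = sign(secret, method, path, timestamp, nonce, body)
    if not hmac.compare_digest(expected, signature):
        return False  # body, path, or method was altered, or the key is wrong
    seen_nonces.add(nonce)  # a real store would expire nonces after the window
    return True
```

The comparison uses `hmac.compare_digest` to avoid leaking information through timing. In a multi-node gateway the nonce set would need to live in shared storage with a TTL at least as long as the time window.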
6) Validation, transformation and compatibility
Schemas: validate body/headers/parameters against OpenAPI/JSON Schema/Protobuf.
Transformations: field normalization, PII masking, adding correlation headers (`traceparent`, `x-request-id`).
Versioning: an `X-API-Version` header, `/v1` path prefixes, or per-resource versioning; a deprecation policy and the `Sunset` header.
Backward compatibility: additive changes only (new fields); no breaking changes without a new version.
Idempotency: `Idempotency-Key` for POST; the gateway stores keys in Redis with a TTL.
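The idempotency behavior described above can be sketched as a key → response store with expiry. This sketch uses an in-memory dictionary as a stand-in for Redis; `IdempotencyStore` and `handle_post` are hypothetical names, and the injectable clock exists only to make the expiry testable.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class IdempotencyStore:
    """In-memory stand-in for a Redis keyspace with per-key TTL."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, object]] = {}

    def get(self, key: str) -> Optional[object]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, response = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # expired: treat as never seen
            return None
        return response

    def put(self, key: str, response: object) -> None:
        self._entries[key] = (self._clock() + self._ttl, response)


def handle_post(store: IdempotencyStore, idempotency_key: str,
                call_backend: Callable[[], object]) -> object:
    """Replay the stored response for a repeated key; otherwise call the backend once."""
    cached = store.get(idempotency_key)
    if cached is not None:
        return cached
    response = call_backend()
    store.put(idempotency_key, response)
    return response
```

This sketch does not guard against two identical requests arriving concurrently; with Redis that is usually handled by reserving the key atomically (for example `SET ... NX` with an expiry) before calling the backend.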
7) Resilience: Connection policies
Timeouts: connect/read/write; reasonable defaults (e.g. 1s/5s/5s).
Retries: only for safe and idempotent requests; with jitter, exponential backoff, and a cap on attempts.
Circuit breaker: opens on errors/latency; goes half-open to probe with trial requests.
Outlier detection: remove bad instances from the pool.
Bulkhead/concurrency: limits on concurrent requests per route.
Failover: active/passive, zonal degradation.
Shadow traffic: mirror requests to v2 in parallel with v1 (no effect on the response) for comparison.
```yaml
circuit_breakers:
  thresholds:
    - priority: DEFAULT
      max_connections: 1024
      max_pending_requests: 2048
      max_retries: 3
```
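The closed → open → half-open cycle of a circuit breaker, as listed in this section, can be illustrated with a small state machine. This is a simplified sketch, not how Envoy or any specific proxy implements it: it counts consecutive failures only, and in the half-open state it admits all requests rather than a single probe. The class and parameter names are hypothetical.

```python
import time
from typing import Callable, Optional


class CircuitBreaker:
    def __init__(self, failure_threshold: int, reset_timeout: float,
                 clock: Callable[[], float] = time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None  # None means the circuit is closed

    @property
    def state(self) -> str:
        if self._opened_at is None:
            return "closed"
        if self._clock() - self._opened_at >= self.reset_timeout:
            return "half-open"  # timeout elapsed: allow trial requests
        return "open"

    def allow_request(self) -> bool:
        return self.state != "open"

    def record_success(self) -> None:
        # Any success closes the circuit and clears the failure count.
        self._failures = 0
        self._opened_at = None

    def record_failure(self) -> None:
        if self.state == "half-open":
            self._opened_at = self._clock()  # trial failed: re-open and restart the timer
            return
        self._failures += 1
        if self._failures >= self.failure_threshold:
            self._opened_at = self._clock()
```

A caller checks `allow_request()` before proxying and reports the outcome with `record_success()` or `record_failure()`; when the breaker is open the gateway can fail fast (for example with a 503) instead of queueing on a sick upstream.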
8) Caching and performance
HTTP cache: `Cache-Control`, `ETag`/`If-None-Match`, `Vary`, `stale-while-revalidate`.
Edge caches/POP: CDNs for static content and cacheable APIs (idempotent GETs).
Compression: `gzip`/`br` (do not recompress already compressed content).
Request collapsing ("coalescing"): merging identical parallel requests into one upstream call.
Response shaping: field selection/filters, cursor-based pagination, size limits.
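To make the `ETag`/`If-None-Match` exchange concrete, here is a minimal sketch of conditional GET handling. The ETag derivation (a truncated SHA-256 of the body) and the `Cache-Control` values are assumptions chosen for the example; `make_etag` and `conditional_get` are hypothetical helpers.

```python
import hashlib
from typing import Dict, Optional, Tuple


def make_etag(body: bytes) -> str:
    """Derive a strong ETag from the response body (quoted, as HTTP requires)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def conditional_get(body: bytes, if_none_match: Optional[str]) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, payload); 304 with an empty payload when the client copy is current."""
    etag = make_etag(body)
    headers = {
        "ETag": etag,
        "Cache-Control": "max-age=10, stale-while-revalidate=30",  # example values
    }
    if if_none_match is not None:
        candidates = [tag.strip() for tag in if_none_match.split(",")]
        # If-None-Match uses weak comparison, so a W/ prefix is ignored.
        candidates = [tag[2:] if tag.startswith("W/") else tag for tag in candidates]
        if "*" in candidates or etag in candidates:
            return 304, headers, b""
    return 200, headers, body
```

A client that repeats the request with the previously returned `ETag` in `If-None-Match` receives a 304 and reuses its cached body, which saves bandwidth even when the gateway still has to consult the upstream.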
9) Observability and operation
Metrics: `l7_req_total{route,method,code}`, `latency_ms{p50,p95,p99}`, `upstream_errors`, `retry_count`, `cb_state`, `429_rate`, `quota_usage{tenant}`.
Logs: structured, with `trace_id`/`span_id`, `user_id`/`tenant_id`, `client_ip`.
Traces: W3C Trace Context (`traceparent`, `tracestate`), propagated to upstreams.
Audit: who called what, with which permissions; immutable storage for sensitive APIs.
SLO/SLA: target p99, error budget; per-route targets are more useful than a single global one.
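Propagating W3C Trace Context means keeping the incoming trace ID while issuing a new span ID for the hop through the gateway. The sketch below assumes version `00` of the `traceparent` format only and, as a further assumption, marks new traces as sampled; `outgoing_traceparent` is a hypothetical helper.

```python
import re
import secrets
from typing import Optional

# version-traceid-parentid-flags, lowercase hex, as defined for version 00
_TRACEPARENT_RE = re.compile(r"^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


def outgoing_traceparent(incoming: Optional[str]) -> str:
    """Continue a valid incoming trace, or start a new one if it is missing or malformed."""
    trace_id = None
    flags = "01"  # assumption: the gateway samples traces it starts itself
    if incoming:
        match = _TRACEPARENT_RE.match(incoming.strip())
        # All-zero trace or parent IDs are invalid and must not be continued.
        if match and match.group(1) != "0" * 32 and match.group(2) != "0" * 16:
            trace_id, flags = match.group(1), match.group(3)
    if trace_id is None:
        trace_id = secrets.token_hex(16)
    span_id = secrets.token_hex(8)  # new span for the gateway hop
    return f"00-{trace_id}-{span_id}-{flags}"
```

The returned value is what the gateway would set on the upstream request; the same `trace_id` and `span_id` can be written into the structured log line so logs and traces join on one key.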
10) Capacity and quota management
Quotas per tenant/key/customer pool, per minute/hour/day.
Burst + sustained limits; leaky bucket for smoothing.
Fairness: under overload, use fair queuing instead of first-come-first-served.
Priorities: system/critical routes with priority and dedicated pools.
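Burst plus sustained limits per tenant are commonly implemented with a token bucket: the bucket capacity is the burst, the refill rate is the sustained limit. The sketch below is one such illustration, with hypothetical class names and an injectable clock for testing; it is not the limiter of any specific gateway.

```python
import time
from typing import Callable, Dict


class TokenBucket:
    def __init__(self, rate_per_sec: float, burst: float,
                 clock: Callable[[], float] = time.monotonic):
        self.rate = rate_per_sec   # sustained limit
        self.capacity = burst      # maximum burst size
        self._clock = clock
        self._tokens = burst
        self._updated = clock()

    def allow(self, cost: float = 1.0) -> bool:
        now = self._clock()
        # Refill for the elapsed time, never above capacity.
        self._tokens = min(self.capacity, self._tokens + (now - self._updated) * self.rate)
        self._updated = now
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False  # caller would answer 429


class TenantLimiter:
    """One independent bucket per tenant, so a noisy tenant cannot drain others."""

    def __init__(self, rate_per_sec: float, burst: float,
                 clock: Callable[[], float] = time.monotonic):
        self._rate = rate_per_sec
        self._burst = burst
        self._clock = clock
        self._buckets: Dict[str, TokenBucket] = {}

    def allow(self, tenant: str) -> bool:
        bucket = self._buckets.get(tenant)
        if bucket is None:
            bucket = self._buckets[tenant] = TokenBucket(self._rate, self._burst, self._clock)
        return bucket.allow()
```

With `TenantLimiter(rate_per_sec=10, burst=30)`, each tenant can send 30 requests at once and then 10 per second. Longer quota windows (hour/day) would need separate counters in shared storage.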
11) Change Management and Releases
Canary/Blue-Green: weighted routing; automatic promotion gated on SLOs (errors/latency).
Feature gates/backend flags: enable by header/token.
Shadowing/diff validators: compare bodies/statuses, with tolerances for expected deltas.
Staging: dedicated domains/paths (`staging.api...`), separate keys and quotas.
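Automatic canary promotion on SLOs amounts to a small decision function: hold while there is too little data, roll back on a breach, otherwise raise the weight. The thresholds, step size, and names below are all hypothetical defaults for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CanaryStats:
    requests: int
    errors: int
    p99_latency_ms: float


def next_canary_weight(current_weight: int, stats: CanaryStats, *,
                       max_error_rate: float = 0.01,
                       max_p99_ms: float = 500.0,
                       min_requests: int = 100,
                       step: int = 10) -> int:
    """Return the canary traffic weight (0-100) for the next evaluation window."""
    if stats.requests < min_requests:
        return current_weight  # not enough traffic to judge: hold
    error_rate = stats.errors / stats.requests
    if error_rate > max_error_rate or stats.p99_latency_ms > max_p99_ms:
        return 0  # SLO breached: roll back
    return min(100, current_weight + step)  # healthy: promote one step
```

The result would be written back to the weighted route (for example the canary weight annotation or the weighted cluster entry) by whatever automation drives the rollout.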
12) Configuration examples
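12.1 NGINX - basic gateway with rate limiting and caching
```nginx
# Reuse the client's X-Request-ID when present; otherwise generate one.
map $http_x_request_id $reqid {
    default $http_x_request_id;
    ""      $request_id;
}

limit_req_zone $binary_remote_addr zone=perip:10m rate=10r/s;
# Cache zone referenced by proxy_cache below (path and size are placeholders).
proxy_cache_path /var/cache/nginx/api keys_zone=api_cache:10m;

server {
    listen 443 ssl http2;
    server_name api.example.com;
    # ssl_certificate / ssl_certificate_key directives are required here.

    # security
    add_header Strict-Transport-Security "max-age=31536000" always;

    location /v1/ {
        limit_req zone=perip burst=30 nodelay;
        proxy_set_header X-Request-ID $reqid;
        proxy_set_header Authorization $http_authorization;
        proxy_connect_timeout 1s;
        proxy_read_timeout 5s;
        proxy_cache api_cache;
        proxy_cache_valid 200 10s;
        proxy_cache_use_stale error timeout updating;
        proxy_pass http://orders-v1;  # upstream orders-v1 must be defined elsewhere
    }
}
```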
12.2 Envoy - weighted routing and retries
```yaml
routes:
  - match: { prefix: "/orders" }
    route:
      weighted_clusters:
        clusters:
          - name: orders-v1
            weight: 90
          - name: orders-v2
            weight: 10
      retry_policy:
        retry_on: "5xx,reset,connect-failure"
        num_retries: 2
        per_try_timeout: 2s
```
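The retry settings above cap the number of attempts; the spacing between attempts is what section 7 calls exponential backoff with jitter. A minimal sketch of the delay calculation, using the "full jitter" variant (a uniform random delay up to an exponentially growing ceiling), might look like this. The base and cap values and the function name are assumptions and do not describe Envoy's internal back-off.

```python
import random
from typing import Optional


def backoff_delay(attempt: int, base: float = 0.1, cap: float = 2.0,
                  rng: Optional[random.Random] = None) -> float:
    """Seconds to wait before retry number `attempt` (1-based), with full jitter."""
    if attempt < 1:
        raise ValueError("attempt must be >= 1")
    source = rng if rng is not None else random
    # Ceiling doubles with each attempt but never exceeds the cap.
    ceiling = min(cap, base * (2 ** (attempt - 1)))
    return source.uniform(0.0, ceiling)
```

Randomizing the delay spreads retries from many clients over time, which avoids synchronized retry waves against an upstream that is already struggling.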
12.3 Traefik - middleware and headers
```yaml
http:
  middlewares:
    secHeaders:
      headers:
        stsSeconds: 31536000
        contentTypeNosniff: true
  routers:
    api:
      rule: "Host(`api.example.com`) && PathPrefix(`/v1`)"
      service: svc-orders
      middlewares: ["secHeaders"]
```
13) Anti-patterns
One global limit for everyone - well-behaved tenants suffer because of noisy neighbors.
Retries without idempotency → duplicate side effects (payments, entity creation).
Ignoring timeouts and max body size → hung requests and exhausted workers.
Mixing edge policies and business logic in the gateway (bloats the perimeter).
No schema validation → fragile clients and breaking releases.
Bare WebSocket endpoints without auth, limits, or idle timeouts.
Secrets in headers without rotation; no mTLS for internal/B2B traffic.
14) Test playbooks (Game Days)
Request storm: check limiter/quota, 429 behavior, degradation.
Loss of one cluster: failover/weight redistribution; SLO canaries.
Heavy responses: max body size/timeouts; dropped connections.
Injections/anomalies: WAF rules, rejection of deeply nested JSON, excessive GraphQL depth.
Trace loss: check `traceparent` propagation and sampling.
Secrets: key rotation/JWKS, token expiration, clock-skew tolerance.
15) Implementation checklist
- Domains/paths/versions defined, OpenAPI/Proto published.
- TLS/mTLS, HSTS, secret management and rotation are configured.
- Authentication (OIDC/HMAC), RBAC/scopes, CORS enabled.
- Limits/quotas per-tenant, fair queues, 429-UX.
- Weight/header routing, canary plan and rollback.
- Timeout/retry/circuit-breaker/outlier policies.
- Schema validation, transformations, PII masking.
- Edge cache/ETag, coalescing, gzip/br.
- Observability: metrics, logs, traces, dashboards and alerts.
- Runbooks: incidents, key rotation, block lists, Black Friday.
16) FAQ
Q: How does the API gateway differ from the service mesh?
A: The gateway handles north-south traffic (outer perimeter, cross-cutting policies). The mesh handles east-west traffic (intra-cluster connectivity, mTLS, retries). They are often used together.
Q: Where to implement auth: in the gateway or services?
A: Both levels: gateway - coarse-grained (authentication, basic rights/quotas), service - fine-grained (domain roles/attributes).
Q: When is gRPC-JSON transcoding needed?
A: When services speak gRPC internally but external consumers need REST/JSON for simple clients and browsers.
Q: How to choose a versioning strategy?
A: For public APIs - a `/vN` path prefix plus deprecation headers and a long overlap period. For internal APIs - capability flags and schema compatibility.
17) Summary
The API gateway is not just a proxy, but a center of policies and resilience. Proper routing, security, limits, validation and observability give predictability and speed of releases. Combine edge gateway with service mesh, automate canaries and quotas, test failures - and the perimeter will become your accelerator, not a bottleneck.