
gRPC: binary protocols and performance

TL;DR

gRPC = HTTP/2 + Protobuf + strict contracts + streaming. It delivers low latency, efficient traffic, and stable contracts between services. It is ideal for internal north-south/east-west calls, realtime channels (server/client/bidi streaming), and mobile frontends via gRPC-Web. Success comes from: small proto contracts, deadlines and cancellation, exponential retries with idempotency, connection pooling, Envoy at the edge, mTLS, key encryption, and full observability.


1) When to choose gRPC and when not

Suitable for:
  • Internal APIs between microservices (balances, limits, settlement, anti-fraud).
  • High-frequency calls with strict p95/p99 SLOs.
  • Long-lived streams (tables/tournaments, live events, payout statuses).
  • Mobile clients (via gRPC-Web or BFF).
Leave REST/GraphQL for:
  • Public integrations, webhooks, and payment flows with strict idempotency and CDN caching.
  • Admin UIs with rich aggregated queries (GraphQL BFF over gRPC).

2) Contracts and Evolution (Protobuf)

Schema principles: only add fields, never reuse field numbers; enforce mandatory fields through validation, not `required`.
Versioning: packages/namespaces (`payments.v1`, `payments.v2`); deprecate via `deprecated = true` and migration windows.
Semantics: keep messages "thin", without arrays of hundreds of KB; serve large result sets via streaming or pagination.

Example (simplified):
syntax = "proto3";
package payments.v1;

service Payouts {
  rpc Create (CreatePayoutRequest) returns (CreatePayoutResponse) {}
  rpc GetStatus (GetStatusRequest) returns (GetStatusResponse) {}
  rpc StreamStatuses (StreamStatusesRequest) returns (stream StatusEvent) {}
}

message CreatePayoutRequest {
  string idempotency_key = 1;
  string user_id = 2;
  string currency = 3;
  int64 amount_minor = 4; // cents
}

message CreatePayoutResponse { string payout_id = 1; }
message GetStatusRequest { string payout_id = 1; }
message GetStatusResponse { string state = 1; string reason = 2; }
message StreamStatusesRequest { repeated string payout_ids = 1; }
message StatusEvent { string payout_id = 1; string state = 2; int64 ts_ms = 3; }

3) Transport and connections

HTTP/2 multiplexes many RPCs over one TCP connection: keep long-lived channels with connection pooling (2-4 channels per upstream target is usually enough on the client; see the sketch below).
Keepalive: send pings less often than balancer idle timeouts (for example, every 30 seconds) and limit `max_pings_without_data`.
Flow control/backpressure: tune HTTP/2 window sizes plus client/server queue bounds.
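
A minimal gRPC-Go dialing sketch along these lines (the target payouts.internal:8080 is a placeholder, and plaintext credentials are used only to keep the example short):

package main

import (
	"time"

	"google.golang.org/grpc"
	"google.golang.org/grpc/credentials/insecure"
	"google.golang.org/grpc/keepalive"
)

// dialPayouts opens one long-lived channel; create 2-4 of these per
// upstream target and spread RPCs across them instead of dialing per request.
func dialPayouts() (*grpc.ClientConn, error) {
	return grpc.Dial("payouts.internal:8080",
		grpc.WithTransportCredentials(insecure.NewCredentials()),
		grpc.WithKeepaliveParams(keepalive.ClientParameters{
			Time:    30 * time.Second, // ping cadence below balancer idle timeouts
			Timeout: 5 * time.Second,  // how long to wait for a ping ack
		}),
	)
}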


4) Performance: what really affects

Message sizes: target ≤ 64-128 KB; enable gzip/brotli for large responses; stream huge payloads.
Protobuf serialization is 5-10× more compact than JSON; avoid `string` for numbers and `map<string, string>` where possible.
CPU/allocations: profile the codec and resolvers; use zero-copy buffers and pre-allocation.
Threading: gRPC servers are sensitive to lock contention - move I/O to async paths and put deadlines on external database calls.
Nagle/Delayed ACK: usually leave the defaults; experiment carefully.


5) Deadlines, cancellations, retreats, idempotence

Always set a `deadline` on the client (about 2× the upstream p95) and propagate the context into services/databases (a server-side sketch follows below).
On client cancellation, the server must interrupt work and free resources.
Retries: only for idempotent operations (GET analogs, status reads, stream reads). For mutations, use an `idempotency_key` and store the result.
Backoff policy: exponential with jitter; cap the attempt count and the retry buffer on the client.
gRPC status codes: use `DEADLINE_EXCEEDED`, `UNAVAILABLE` (retryable), `FAILED_PRECONDITION`, `ALREADY_EXISTS`, `ABORTED`, etc. - precise semantics saves nerves.
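
A server-side sketch of honoring deadlines and cancellation, assuming the generated payments.v1 stubs from section 2 and the status/codes packages from google.golang.org/grpc (s.db.FetchStatus is a hypothetical data-access helper):

func (s *payoutServer) GetStatus(ctx context.Context, req *pb.GetStatusRequest) (*pb.GetStatusResponse, error) {
	// ctx carries the client's deadline and cancellation; pass it down so
	// the database query is aborted as soon as the caller gives up.
	row, err := s.db.FetchStatus(ctx, req.GetPayoutId())
	if err != nil {
		if ctx.Err() != nil {
			// Cancelled or deadline exceeded: free resources and report it as such.
			return nil, status.FromContextError(ctx.Err()).Err()
		}
		return nil, status.Error(codes.Internal, "status lookup failed")
	}
	return &pb.GetStatusResponse{State: row.State, Reason: row.Reason}, nil
}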


6) Streams: server, client, bidi

Server streaming for long responses and feeds (watch for memory leaks when the client is slow; see the sketch below).
Client streaming for uploads/batches.
Bidirectional for interactive flows (live tables, internal events).
Add a sequence/offset to messages for ordering and application-level resume (gRPC itself does not replay after a reconnect).
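
A server-streaming sketch under these assumptions: s.events is a bounded internal feed of *pb.StatusEvent, and application-level resume would add a sequence field to StatusEvent (not part of the proto in section 2):

func (s *payoutServer) StreamStatuses(req *pb.StreamStatusesRequest, stream pb.Payouts_StreamStatusesServer) error {
	for ev := range s.events {
		// Send blocks when the client is slow (HTTP/2 flow control), so a
		// bounded events channel gives natural backpressure instead of a leak.
		if err := stream.Send(ev); err != nil {
			return err // client disconnected: stop and free the subscription
		}
	}
	return nil
}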


7) Balancing and topology

xDS/Envoy as the data plane: L7 balancing, circuit breaking, outlier ejection.
Consistent hashing (by `user_id`/`table_id`) keeps hot keys on one upstream and reduces cross-node locking.
Hedging/mirroring: use carefully; it helps p99 tails but increases load.
Multi-region: local endpoints with geo-routing; pin a session's "home region".

Example Envoy (fragment):
load_assignment:
  endpoints:
    - lb_endpoints:
        - endpoint: { address: { socket_address: { address: svc-a-1, port_value: 8080 } } }
        - endpoint: { address: { socket_address: { address: svc-a-2, port_value: 8080 } } }
outlier_detection:
  consecutive_5xx: 5
  interval: 5s
  base_ejection_time: 30s
circuit_breakers:
  thresholds:
    max_connections: 1024
    max_requests: 10000

8) Safety

mTLS between all hops (gateway ↔ services); short-TTL certificates with automatic rotation (ACME/mesh); a sketch follows below.
AuthZ: JWT/OIDC at the edge, propagate claims to services; ABAC/RBAC at the gateway/mesh level.
PII/PCI: filter fields, disable logging of sensitive data; encrypt tokens in transit and at rest.
gRPC-Web: the same auth principles, but tunneled over HTTP/1.1 (Envoy proxy).
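
A sketch of terminating mTLS in a Go service; the certificate paths are placeholders for whatever your mesh/ACME rotation writes to disk:

import (
	"crypto/tls"
	"crypto/x509"
	"log"
	"os"

	"google.golang.org/grpc"
	"google.golang.org/grpc/credentials"
)

func newMTLSServer() *grpc.Server {
	cert, err := tls.LoadX509KeyPair("/etc/tls/server.crt", "/etc/tls/server.key")
	if err != nil {
		log.Fatal(err)
	}
	caPEM, err := os.ReadFile("/etc/tls/ca.crt")
	if err != nil {
		log.Fatal(err)
	}
	pool := x509.NewCertPool()
	pool.AppendCertsFromPEM(caPEM)
	return grpc.NewServer(grpc.Creds(credentials.NewTLS(&tls.Config{
		Certificates: []tls.Certificate{cert},
		ClientAuth:   tls.RequireAndVerifyClientCert, // require and verify client certs
		ClientCAs:    pool,
		MinVersion:   tls.VersionTLS12,
	})))
}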


9) Observability

Metrics: RPS, p50/p95/p99 latency per method, error rate by code, active streams, message sizes, thread/pool saturation (an interceptor sketch follows below).
Tracing: W3C `traceparent` in metadata; client and server spans propagate context to database/cache.
Logs: correlate by `trace_id`, sample, and strictly mask sensitive fields.
Health checks: the standard `Health` service (`grpc.health.v1.Health/Check`) plus `Watch` for stream health.
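
A minimal unary interceptor sketch for the per-method metrics above (logged here for brevity; in practice the values would feed histogram metrics and spans):

func observe(ctx context.Context, req any,
	info *grpc.UnaryServerInfo, handler grpc.UnaryHandler) (any, error) {
	start := time.Now()
	resp, err := handler(ctx, req)
	// Per-method latency and gRPC status code - the raw material
	// for p50/p95/p99 and error-rate dashboards.
	log.Printf("method=%s code=%s latency=%s",
		info.FullMethod, status.Code(err), time.Since(start))
	return resp, err
}

srv := grpc.NewServer(grpc.ChainUnaryInterceptor(observe))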


10) Compression, limits and protection

Enable message compression (per call); limit `max_receive_message_length`/`max_send_message_length` (see the sketch after this list).
Rate/quota limits at the gateway; circuit breakers on error rate/latency.
Deadline budget: do not pass infinitely long deadlines between hops - each link trims its share of the budget.
Protect against "expensive" requests: limit message size and element counts, cut off overlong streams.
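
In gRPC-Go these limits and per-call compression could look like this (the 4 MB figures are illustrative, not a recommendation):

import _ "google.golang.org/grpc/encoding/gzip" // registers the "gzip" compressor

srv := grpc.NewServer(
	grpc.MaxRecvMsgSize(4<<20), // reject inbound messages over 4 MB
	grpc.MaxSendMsgSize(4<<20), // refuse to send oversized responses
)

// Per-call compression from the client:
resp, err := cli.GetStatus(ctx, req, grpc.UseCompressor("gzip"))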


11) Gateways and interoperability

gRPC-Gateway/transcoding: expose some methods as REST (for partners/admins); see the fragment below.
gRPC-Web: the frontend talks directly to Envoy, which transcodes.
GraphQL BFF: resolvers can call gRPC; for payment-domain mutations, REST with idempotency is preferable.
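
If transcoding is driven from the proto itself (the grpc-gateway / google.api.http annotation style), the GetStatus mapping could look like this; it assumes the googleapis annotation protos are available on the import path:

import "google/api/annotations.proto";

service Payouts {
  rpc GetStatus (GetStatusRequest) returns (GetStatusResponse) {
    option (google.api.http) = {
      get: "/v1/payouts/{payout_id}" // binds the URL segment to GetStatusRequest.payout_id
    };
  }
}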


12) Idempotency in modifying operations

Template:
  • The client generates an `idempotency_key`.
  • The server stores the result by key with a TTL (for example, 24 hours).
  • A repeated `Create` with the same key returns the same `payout_id`/status.
Pseudo (Go):
if res, ok := store.Get(key); ok {
	return res, nil // repeated key: return the stored result
}
res, err := doBusiness()
if err != nil {
	return nil, err
}
store.Put(key, res, 24*time.Hour) // TTL per the retention window above
return res, nil

13) Errors and status mapping

Map domain errors → `status.WithDetails` (`google.rpc.ErrorInfo`) with codes:
  • `INVALID_ARGUMENT` (validation), `NOT_FOUND`, `ALREADY_EXISTS`,
  • `FAILED_PRECONDITION`, `ABORTED`,
  • `UNAUTHENTICATED`/`PERMISSION_DENIED`,
  • `RESOURCE_EXHAUSTED` (quotas/limits),
  • `UNAVAILABLE` (network/upstream), `DEADLINE_EXCEEDED`.
  • For the client: retry only `UNAVAILABLE`, `DEADLINE_EXCEEDED`, and calls marked idempotent (a sketch follows this list).
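
A Go sketch of attaching machine-readable details (the reason/domain strings are illustrative):

import (
	"google.golang.org/genproto/googleapis/rpc/errdetails"
	"google.golang.org/grpc/codes"
	"google.golang.org/grpc/status"
)

st := status.New(codes.FailedPrecondition, "payout limit exceeded")
detailed, err := st.WithDetails(&errdetails.ErrorInfo{
	Reason: "PAYOUT_LIMIT_EXCEEDED", // stable, machine-readable reason
	Domain: "payments.example.com",
})
if err != nil {
	return nil, st.Err() // fall back to the bare status
}
return nil, detailed.Err()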

14) Testing and UAT

Contract tests against `.proto` (golden files).
Load: p50/p95/p99 latency, throughput, CPU, memory, GC.
Streams: backpressure tests, interruptions, resume.
Network: loss/jitter emulation; timeout/hedging tests.
Security: token/certificate tampering tests, key rotation at runtime.

Checklist:
  • Deadline on each client call.
  • Retries only where idempotent.
  • Message size limits.
  • Health/Watch and alerts on p95/p99.
  • mTLS and rotation.
  • End-to-end tracing.
  • Envoy circuit breaking and outlier ejection.
  • gRPC-Web e2e for browser (if needed).

15) Anti-patterns

Giant messages instead of streams.
Infinite deadlines and no cancellation.
Retrying non-idempotent mutations, producing duplicates.
No connection pooling - a storm of new connections.
No health/watch - "blind" failures.
PII leaking into traces/logs.
One monolithic endpoint pool for the whole world, with no regional proximity.


16) NFR/SLO (targets)

Edge→service added latency: ≤ 10-30 ms p95 within a region.
Method latency: p95 ≤ 150-250 ms (business operations), p99 ≤ 500 ms.
Error rate (5xx/`UNAVAILABLE`): ≤ 0.1% of RPS.
Uptime: ≥ 99.95% for critical services.
Streams: connection retention ≥ 24 hours, drop rate < 0.01%/hour.


17) Mini-Specs and Sample Configurations

Client deadline/retry (Go):
ctx, cancel := context.WithTimeout(ctx, 300*time.Millisecond)
defer cancel()
resp, err := cli.GetStatus(ctx, req, grpc.WaitForReady(true))
Retry policy (Java client, YAML service config):
methodConfig:
  - name: [{service: payments.v1.Payouts, method: GetStatus}]
    retryPolicy:
      maxAttempts: 4
      initialBackoff: 100ms
      maxBackoff: 1s
      backoffMultiplier: 2.0
      retryableStatusCodes: [UNAVAILABLE, DEADLINE_EXCEEDED]
gRPC-Gateway (OpenAPI fragment for transcoding):
paths:
  /v1/payouts/{id}:
    get:
      x-grpc-service: payments.v1.Payouts
      x-grpc-method: GetStatus

Summary

gRPC is a workhorse bus for iGaming microservices: compact binary protocols, strict contracts, and powerful streaming. To get real benefit from it, keep contracts small and stable, implement deadlines/cancellation/retries with idempotency, use Envoy/xDS and mTLS, measure p95/p99, and teach the system to live under backpressure. Combined with REST webhooks and a GraphQL BFF, you get a fast, economical, and secure API layer that scales with the product.
