Kubernetes clusters and Helm charts
1) Role of Kubernetes and Helm
Kubernetes is the foundation of the application platform: it standardizes rollouts, networking, configuration, secrets, and self-healing. Helm is a package/templating manager that turns declarative manifests into repeatable releases with versioning and dependency management. Together they provide predictable deployments, fast rollbacks, and a single infrastructure language.
2) Cluster design
2.1 Topology and fault tolerance
Multi-AZ: control-plane and worker-pool nodes are spread across zones; PDB/TopologySpreadConstraints for even distribution.
Multi-region/DR: independent per-region clusters; cross-region calls only on "cold" paths (catalogs/telemetry); "hot" paths (wallet) stay local.
Worker pools by profile: `general`, `compute`, `io`, `spot` (for background tasks). Assignment via nodeSelector/affinity/taints.
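A sketch of the pool-assignment pattern above; the `node-pool` label/taint key and the `batch-worker` app name are illustrative, not fixed conventions:

```yaml
# Hypothetical Pod spec fragment: pin a background worker to the "spot" pool.
spec:
  nodeSelector:
    node-pool: spot              # illustrative label applied to the node group
  tolerations:
    - key: "node-pool"
      operator: "Equal"
      value: "spot"
      effect: "NoSchedule"       # spot nodes are tainted so only tolerant Pods land there
  affinity:
    podAntiAffinity:             # prefer spreading replicas across nodes
      preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 100
          podAffinityTerm:
            labelSelector:
              matchLabels:
                app: batch-worker
            topologyKey: kubernetes.io/hostname
```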
2.2 Namespaces and multi-tenancy
Namespace isolation by domain/team: `payments`, `wallet`, `games`, `reporting`.
ResourceQuota + LimitRange: baseline CPU/RAM limits and maximum replica counts; protects the cluster from resource hogs.
RBAC: read-only roles by default; write access only for CI/CD and on-call engineers.
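A minimal quota/limit pair for one namespace in this spirit (all numbers are placeholders to be tuned per team):

```yaml
# Illustrative quota for the "games" namespace.
apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-quota
  namespace: games
spec:
  hard:
    requests.cpu: "40"
    requests.memory: 80Gi
    limits.cpu: "60"
    limits.memory: 120Gi
    pods: "200"
---
apiVersion: v1
kind: LimitRange
metadata:
  name: container-defaults
  namespace: games
spec:
  limits:
    - type: Container
      default:            # applied as limits when a container omits them
        cpu: 500m
        memory: 512Mi
      defaultRequest:     # applied as requests when omitted
        cpu: 100m
        memory: 128Mi
```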
2.3 Networking
CNI with NetworkPolicy support (Calico/Cilium): L3/L4 policies by namespace/label.
Ingress → Gateway API: migrate to the `GatewayClass`/`Gateway`/`HTTPRoute` model for canaries and multi-tenancy.
Service mesh (optional): mTLS, retries/circuit breakers, locality-aware routing; an opt-in point for inter-service reliability.
3) Reliability and scalability
3.1 Scaling
HPA on custom metrics (RPS/latency/queue depth), not just CPU.
VPA for background workload classes; in production, run it in recommendation-only mode or combine it with HPA on different metrics.
Cluster Autoscaler: separate node groups for latency-sensitive services; warm pools for peaks (tournaments/matches).
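An HPA on a custom metric might look like the sketch below; the `http_requests_per_second` metric name is illustrative and assumes a metrics adapter (e.g. prometheus-adapter) exposes it:

```yaml
# Sketch: autoscaling/v2 HPA keyed on per-Pod RPS rather than CPU.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: api-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api
  minReplicas: 3
  maxReplicas: 30
  metrics:
    - type: Pods
      pods:
        metric:
          name: http_requests_per_second
        target:
          type: AverageValue
          averageValue: "100"    # scale so each Pod handles ~100 RPS
```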
3.2 Resources and QoS
Every Pod gets requests/limits; avoid `:latest` tags and unbounded containers.
PriorityClass: critical services (`wallet`, `payments`) preempt non-critical ones.
PDB: keeps node upgrades from letting the cluster "shoot itself in the foot."
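The PriorityClass/PDB pairing for a critical service could be sketched as follows (names and thresholds are illustrative):

```yaml
# A high PriorityClass for wallet/payments workloads...
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: critical-services
value: 1000000
globalDefault: false
description: "Critical money-path services; preempt lower-priority workloads under pressure."
---
# ...plus a PDB that keeps at least two replicas through voluntary disruptions
# such as node drains during upgrades.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: wallet-pdb
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: wallet
```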
3.3 Zero-downtime upgrades
RollingUpdate with maxUnavailable=0 on critical paths.
PodDisruptionBudget + readiness probes (a `startupProbe` is not a substitute for readiness).
Surge capacity for fast releases during peaks, applied with caution.
4) Platform security
Pod Security (Baseline/Restricted) at the namespace level; disallow `privileged`, hostPath, and root.
NetworkPolicy: default-deny plus allowlisting by port/label.
Seccomp/AppArmor, non-root users, read-only root filesystem.
Secrets: KMS/Vault provider (CSI); never keep plaintext secrets in `values.yaml`.
Minimal RBAC: grant service accounts only the rights they need; short-lived tokens.
Admission control: OPA/Gatekeeper or Kyverno to enforce labels and limits and to block policy violations.
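The default-deny pattern, then an explicit allow, might look like this; namespace names, labels, and ports are placeholders:

```yaml
# Deny all ingress and egress for every Pod in the namespace by default.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny
  namespace: payments
spec:
  podSelector: {}               # empty selector = every Pod in the namespace
  policyTypes: ["Ingress", "Egress"]
---
# Then whitelist one path: only the gateway namespace may reach the API port.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-gateway-to-api
  namespace: payments
spec:
  podSelector:
    matchLabels:
      app: payments-api
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              role: gateway
      ports:
        - protocol: TCP
          port: 8080
```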
5) Observability
OpenTelemetry: tracing from the Ingress/Gateway → service → database/cache; mandatory labels `service`, `version`, `region`, `partner`, `api_version`.
Logs: structured, free of PII/PAN; routed to centralized storage.
Metrics: RED/USE, SLO dashboards, burn-rate alerts.
Synthetics: probes from the relevant countries/ASNs; perimeter and internal health checks.
6) GitOps and progressive delivery
Argo CD/Flux: the desired state lives in Git; each namespace has its own repository/folder.
Artifact promotion: `dev → stage → prod` via PR, not `kubectl apply`.
Canary/blue-green: Argo Rollouts/Gateway API; success metrics are P95/P99 latency, error rate, and business SLIs (deposit conversion rate).
Rollbacks: one click in Helm/Argo; chart versions are pinned.
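A canary strategy along these lines could be expressed with Argo Rollouts; this assumes the Rollouts controller is installed and an AnalysisTemplate named `slo-check` (hypothetical) encodes the P95/error-rate verdict:

```yaml
# Sketch of an Argo Rollouts canary; weights and pauses are illustrative.
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: api
spec:
  strategy:
    canary:
      steps:
        - setWeight: 10              # shift 10% of traffic to the new version
        - pause: {duration: 5m}
        - analysis:
            templates:
              - templateName: slo-check   # automated SLO verdict (hypothetical template)
        - setWeight: 50
        - pause: {duration: 10m}
```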
7) Helm: best practices
7.1 Chart structure

```text
my-service/
  Chart.yaml                # name, version (SemVer), appVersion
  values.yaml               # base values (no secrets)
  values-prod.yaml          # prod overrides (no secrets)
  templates/
    _helpers.tpl            # naming, shared template snippets
    deployment.yaml
    service.yaml
    hpa.yaml
    pdb.yaml
    networkpolicy.yaml
    serviceaccount.yaml
    ingress_or_gateway.yaml
  charts/                   # dependencies (optional)
```
Recommendations:
- `version` is the chart version (SemVer); `appVersion` is the application (image) version.
- Stable resource names via `{{ include "svc.fullname" . }}` plus `app.kubernetes.io/` labels.
- Required manifests: Deployment/StatefulSet, Service, ServiceAccount, HPA (if applicable), PDB, NetworkPolicy.
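A minimal `_helpers.tpl` in this spirit (the `svc.*` helper names match the includes used elsewhere in this section; the exact label set is a common convention, not mandatory):

```yaml
{{/* templates/_helpers.tpl: illustrative naming and label helpers */}}
{{- define "svc.fullname" -}}
{{- printf "%s-%s" .Release.Name .Chart.Name | trunc 63 | trimSuffix "-" -}}
{{- end -}}

{{- define "svc.selectorLabels" -}}
app.kubernetes.io/name: {{ .Chart.Name }}
app.kubernetes.io/instance: {{ .Release.Name }}
{{- end -}}

{{- define "svc.labels" -}}
{{ include "svc.selectorLabels" . }}
app.kubernetes.io/version: {{ .Chart.AppVersion | quote }}
app.kubernetes.io/managed-by: {{ .Release.Service }}
{{- end -}}
```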
7.2 Values strategy
Base `values.yaml`: defaults only, no secrets or environment specifics.
Overrides: `values-{stage,prod}.yaml` plus per-region files.
Secrets: SOPS (`values-prod.sops.yaml`) or Vault injection via CSI.
Resource and probe parameters live in values with sensible defaults.
7.3 Dependencies and shared code
Library charts for common patterns (probes, annotations, NetworkPolicy).
Pin dependencies (`dependencies` in `Chart.yaml`) by version; avoid deeply nested chart hierarchies.
7.4 Templates and checks
Use `required` and `fail` in `_helpers.tpl` for critical values.
Validate values with `values.schema.json`.
Unit-test charts with `helm unittest`; static analysis with kubeconform/kubeval.
Debug locally with `helm template` + `--values` + kubeconform.
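A small `values.schema.json` in this spirit; the fields mirror the Deployment fragment in section 8, and the scope shown here is deliberately partial:

```json
{
  "$schema": "https://json-schema.org/draft-07/schema#",
  "type": "object",
  "required": ["image", "resources"],
  "properties": {
    "replicas": { "type": "integer", "minimum": 1 },
    "image": {
      "type": "object",
      "required": ["repository", "tag"],
      "properties": {
        "repository": { "type": "string" },
        "tag": { "type": "string", "not": { "const": "latest" } }
      }
    },
    "resources": { "type": "object" }
  }
}
```

With this in place, `helm install`/`helm template` fails fast on a missing image tag or a `latest` tag instead of shipping a broken release.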
7.5 Releases and storage
Push charts to OCI registries; tag with SemVer.
Helmfile (`helmfile.d`) for orchestrating multi-chart bundles.
CI artifacts: rendered manifests plus the dependency lockfile.
8) Example: Deployment (Helm template fragment)
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ include "svc.fullname" . }}
  labels: {{ include "svc.labels" . | nindent 4 }}
spec:
  replicas: {{ .Values.replicas | default 3 }}
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
  selector:
    matchLabels: {{ include "svc.selectorLabels" . | nindent 6 }}
  template:
    metadata:
      labels: {{ include "svc.selectorLabels" . | nindent 8 }}
      annotations:
        checksum/config: {{ include (print $.Template.BasePath "/configmap.yaml") . | sha256sum }}
    spec:
      serviceAccountName: {{ include "svc.serviceAccountName" . }}
      securityContext:
        runAsNonRoot: true
      containers:
        - name: app
          image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"
          imagePullPolicy: IfNotPresent
          ports:
            - name: http
              containerPort: {{ .Values.ports.http }}
          resources:
            requests:
              cpu: {{ .Values.resources.requests.cpu }}
              memory: {{ .Values.resources.requests.memory }}
            limits:
              cpu: {{ .Values.resources.limits.cpu }}
              memory: {{ .Values.resources.limits.memory }}
          readinessProbe:
            httpGet:
              path: /healthz
              port: http
            periodSeconds: 5
          envFrom:
            - secretRef:
                name: {{ include "svc.secretsName" . }}
```
9) Secrets and configurations
Secrets via CSI (Vault/KMS) or SOPS in the Git repository (GPG/KMS keys); `kubectl edit` on secrets is forbidden.
ConfigMap/Secret checksum annotations to trigger rolling restarts on change.
Do not store PAN/PII; use tokenization.
Sealed Secrets are acceptable, but SOPS or direct CSI is preferred.
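A Vault-backed mount via the secrets-store CSI driver could be sketched as below; the Vault address, role name, and secret paths are all placeholders:

```yaml
# Sketch of a SecretProviderClass for the Vault CSI provider.
apiVersion: secrets-store.csi.x-k8s.io/v1
kind: SecretProviderClass
metadata:
  name: wallet-secrets
  namespace: payments
spec:
  provider: vault
  parameters:
    roleName: "wallet"                          # Vault Kubernetes-auth role (placeholder)
    vaultAddress: "https://vault.internal:8200" # placeholder address
    objects: |
      - objectName: "db-password"
        secretPath: "secret/data/wallet/db"
        secretKey: "password"
```

Pods then reference this class through a `csi` volume, and the secret material never lands in Git or in `values.yaml`.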
10) Network and perimeter
Gateway API for L7 routing, canaries, and blue-green; sticky sessions only when strictly necessary.
mTLS between services via a mesh or sidecar-less (Cilium); an opt-in point for the payment core.
Egress: an allowlist of external endpoints (PSP/KYC), fixed NAT IPs, timeouts, and a retry budget.
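An egress allowlist in plain NetworkPolicy terms might look like this; the CIDR (a documentation range standing in for the PSP) and ports are placeholders:

```yaml
# Only the PSP connector Pods may talk out, and only to the PSP range and DNS.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: egress-psp-only
  namespace: payments
spec:
  podSelector:
    matchLabels:
      app: psp-connector
  policyTypes: ["Egress"]
  egress:
    - to:
        - ipBlock:
            cidr: 203.0.113.0/24   # placeholder for the PSP's published range
      ports:
        - protocol: TCP
          port: 443
    - ports:
        - protocol: UDP
          port: 53                 # allow DNS resolution
```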
11) Stateful services and data
For OLTP databases, use managed cloud services or operators (Postgres/MySQL) in separate clusters.
PVC/CSI with snapshot and backup policies; `podAntiAffinity` for replicas.
For queues/streaming, managed solutions or dedicated clusters; keep state in the shared application cluster to a minimum.
12) CI/CD pipeline (reference)
1. Build & test →
2. SCA/lint →
3. Push image to the registry (SBOM, signature) →
4. Render Helm chart + `helm unittest` + kubeconform →
5. SOPS decryption in CD →
6. PR to the GitOps repository →
7. Argo/Flux applies →
8. Argo Rollouts canary →
9. Automatic SLO verdict →
10. Promotion/rollback.
13) Platform maturity metrics
Share of releases via GitOps (target: 100%).
Rollout time (P95) to ready; rollback MTTR.
Coverage of Namespace Pod Security and NetworkPolicy (target: 100%).
% of services with HPA and correct requests/limits.
% of charts with `values.schema.json` and unit tests.
Incidents caused by "manual" changes (target: 0).
14) Implementation checklist
1. Clusters by zones, pool of nodes by profiles; PDB/TopologySpreadConstraints.
2. Namespace model, ResourceQuota/LimitRange, RBAC minimum.
3. Pod Security (Restricted) and default-deny NetworkPolicy.
4. Gateway API/Ingress; egress control and fixed NAT IPs toward providers.
5. Observability: OTel traces, RED/USE, geo-distributed synthetics; SLO dashboards.
6. GitOps (Argo/Flux), canary/blue-green, metric-driven auto-promotion.
7. Helm standards: structure, `values.schema.json`, tests, SOPS/Vault, OCI registries.
8. HPA/VPA, Cluster Autoscaler, warm pools for peaks.
9. Data operations: CSI snapshots, backups, managed databases/operators.
10. Regular DR/chaos tests and game days.
15) Anti-patterns
One "giant" cluster for everything without isolation and quotas.
Containers without resource limits, `latest` tags, no probes.
Plaintext secrets in `values.yaml`; `kubectl edit` in prod.
Releases bypassing GitOps; manual manifest edits on a live cluster.
No NetworkPolicy/Pod Security: a "flat" network.
A single CPU-based HPA signal for very different load types.
Running OLTP databases inside the shared application cluster without an operator and backups.
16) iGaming context/fintech: practical notes
Payment webhooks: a dedicated ingress/gateway and narrow egress to PSPs; strict timeouts/retries; a separate node pool.
VIP traffic: prioritization and dedicated routes; PDB and topology spread for stability.
Tournaments/peaks: warm node pools + predictive HPA; pre-warmed caches/connections.
Reporting/CDC: a separate cluster/pool so ETL does not affect prod.
Regulatory: immutable logs (WORM), PII tokenization, network segmentation.
Summary
A strong Kubernetes platform is not a "YAML heap" but a set of standards: isolation, security policies, managed resources, observability, and GitOps discipline. Helm charts are your delivery contract: predictable releases, testable patterns, secure secret handling, and simple rollbacks. By adopting these principles, you get clusters that survive peaks, accelerate releases, and withstand business and regulatory demands.