Kubernetes: Clusters and Helm charts
1) Cluster architecture - overview
Control plane: 'kube-apiserver', 'etcd', 'kube-scheduler', 'kube-controller-manager' (partly hidden in managed clouds).
Worker nodes: 'kubelet', CRI runtime (containerd/CRI-O), CNI plugin, kube-proxy or an eBPF-based replacement.
In-cluster network: Pod-to-Pod, Service VIP (ClusterIP), DNS via CoreDNS.
Storage: CSI drivers, dynamic PVC → PV provisioning (StorageClass).
Failure domains: node/AZ/region. Spread replicas across zones ('topologySpreadConstraints'/anti-affinity).
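As a rough illustration of zone-aware placement, the fragment below sketches how a Pod template might spread replicas across zones and nodes. The label 'app: api' and the skew values are assumptions for the example, not recommendations.

```yaml
# Fragment of a Pod template (placed under spec.template.spec of a Deployment).
# Assumption: the Pods carry the label app: api.
topologySpreadConstraints:
  # Hard rule: zones may differ by at most one replica.
  - maxSkew: 1
    topologyKey: topology.kubernetes.io/zone
    whenUnsatisfiable: DoNotSchedule
    labelSelector:
      matchLabels: { app: api }
  # Soft rule: prefer different nodes, but still schedule if that is impossible.
  - maxSkew: 1
    topologyKey: kubernetes.io/hostname
    whenUnsatisfiable: ScheduleAnyway
    labelSelector:
      matchLabels: { app: api }
```

'DoNotSchedule' on the zone constraint trades schedulability for availability: if a zone has no capacity, new Pods stay Pending instead of piling up elsewhere.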
Typical roles
Platform team: creates/upgrades clusters, owns CNI/CSI/Ingress, policies and GitOps.
Product teams: ship charts/releases, follow security policies and resource limits.
2) Cluster lifecycle
Creation: kOps, kubeadm, Rancher, EKS/AKS/GKE. Enable OIDC authentication and auditing right away.
Upgrades: one minor version at a time (control plane first, then nodes), with node rollout limited by 'maxUnavailable' and tested on staging first.
Add-ons (all via Helm/GitOps): CNI (Calico/Cilium), CSI driver, Ingress controller (NGINX/Gateway API/Contour/Traefik), Metrics-Server, Cluster-Autoscaler, Node-Local DNS, logging/metrics/tracing.
Backups: etcd snapshots (if self-managed), Velero for namespaces/PVCs.
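A scheduled Velero backup for one namespace could look roughly like the sketch below. It assumes Velero is installed in the 'velero' namespace with a working backup location; the namespace 'payments', the schedule and the retention are hypothetical.

```yaml
apiVersion: velero.io/v1
kind: Schedule
metadata:
  name: nightly-payments        # hypothetical name
  namespace: velero             # assumption: default Velero install namespace
spec:
  schedule: "0 2 * * *"         # cron syntax: every night at 02:00
  template:
    includedNamespaces: ["payments"]   # hypothetical application namespace
    snapshotVolumes: true              # also snapshot the PVs behind the PVCs
    ttl: 168h0m0s                      # keep each backup for 7 days
```

A backup is only useful once a restore has been rehearsed, so a schedule like this is usually paired with periodic restore tests into a scratch namespace.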
3) Networks, services and ingress
CNI: Calico (NetworkPolicy), Cilium (eBPF/service mesh features).
Service: 'ClusterIP', 'NodePort', 'LoadBalancer' (L4 cloud balancing), 'ExternalName'.
Ingress/Gateway API: L7 routing, TLS, rate limiting/WAF at the perimeter.
NetworkPolicy: deny-all by default + explicit allow rules by namespace/label.
Headless Service ('clusterIP: None') for StatefulSets and service discovery.
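The "deny-all plus explicit allow" approach can be sketched with two NetworkPolicy objects. The namespace 'payments', the label 'app: api' and the 'ingress-nginx' namespace are assumptions chosen for the example.

```yaml
# 1) Default deny: selects every Pod in the namespace and allows no traffic.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata: { name: default-deny-all, namespace: payments }
spec:
  podSelector: {}
  policyTypes: ["Ingress", "Egress"]
---
# 2) Explicit allow: only the ingress controller namespace may reach the API Pods.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata: { name: allow-ingress-to-api, namespace: payments }
spec:
  podSelector: { matchLabels: { app: api } }
  policyTypes: ["Ingress"]
  ingress:
    - from:
        - namespaceSelector:
            matchLabels: { kubernetes.io/metadata.name: ingress-nginx }
      ports:
        - { protocol: TCP, port: 8080 }
```

Because the first policy also blocks egress, a real setup would additionally need an allow rule for DNS (and for any outbound dependencies) before workloads can resolve names.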
4) Storage (CSI) and state
StorageClass: 'reclaimPolicy', 'volumeBindingMode' ('WaitForFirstConsumer' for better placement).
StatefulSet: stable names/volumes ('volumeClaimTemplates'), 'podManagementPolicy: Parallel' for faster startup and scaling.
ReadWriteMany: use a distributed file system (EFS/Filestore) with care and assess latency first.
Snapshots: 'VolumeSnapshotClass' + scheduled backups.
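A StorageClass and matching VolumeSnapshotClass might be declared as in the sketch below. The AWS EBS CSI driver and the 'gp3' volume type are assumptions; substitute the provisioner and parameters of the CSI driver actually in use.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata: { name: fast-ssd }
provisioner: ebs.csi.aws.com              # assumption: AWS EBS CSI driver
parameters: { type: gp3 }                 # driver-specific parameter
reclaimPolicy: Delete                     # PV is removed together with the PVC
volumeBindingMode: WaitForFirstConsumer   # provision in the zone where the Pod lands
allowVolumeExpansion: true
---
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata: { name: fast-ssd-snapshots }    # hypothetical name
driver: ebs.csi.aws.com                   # must match the provisioner above
deletionPolicy: Retain                    # keep the snapshot if the object is deleted
```

'WaitForFirstConsumer' delays volume creation until scheduling, which avoids a volume being created in a zone where the Pod cannot run.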
5) Multi-tenancy and policies
Namespaces by product/environment.
RBAC: minimal roles, separate service accounts, 'Role'/'RoleBinding' instead of 'ClusterRole' where possible.
PSA (Pod Security Admission): 'baseline'/'restricted' levels (PSP replacement).
ResourceQuota / LimitRange: caps on CPU/memory/PVC/LoadBalancer.
OPA Gatekeeper/Kyverno: admission policies (e.g. forbid ':latest', require 'resources' and 'readOnlyRootFilesystem').
ImagePolicy/webhooks: image signature verification (cosign/policy-controller).
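For a single tenant namespace, the PSA level and resource caps listed above could be combined roughly as follows. The namespace name and all numbers are placeholder assumptions.

```yaml
# Namespace with Pod Security Admission enforced at the restricted level.
apiVersion: v1
kind: Namespace
metadata:
  name: payments
  labels:
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/warn: restricted
---
# Hard caps for the whole namespace.
apiVersion: v1
kind: ResourceQuota
metadata: { name: payments-quota, namespace: payments }
spec:
  hard:
    requests.cpu: "8"
    requests.memory: 16Gi
    limits.cpu: "16"
    limits.memory: 32Gi
    persistentvolumeclaims: "10"
    services.loadbalancers: "1"
---
# Defaults applied to containers that do not declare their own resources.
apiVersion: v1
kind: LimitRange
metadata: { name: payments-defaults, namespace: payments }
spec:
  limits:
    - type: Container
      defaultRequest: { cpu: 100m, memory: 128Mi }
      default: { cpu: 500m, memory: 256Mi }
```

Once a ResourceQuota covers CPU and memory, Pods without requests/limits are rejected, so the LimitRange defaults keep unannotated workloads deployable.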
6) Observability and operation
Metrics: Prometheus stack, kube-state-metrics, node exporters.
Logs: Fluent Bit/Vector → object storage/Elasticsearch/OpenSearch, log rotation on nodes.
Traces: OpenTelemetry Collector.
SLO dashboards: RED model on ingress and key services.
Autoscaling: HPA (on application metrics), VPA for background workloads, Cluster Autoscaler for nodes.
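An HPA driven by an application metric might look like the sketch below. The metric name 'http_requests_per_second' is hypothetical and assumes a custom metrics adapter (for example prometheus-adapter) exposes it; the CPU target is included as a fallback signal.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata: { name: api }
spec:
  scaleTargetRef: { apiVersion: apps/v1, kind: Deployment, name: api }
  minReplicas: 3
  maxReplicas: 10
  metrics:
    # Application metric, averaged per Pod (requires a custom metrics adapter).
    - type: Pods
      pods:
        metric: { name: http_requests_per_second }   # hypothetical metric name
        target: { type: AverageValue, averageValue: "50" }
    # Resource metric served by Metrics-Server.
    - type: Resource
      resource:
        name: cpu
        target: { type: Utilization, averageUtilization: 70 }
```

With several metrics configured, the HPA computes a desired replica count for each and uses the largest one.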
7) Manifest patterns (cheat sheet)
Deployment:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata: { name: api, labels: { app: api } }
spec:
  replicas: 3
  strategy: { type: RollingUpdate, rollingUpdate: { maxUnavailable: 0, maxSurge: 1 } }
  selector: { matchLabels: { app: api } }
  template:
    metadata:
      labels: { app: api }
    spec:
      serviceAccountName: api-sa
      securityContext: { runAsNonRoot: true, fsGroup: 2000 }
      containers:
        - name: api
          image: registry.example.com/api:1.2.3
          ports: [{ containerPort: 8080 }]
          resources: { requests: { cpu: "200m", memory: "256Mi" }, limits: { cpu: "1", memory: "512Mi" } }
          readinessProbe: { httpGet: { path: /healthz, port: 8080 }, periodSeconds: 5 }
          livenessProbe: { httpGet: { path: /livez, port: 8080 }, initialDelaySeconds: 20 }
```
StatefulSet (snippet):
```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata: { name: db }
spec:
  serviceName: db
  replicas: 3
  podManagementPolicy: Parallel
  selector: { matchLabels: { app: db } }
  template:
    metadata: { labels: { app: db } }
    spec:
      containers:
        - name: db
          image: postgres:16-alpine
          volumeMounts: [{ name: data, mountPath: /var/lib/postgresql/data }]
  volumeClaimTemplates:
    - metadata: { name: data }
      spec:
        accessModes: ["ReadWriteOnce"]
        resources: { requests: { storage: 100Gi } }
        storageClassName: fast-ssd
```
PDB (PodDisruptionBudget):
```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata: { name: api-pdb }
spec:
  minAvailable: 2
  selector: { matchLabels: { app: api } }
```
Ingress (NGINX, brief):
```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: api
  annotations:
    nginx.ingress.kubernetes.io/proxy-read-timeout: "30"
spec:
  tls: [{ hosts: ["api.example.com"], secretName: api-tls }]
  rules:
    - host: api.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend: { service: { name: api, port: { number: 80 } } }
```
8) Helm v3 - basics and structure
Chart = templates + values + metadata.
mychart/
  Chart.yaml            # name, version (semver), type (application/library), dependencies
  values.yaml           # default values
  values.schema.json    # (recommended) values validation
  templates/            # Go-templated .yaml files (Deployment, Service, Ingress, …)
  templates/tests/      # helm tests (smoke)
  charts/               # local dependencies (or OCI dependencies)
Chart.yaml (example):
```yaml
apiVersion: v2
name: api
description: API service
type: application
version: 1.4.0        # chart version (semver)
appVersion: "1.2.3"   # application version
dependencies:
  - name: redis
    version: 17.x.x
    repository: "oci://registry.example.com/charts"
```
9) Helm Templates - Practices
Use helpers in '_helpers.tpl' for names/labels/annotations.
Specify 'resources', 'securityContext', 'readiness/liveness' everywhere.
Generate labels according to a standardized scheme ('app.kubernetes.io/*').
Make features optional through 'values' (ingress/hpa/pdb/servicemonitor).
Include 'values.schema.json' to reject invalid configs early.
For sensitive data, use Secrets from external operators (External Secrets, SOPS) rather than storing them in values.
```gotmpl
{{- define "api.fullname" -}}
{{- printf "%s-%s" .Chart.Name .Release.Name | trunc 63 | trimSuffix "-" -}}
{{- end -}}
```
Deployment.tpl (fragment):
```gotmpl
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ include "api.fullname" . }}
  labels: {{- include "api.labels" . | nindent 4 }}
spec:
  replicas: {{ .Values.replicaCount }}
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
  selector:
    matchLabels: {{- include "api.selectorLabels" . | nindent 6 }}
  template:
    metadata:
      labels: {{- include "api.selectorLabels" . | nindent 8 }}
    spec:
      serviceAccountName: {{ include "api.serviceAccountName" . }}
      securityContext: {{- toYaml .Values.podSecurityContext | nindent 8 }}
      containers:
        - name: {{ .Chart.Name }}
          image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"
          imagePullPolicy: IfNotPresent
          ports: [{ containerPort: {{ .Values.service.port }} }]
          # nindent must be deeper than the key it fills (the key sits at column 10)
          resources: {{- toYaml .Values.resources | nindent 12 }}
          envFrom:
            - secretRef: { name: {{ .Values.secretsRef }} }
```
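A 'values.yaml' that would satisfy the template fragment above might look like this sketch. The keys mirror the '.Values' paths used in the fragment; the concrete values (registry, tag, Secret name) are assumptions.

```yaml
# values.yaml: defaults consumed by the Deployment template
replicaCount: 3

image:
  repository: registry.example.com/api
  tag: "1.2.3"              # pinned tag, never "latest"

service:
  port: 8080                # rendered as containerPort

podSecurityContext:
  runAsNonRoot: true
  fsGroup: 2000

resources:
  requests: { cpu: 200m, memory: 256Mi }
  limits: { cpu: "1", memory: 512Mi }

secretsRef: api-secrets     # hypothetical Secret created outside the chart
```

Environment-specific files then only need to override the handful of keys that differ, for example 'replicaCount' and 'resources'.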
10) Dependencies, repositories and OCI
Helm v3 supports OCI registries: 'oci://registry/org/charts'.
Constrain dependency versions ('^1.2.0', '~1.2') and run 'helm dependency build'.
Sign charts (provenance '.prov' files) and store artifacts in the CI artifact repository.
Library charts: shared templates (ingress/servicemonitor) for reuse.
11) Hooks, CRD and order of operations
Hooks: 'pre-install', 'post-install', 'pre-upgrade', 'post-upgrade', 'test'. Add hook delete policies ('before-hook-creation', 'hook-succeeded').
CRDs: put them in 'crds/' (not in 'templates/'); avoid updating CRDs on the fly and migrate them separately.
Database migrations/initialization: run as a Job hook that is idempotent and has timeouts.
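A migration hook along these lines could be sketched as a plain Job with Helm hook annotations. The Job name, image and migration command are hypothetical; in a real chart the name and image would come from helpers and values.

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: api-db-migrate                  # hypothetical name
  annotations:
    "helm.sh/hook": pre-install,pre-upgrade
    "helm.sh/hook-weight": "0"          # lower weights run first
    "helm.sh/hook-delete-policy": before-hook-creation,hook-succeeded
spec:
  backoffLimit: 1                       # retry once, then fail the release
  activeDeadlineSeconds: 600            # timeout for the whole Job
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: migrate
          image: registry.example.com/api:1.2.3
          command: ["/app/migrate", "up"]   # hypothetical, must be idempotent
```

Omitting 'hook-failed' from the delete policy keeps a failed Job around for log inspection, and 'before-hook-creation' cleans it up on the next attempt.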
12) Chart and CI testing
'helm lint' + schema validation.
helm-unittest (unit tests), chart-testing ('ct'): build/install in kind/minikube in CI.
Snapshot tests of templates ('helm template' → compare with a stored reference output).
Smoke tests via 'helm test' (starts a 'Pod' that runs checks).
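A unit test suite for the helm-unittest plugin might look like the following sketch. It assumes the Deployment template is saved as 'templates/deployment.yaml' and uses the value keys from the template fragment in section 9; the test file path is a convention, not a requirement.

```yaml
# tests/deployment_test.yaml
suite: deployment
templates:
  - deployment.yaml            # assumption: template file name
tests:
  - it: renders the replica count from values
    set:
      replicaCount: 5
    asserts:
      - isKind:
          of: Deployment
      - equal:
          path: spec.replicas
          value: 5

  - it: builds the image reference from repository and tag
    set:
      image:
        repository: registry.example.com/api
        tag: "1.2.3"
    asserts:
      - equal:
          path: spec.template.spec.containers[0].image
          value: registry.example.com/api:1.2.3
```

Tests like these run against rendered templates only, so they are fast enough for every commit; installation tests with 'ct' remain the place to catch cluster-side problems.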
13) GitOps (Argo CD/Flux)
The source of truth is the repository. The chart is declared as a HelmRelease/HelmChart (Flux) or an Application (Argo).
Sync policies: auto-sync with prune/self-heal, statuses and health checks.
Version promotion: tag bots/semver ranges, PR flow.
Split the repo into apps (charts) and env (overrides/values).
Secret management: SOPS (age/KMS), External Secrets.
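With Argo CD, a chart deployment with auto-sync could be declared roughly as below. The repository URL, paths, values file and target namespace are hypothetical; Flux users would express the same idea with a HelmRelease.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: api-prod
  namespace: argocd                 # assumption: default Argo CD namespace
spec:
  project: default
  source:
    repoURL: https://git.example.com/platform/apps.git   # hypothetical repo
    targetRevision: main
    path: charts/api                # chart directory inside the repo
    helm:
      valueFiles:
        - values-prod.yaml          # hypothetical environment override
  destination:
    server: https://kubernetes.default.svc
    namespace: payments
  syncPolicy:
    automated:
      prune: true                   # delete resources removed from Git
      selfHeal: true                # revert manual drift in the cluster
```

Enabling 'prune' and 'selfHeal' is what makes Git the actual source of truth: anything applied by hand is reverted on the next reconciliation.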
14) Security: minimum requirements
PSA restricted: no privileged containers, no hostPath, dropped capabilities; enforce a read-only rootfs separately via admission policy, since the PSA profile does not cover it.
ImagePolicy - signed/trusted images only.
NetworkPolicy: deny by default.
RBAC: per-app service account, 'Role'/'RoleBinding' in the namespace.
Admission control: Gatekeeper/Kyverno rules (resources/limits, labels, no ':latest').
Secrets: SOPS/External Secrets; do not put secrets in values/plain Git.
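The "no latest" rule can be sketched as a Kyverno ClusterPolicy. This follows the shape of Kyverno's commonly published sample policy; field names such as 'validationFailureAction' have changed between Kyverno releases, so treat it as an assumption to check against the installed version.

```yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: disallow-latest-tag
spec:
  validationFailureAction: Enforce   # reject instead of only reporting
  background: true                   # also report on existing resources
  rules:
    - name: require-pinned-image-tag
      match:
        any:
          - resources:
              kinds: ["Pod"]
      validate:
        message: "Images must use a pinned tag, not ':latest'."
        pattern:
          spec:
            containers:
              - image: "!*:latest"   # every container image must not end in :latest
```

Starting with 'Audit' instead of 'Enforce' is a common way to measure the impact before the policy begins rejecting workloads.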
15) Anti-patterns
':latest' tags in charts and images; no 'values.schema.json'.
One huge chart "for everything" instead of modular charts.
CRDs updated via 'templates/' → chaos on upgrades.
Hard-coded names/ports/namespaces in templates.
Missing resource requests/limits and probes → latency drift and instability.
No PDB → zero downtime is impossible during drains/upgrades.
Secrets in Git without encryption; manifests without policy checks.
16) Implementation checklist (0-45 days)
0-10 days
Create a basic chart skeleton with '_helpers.tpl', labels, probes, resources, and optional PDB/Ingress.
Enable PSA restricted, NetworkPolicy deny-all, ResourceQuota/LimitRange.
Configure GitOps (Argo/Flux), a private registry, and image/chart signing.
11-25 days
Split the chart into modules/dependencies, add 'values.schema.json' and tests ('helm lint', unit, ct).
Connect observability (ServiceMonitor/PodMonitor), log agents, OTel.
Introduce the upgrade process: staging → canary → prod, migration hooks with rollback.
26-45 days
Automate dependency updates (bots/semver-ranges + PR).
Add Gatekeeper/Kyverno policies and policy reports to CI.
Document the cluster upgrade runbook, DR procedures (Velero/etcd snapshot).
17) Maturity metrics
100% of applications are deployed via Helm/GitOps, with no manual 'kubectl apply'.
All charts have 'values.schema.json', tests, a signature, and committed dependency versions.
PSA restricted and NetworkPolicy are enabled in all namespaces.
PDB and HPA are present in all critical services.
SOPS/External Secrets, a no-':latest' policy, and image signing are in place.
Cluster and chart upgrades are performed without downtime (canary/blue-green), restore tests are regular.
18) Conclusion
A strong Kubernetes foundation = robust cluster architecture + strict policies + production-grade Helm charts managed by GitOps. Standardize templates, secure the environment with PSA/NetworkPolicy/RBAC, validate values, and automate tests, signing, and promotion. Upgrades and releases then become predictable, and the platform stays stable and convenient for product teams.