Containerization: Docker and OCI
1) Basic OCI concepts and standards
OCI Image Spec - image format (manifest, config, layers, index for multi-arch).
OCI Runtime Spec - how to run a container (bundle, `config.json`); implementations: runc, as well as gVisor and Kata Containers.
OCI Distribution Spec - interaction with registries (push/pull, authorization).
Docker = UX and the ecosystem around OCI: Dockerfile/BuildKit/CLI/Compose/Hub. In Kubernetes, Docker Engine is replaced by containerd/CRI-O, but the image format is the same.
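To make the Image Spec pieces concrete, here is a minimal Python sketch of content addressing: each blob is referenced by a descriptor holding its media type, `sha256:` digest, and size. The helper name `make_descriptor` and the toy config are illustrative assumptions, not part of Docker or any OCI tooling.

```python
import hashlib
import json


def make_descriptor(blob: bytes, media_type: str) -> dict:
    """Build an OCI-style content descriptor for a blob (illustrative helper)."""
    return {
        "mediaType": media_type,
        "digest": "sha256:" + hashlib.sha256(blob).hexdigest(),
        "size": len(blob),
    }


# A toy image config; real configs carry more fields (rootfs, history, ...).
config_blob = json.dumps(
    {"architecture": "amd64", "os": "linux", "config": {"Entrypoint": ["/app"]}},
    sort_keys=True,
).encode()

manifest = {
    "schemaVersion": 2,
    "mediaType": "application/vnd.oci.image.manifest.v1+json",
    "config": make_descriptor(config_blob, "application/vnd.oci.image.config.v1+json"),
    "layers": [],  # layer descriptors (tar blobs) would be listed here
}

print(json.dumps(manifest, indent=2))
```

Because the digest is derived from the bytes, any change to a layer or config yields a different digest, which is what makes pinning by digest reliable.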
2) Images: layers, tags, metadata
Image = layers (layered filesystem) + config (entrypoint/cmd/env/labels) + manifest.
Tags: do not use `:latest` in prod; pin to `:1.21.3`, a git SHA, or date + SHA.
LABEL: owner, contact, vcs-url, `org.opencontainers.image.*` (title, description, revision, source).
Multi-arch: the index manifest points the client to the matching variant for `amd64`/`arm64`.
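The multi-arch lookup can be sketched in a few lines of Python. This is a simplified illustration, assuming an index that only distinguishes `os` and `architecture` (real clients also consider variants); the function name `select_manifest` and the digests are placeholders.

```python
from typing import Optional

# A trimmed image index; the digests are placeholders, not real images.
INDEX = {
    "schemaVersion": 2,
    "mediaType": "application/vnd.oci.image.index.v1+json",
    "manifests": [
        {"digest": "sha256:" + "a" * 64,
         "platform": {"os": "linux", "architecture": "amd64"}},
        {"digest": "sha256:" + "b" * 64,
         "platform": {"os": "linux", "architecture": "arm64"}},
    ],
}


def select_manifest(index: dict, os_name: str, arch: str) -> Optional[str]:
    """Return the digest of the manifest matching the platform, or None."""
    for entry in index.get("manifests", []):
        platform = entry.get("platform", {})
        if platform.get("os") == os_name and platform.get("architecture") == arch:
            return entry["digest"]
    return None


print(select_manifest(INDEX, "linux", "arm64"))
```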
3) Build: Dockerfile, BuildKit, multi-stage
3.1 Principles
Minimize layers, pin versions, clean package manager caches.
Copy the manifest/lock files first and install dependencies in a separate `RUN` step before copying the rest of the source; this improves cache reuse.
`.dockerignore` is required (exclude `.git`, artifacts, secrets).
Prefer minimal base images such as distroless or alpine.
3.2 BuildKit features
Parallel builds, build-time secrets (`--secret`), cache mounts, buildx for multi-arch.
Example of a cache mount:
```dockerfile
# syntax=docker/dockerfile:1.6
RUN --mount=type=cache,target=/root/.cache/pip pip install -r requirements.txt
```
3.3 Multi-stage examples
Go (statically linked, distroless):
```dockerfile
# syntax=docker/dockerfile:1.6
FROM golang:1.23 AS build
WORKDIR /src
# Copy module files first so the dependency layer stays cached
COPY go.mod go.sum ./
RUN go mod download
COPY . .
RUN CGO_ENABLED=0 GOOS=linux GOARCH=amd64 go build -ldflags="-s -w" -o /app
# Runtime stage: static distroless base, non-root user
FROM gcr.io/distroless/static:nonroot
USER 65532:65532
COPY --from=build /app /app
ENTRYPOINT ["/app"]
```
Node.js (prod layer without dev-deps):
```dockerfile
# syntax=docker/dockerfile:1.6
# Production dependencies only (npm ci requires the lockfile)
FROM node:22-alpine AS deps
WORKDIR /app
COPY package*.json ./
RUN npm ci --omit=dev
# Build stage: full install, since build tooling usually lives in devDependencies
FROM node:22-alpine AS build
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build
# Runtime stage: prod node_modules + build output, non-root user
FROM node:22-alpine
WORKDIR /app
ENV NODE_ENV=production
COPY --from=deps /app/node_modules ./node_modules
COPY --from=build /app/dist ./dist
USER node
CMD ["node","dist/server.js"]
```
Python (wheel cache, non-root):
```dockerfile
# syntax=docker/dockerfile:1.6
FROM python:3.12-slim AS base
ENV PYTHONDONTWRITEBYTECODE=1 PYTHONUNBUFFERED=1
WORKDIR /app
# Build wheels once, with the pip cache kept in a BuildKit cache mount
FROM base AS deps
RUN --mount=type=cache,target=/root/.cache/pip pip install --upgrade pip
COPY requirements.txt .
RUN --mount=type=cache,target=/root/.cache/pip pip wheel --wheel-dir=/wheels -r requirements.txt
# Runtime stage: install from the prebuilt wheels only, run as non-root
FROM base
COPY --from=deps /wheels /wheels
# requirements.txt must be present in this stage before the install step
COPY requirements.txt .
RUN pip install --no-index --find-links=/wheels -r /app/requirements.txt && rm -rf /wheels
COPY . .
USER 1000:1000
CMD ["python","-m","app"]
```
Java (JLink/Layered Spring):
```dockerfile
# syntax=docker/dockerfile:1.6
FROM maven:3.9-eclipse-temurin-21 AS build
WORKDIR /src
# Resolve dependencies before copying sources so this layer stays cached
COPY pom.xml ./
RUN mvn -q -e -DskipTests dependency:go-offline
COPY . .
RUN mvn -q -DskipTests package
# Runtime stage: JRE only
FROM eclipse-temurin:21-jre
WORKDIR /app
COPY --from=build /src/target/app.jar /app/app.jar
ENTRYPOINT ["java","-XX:+UseContainerSupport","-jar","/app/app.jar"]
```
4) Minimal images, PID 1 and signals
Distroless - smaller attack surface, no shell/package manager.
PID 1 must forward signals and reap child processes correctly, otherwise zombie processes accumulate. Use `ENTRYPOINT` in exec form with an init such as tini:
```dockerfile
ENTRYPOINT ["tini","--","/app"]
```
Keep `HEALTHCHECK` reasonable (sensible interval/timeout, no unnecessary load).
5) Container security
5.1 Policies and hardening
Non-root (USER), rootless Docker/containers.
Capabilities: drop everything you do not need (`--cap-drop=ALL --cap-add=NET_BIND_SERVICE`, etc.).
seccomp/AppArmor/SELinux: enable default or strict profiles.
Read-only FS + `tmpfs` for `/tmp`, no-new-privileges.
Secrets: never in images; mount them from a secrets manager (K8s Secrets, Vault, Docker secrets).
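The hardening flags above can be collected in one place so that every service starts with the same baseline. The sketch below only assembles a `docker run` argument list and does not execute it; `hardened_run_args` and the image name `registry.example.com/api:1.21.3` are hypothetical.

```python
import shlex
from typing import List, Sequence


def hardened_run_args(image: str, user: str = "65532:65532",
                      add_caps: Sequence[str] = ()) -> List[str]:
    """Assemble a `docker run` command line with a restrictive baseline."""
    args = [
        "docker", "run", "--rm",
        "--user", user,                           # non-root UID:GID
        "--cap-drop=ALL",                         # start with no capabilities
        "--security-opt", "no-new-privileges",    # block setuid escalation
        "--read-only",                            # read-only root filesystem
        "--tmpfs", "/tmp:rw,size=64m",            # writable scratch space only
    ]
    # Add back only the capabilities the service really needs.
    args += [f"--cap-add={cap}" for cap in add_caps]
    args.append(image)
    return args


cmd = hardened_run_args("registry.example.com/api:1.21.3",
                        add_caps=["NET_BIND_SERVICE"])
print(shlex.join(cmd))
```

Keeping the baseline in a helper (or the equivalent Compose/Kubernetes template) makes exceptions explicit: anything beyond the defaults has to be passed in by name.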
5.2 Supply chain
SBOM (CycloneDX/SPDX) and scanning (Trivy/Grype).
Signing (cosign, sigstore) and signature verification at pull time.
Update cadence: rebuild images regularly on base images that carry CVE patches.
6) Storage and filesystem drivers
Default is overlay2 (fast and stable). In rootless environments, often fuse-overlayfs.
Volumes for data and caches, bind mounts for development.
Do not write to the container's root filesystem; use a dedicated data path (`/data`) and keep state separate from the image.
7) Network and DNS
Docker networks: bridge (default), host (minimum overhead, port conflicts), none, macvlan/ipvlan (L2/L3 integration).
Docker takes its DNS resolver settings from the host or from `daemon.json`; for prod, configure a local caching resolver.
In K8s, the network is managed by CNI (Calico/Cilium/Flannel). For sidecars/service mesh, traffic is intercepted via iptables.
8) Resources and QoS (cgroups v2)
Limits: `--cpus`, `--memory`, `--pids-limit`, `--cpuset-cpus`.
Set requests/limits (in K8s) → affects scheduling and QoS.
Monitor OOMKilled, throttling, latency spikes due to GC/IO.
```bash
docker run --cpus=1.5 --memory=512m --pids-limit=256 --read-only --tmpfs /tmp:rw,size=64m ...
```
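Inside the container, these limits surface as cgroup v2 files, which is where runtimes derive thread-pool and heap sizes from. The following sketch parses `cpu.max` and `memory.max`; it assumes a cgroup v2 mount at `/sys/fs/cgroup`, and the function names are illustrative.

```python
from pathlib import Path
from typing import Optional


def parse_cpu_max(text: str) -> Optional[float]:
    """Parse cgroup v2 cpu.max ("<quota> <period>" or "max <period>") into CPUs."""
    quota, period = text.split()
    if quota == "max":
        return None  # no CPU limit set
    return int(quota) / int(period)


def parse_memory_max(text: str) -> Optional[int]:
    """Parse cgroup v2 memory.max (bytes, or "max" for unlimited)."""
    value = text.strip()
    return None if value == "max" else int(value)


def read_limits(root: Path = Path("/sys/fs/cgroup")) -> dict:
    """Read limits from a cgroup v2 mount; unreadable files are reported as None."""
    def read(name: str) -> Optional[str]:
        try:
            return (root / name).read_text()
        except OSError:
            return None

    cpu, mem = read("cpu.max"), read("memory.max")
    return {
        "cpus": parse_cpu_max(cpu) if cpu else None,
        "memory_bytes": parse_memory_max(mem) if mem else None,
    }


print(parse_cpu_max("150000 100000"))  # the quota/period pair for --cpus=1.5
print(parse_memory_max("536870912"))   # the byte value for --memory=512m
print(read_limits())
```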
9) Logs and observability
Log drivers: `json-file` (with rotation), `journald`, `gelf`, `awslogs`, `syslog`.
Set up rotation:
```json
{ "log-driver":"json-file","log-opts":{"max-size":"10m","max-file":"5"} }
```
Metrics: Docker Engine API, cAdvisor, node exporters; tracing through an agent in a container or sidecar.
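For capacity planning, the rotation settings translate into an upper bound on log disk usage per container: `max-size` times `max-file`. The sketch below works that out for the config shown earlier. Whether the engine reads `m` as 10^6 or 2^20 bytes is treated as an assumption here and exposed as the `base` parameter; the helper names are illustrative.

```python
import json

UNIT_EXPONENTS = {"k": 1, "m": 2, "g": 3}


def parse_size(text: str, base: int = 1000) -> int:
    """Convert a size such as "10m" into bytes (unit base is an assumption)."""
    text = text.strip().lower()
    if text and text[-1] in UNIT_EXPONENTS:
        return int(text[:-1]) * base ** UNIT_EXPONENTS[text[-1]]
    return int(text)


def max_log_bytes(daemon_config: dict, base: int = 1000) -> int:
    """Upper bound of json-file log usage per container: max-size * max-file."""
    opts = daemon_config["log-opts"]
    return parse_size(opts["max-size"], base) * int(opts["max-file"])


config = json.loads(
    '{ "log-driver":"json-file","log-opts":{"max-size":"10m","max-file":"5"} }'
)
print(max_log_bytes(config))             # decimal units
print(max_log_bytes(config, base=1024))  # binary units
```

Multiply the result by the number of containers on a host to size the log volume.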
10) Registries and authentication
Private registries: ECR/GCR/ACR/Harbor/GitHub Container Registry.
Docker Hub rate limits: use mirrors/caches (a pull-through registry cache).
Retention/immutable tags policy, replication between regions.
Do not store `docker login` credentials in scripts; use CI secrets and OIDC federation.
11) docker-compose vs orchestrators
Compose - local development and integration test environments.
Prod: Kubernetes (Deployment/StatefulSet/DaemonSet, Ingress, Secrets, PVC) with containerd/CRI-O; security policies and rollout strategies.
Swarm is outdated for large production deployments but still suitable for simple clusters.
```yaml
version: "3.9"
services:
  api:
    build: .
    ports: ["8080:8080"]
    environment: ["DB_URL=postgres://pg/DB"]
    depends_on: ["pg"]
  pg:
    image: postgres:16-alpine
    volumes: ["pgdata:/var/lib/postgresql/data"]
volumes: { pgdata: {} }
```
12) Healthcheck, start/stop, graceful shutdown
Use `HEALTHCHECK` with a timeout and a bounded `retries` count.
Graceful shutdown: catch SIGTERM, stop accepting new requests, close connections, then exit.
In K8s: `preStop` hook + `terminationGracePeriodSeconds`, readiness before liveness.
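The shutdown sequence can be sketched in Python as follows. This is a toy loop rather than a real server: the handler only records that SIGTERM arrived, and the main loop then stops, runs cleanup, and returns. The names `serve` and `cleanup` are illustrative assumptions.

```python
import os
import signal
import threading

shutdown_requested = threading.Event()


def handle_sigterm(signum, frame):
    """Only record the request; the main loop performs the actual cleanup."""
    shutdown_requested.set()


# Must run in the main thread; `docker stop` sends SIGTERM by default.
signal.signal(signal.SIGTERM, handle_sigterm)


def serve(poll_interval: float = 0.1, cleanup=lambda: None) -> str:
    """Toy main loop: work until SIGTERM arrives, then drain and clean up."""
    while not shutdown_requested.is_set():
        # A real server would accept and handle requests here.
        shutdown_requested.wait(poll_interval)
    cleanup()  # e.g. finish in-flight requests, close DB connections
    return "stopped"


if __name__ == "__main__":
    # Simulate the runtime sending SIGTERM to this process after 0.2 s.
    threading.Timer(0.2, os.kill, args=(os.getpid(), signal.SIGTERM)).start()
    print(serve())
```

The cleanup has to finish within the grace period (`terminationGracePeriodSeconds` in K8s, the stop timeout in Docker); after that the runtime sends SIGKILL, which cannot be handled.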
13) Best practices by language/stack (summary)
Node: `npm ci`, `NODE_ENV=production`, no dev-deps in the runtime image, `--heapsnapshot` off, uWS/gzip behind an L7 proxy.
Python: wheels, `gunicorn --graceful-timeout`, gthread/uvicorn workers sized by CPU, do not keep a venv inside a shared layer unnecessarily.
Go: CGO off (if possible), `-ldflags="-s -w"`, distroless/static, `GOMAXPROCS` set from cgroup limits.
Java: layered JAR, `-XX:MaxRAMPercentage`, CDS/Layered JAR for cache.
14) Supply chain and image policy
Generate the SBOM in CI and store it next to the artifact.
Scan images on every push; gate on critical CVEs.
Sign images (cosign), enable a policy controller (in K8s - Kyverno/Conftest/Gatekeeper).
Separate build and run accounts/networks; cache dependencies in the private registry.
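A CI gate on critical CVEs can be as small as the sketch below. It assumes a report shaped like Trivy's JSON output (`Results[].Vulnerabilities[]` with `VulnerabilityID`, `PkgName`, and `Severity`); verify the field names against the scanner version you actually run. The sample report and its IDs are hand-written placeholders.

```python
import json
from typing import Iterable, List

BLOCKING_SEVERITIES = {"CRITICAL"}


def blocking_findings(report: dict,
                      blocking: Iterable[str] = BLOCKING_SEVERITIES) -> List[str]:
    """List findings whose severity should fail the pipeline."""
    blocking = set(blocking)
    found = []
    # `or []` covers scanners that emit null instead of an empty list.
    for result in report.get("Results") or []:
        for vuln in result.get("Vulnerabilities") or []:
            if vuln.get("Severity") in blocking:
                found.append(f'{vuln.get("VulnerabilityID")} ({vuln.get("PkgName")})')
    return found


# Hand-written sample; the IDs are placeholders, not real CVEs.
sample_report = json.loads("""
{
  "Results": [
    {"Target": "app", "Vulnerabilities": [
      {"VulnerabilityID": "CVE-0000-0001", "PkgName": "libexample", "Severity": "CRITICAL"},
      {"VulnerabilityID": "CVE-0000-0002", "PkgName": "libother", "Severity": "LOW"}
    ]},
    {"Target": "os-packages", "Vulnerabilities": null}
  ]
}
""")

findings = blocking_findings(sample_report)
print(findings)
# In CI, a non-empty list would fail the job: raise SystemExit(1 if findings else 0)
```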
15) Anti-patterns
`:latest` in prod; no immutable tags.
Building on the production host without isolation; storing secrets in the Dockerfile.
Running as root, `--privileged`, broad capabilities.
Bloated images (> 1-2 GB), no `.dockerignore`.
Init logic in `ENTRYPOINT` via the shell form → signal-handling problems.
Writing persistent data to the container layer instead of a volume.
A healthcheck that makes expensive requests to the prod DB.
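Several of these anti-patterns can be caught mechanically before review. The following is a deliberately small heuristic linter, not a replacement for a real tool: it ignores line continuations and build arguments, and it cannot know whether a base image is already non-root. The function name `lint_dockerfile` and the sample Dockerfiles are illustrative.

```python
from typing import List


def lint_dockerfile(text: str) -> List[str]:
    """Flag unpinned bases, shell-form ENTRYPOINT, and a missing non-root USER."""
    problems: List[str] = []
    stage_names = set()
    final_stage_has_user = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)
        instruction = parts[0].upper()
        rest = parts[1].strip() if len(parts) > 1 else ""
        if instruction == "FROM":
            final_stage_has_user = False  # each stage starts from its base image's user
            tokens = [t for t in rest.split() if not t.startswith("--")]
            if not tokens:
                continue
            image = tokens[0]
            is_stage = image.lower() in stage_names
            if not is_stage and image != "scratch" and "@sha256:" not in image:
                # The tag is whatever follows ":" in the last path component.
                tag = image.rsplit("/", 1)[-1].partition(":")[2]
                if tag in ("", "latest"):
                    problems.append(f"unpinned base image: {image}")
            if len(tokens) >= 3 and tokens[1].upper() == "AS":
                stage_names.add(tokens[2].lower())
        elif instruction == "USER":
            final_stage_has_user = rest.split(":")[0] not in ("", "root", "0")
        elif instruction == "ENTRYPOINT" and not rest.startswith("["):
            problems.append("ENTRYPOINT uses shell form")
    if not final_stage_has_user:
        problems.append("final stage has no non-root USER")
    return problems


BAD = """\
FROM node:latest
COPY . .
ENTRYPOINT node server.js
"""

GOOD = """\
FROM golang:1.23 AS build
RUN go build -o /app
FROM gcr.io/distroless/static:nonroot
USER 65532:65532
COPY --from=build /app /app
ENTRYPOINT ["/app"]
"""

print(lint_dockerfile(BAD))
print(lint_dockerfile(GOOD))
```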
16) Implementation checklist (0-45 days)
0-10 days
Standardize Dockerfiles (multi-stage, `.dockerignore`, LABEL, pinned base).
Enable BuildKit/buildx and cache mounts for package managers.
Switch to non-root and the default seccomp/AppArmor/SELinux profiles.
11-25 days
Minimize runtime images (alpine/distroless), clean up logging (rotation).
Set up resource limits, healthchecks, correct PID 1/tini.
Stand up a private registry/cache, connect the CVE scanner and SBOM generation.
26-45 days
Introduce image signing and a cluster admission policy.
Organize multi-arch (amd64/arm64) for the services that need it.
Document the build/release runbook; produce a report on image size, vulnerabilities, and build time.
17) Maturity metrics
Immutable tags and reproducible builds for ≥ 95% of services.
The average runtime image size is < 200-300 MB (depending on the stack).
100% of prod containers are non-root, with limited capabilities and read-only FS.
SBOM and CVE scan on every push; images with critical CVEs are blocked.
Image signing and policy-enforcement in environments.
Container cold start time ≤ target SLO (e.g. 2-5 seconds), correct graceful shutdown.
18) Conclusion
Mature containerization is OCI standards + build discipline + security by default + observability and delivery policy. Use multi-stage builds and BuildKit, minimize runtime images, run non-root under strict profiles, pin tags, scan and sign, and keep logs/resources/network under control. That way containers become a predictable and manageable foundation of your platform, from development to production.