Vulnerability scanning and patches
Brief summary
Vulnerability management is a continuous cycle: detection → risk assessment → remediation (patch/migration/config) → verification → reporting. Scanning technologies (SCA/SAST/DAST/IAST/Cloud/Container) provide signals, and context (exposure, privileges, data, EPSS, exploits) determines the priority. The goal is to reduce real risk without business downtime, using automation, canary rollouts and clear SLOs.
Scanning taxonomy
SCA (Software Composition Analysis): dependency/license analysis; CVE discovery in libraries, SBOM.
SAST: static analysis of first-party source code before the build.
DAST: dynamic black-box testing against a running service.
IAST: in-app sensors (during tests); fewer false positives, deeper context.
Container/OS Scan: images (base image, packages), hosts (kernel/packages/configs), CIS benchmarks.
Cloud/Infra (CSPM/KSPM): cloud/K8s misconfigs (IAM, networks, encryption, public buckets).
Secrets Scan: key/token leaks in repositories and images.
Binary/Artifact Scan: verification of built artifacts (signatures, vulnerabilities).
Risk model and prioritization
Score = CVSS v3.x (base) × EPSS (probability of exploitation) × context (exposure, data, privileges, compensating controls).
Context factors:
- Exposure: Internet-facing or internal; presence of WAF/mTLS/isolation.
- Data: PII/finance/secrets.
- Process/node privileges, lateral movement potential.
- Availability of a public exploit or mass exploitation; compliance requirements.
Example CVSS vector: 'CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H' → critical (base score 9.8); if the service is public and has no compensating controls, treat it as P1.
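To show where the severity of the example vector comes from, here is a minimal sketch of the CVSS v3.1 base-score calculation. The weights and the rounding rule follow the published CVSS v3.1 specification; the function names (`base_score`, `roundup`) are illustrative, and temporal/environmental metrics are not handled.

```python
import math

# Metric weights from the CVSS v3.1 specification.
AV = {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2}
AC = {"L": 0.77, "H": 0.44}
UI = {"N": 0.85, "R": 0.62}
CIA = {"H": 0.56, "L": 0.22, "N": 0.0}
# Privileges Required is weighted differently when Scope is changed.
PR = {
    "U": {"N": 0.85, "L": 0.62, "H": 0.27},
    "C": {"N": 0.85, "L": 0.68, "H": 0.5},
}


def roundup(value: float) -> float:
    """Round up to one decimal place using the integer-based rule from the spec."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector: str) -> float:
    """Compute the base score for a 'CVSS:3.x/...' vector string."""
    parts = vector.split("/")
    if not parts[0].startswith("CVSS:3"):
        raise ValueError("expected a CVSS v3.x vector")
    m = dict(p.split(":") for p in parts[1:])
    scope = m["S"]
    iss = 1 - (1 - CIA[m["C"]]) * (1 - CIA[m["I"]]) * (1 - CIA[m["A"]])
    if scope == "U":
        impact = 6.42 * iss
    else:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15
    exploitability = 8.22 * AV[m["AV"]] * AC[m["AC"]] * PR[scope][m["PR"]] * UI[m["UI"]]
    if impact <= 0:
        return 0.0
    total = impact + exploitability
    if scope == "C":
        total *= 1.08
    return roundup(min(total, 10))
```

Calling `base_score("CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H")` yields 9.8, which falls in the critical band (9.0 to 10.0).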
SLO thresholds (example):
- P1 (critical, actively exploited): fix ≤ 48 h.
- P2 (high): fix ≤ 7 days.
- P3 (medium): fix ≤ 30 days.
- P4 (low/informational): scheduled via the backlog.
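The scoring formula and the SLO thresholds can be combined into a small prioritization helper. This is a sketch under stated assumptions: the context multipliers and the score cut-offs for P1 to P4 are hypothetical values chosen for illustration, and only the SLO windows and the "critical + public + no compensating controls → P1" rule come from the text.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

# SLO windows from the example thresholds; P4 has no fixed deadline.
SLO = {
    "P1": timedelta(hours=48),
    "P2": timedelta(days=7),
    "P3": timedelta(days=30),
    "P4": None,
}


@dataclass
class Finding:
    cvss: float             # CVSS v3.x base score, 0.0-10.0
    epss: float             # EPSS probability, 0.0-1.0
    internet_exposed: bool
    sensitive_data: bool    # PII / finance / secrets
    has_exploit: bool       # public exploit or mass exploitation
    compensated: bool       # WAF / mTLS / isolation in place


def context_factor(f: Finding) -> float:
    """Hypothetical multipliers; tune them to your own environment."""
    factor = 1.0
    if f.internet_exposed:
        factor *= 1.5
    if f.sensitive_data:
        factor *= 1.3
    if f.has_exploit:
        factor *= 1.5
    if f.compensated:
        factor *= 0.5
    return factor


def risk_score(f: Finding) -> float:
    """Score = CVSS x EPSS x context."""
    return f.cvss * f.epss * context_factor(f)


def priority(f: Finding) -> str:
    # Rule from the text: critical, public, no compensating controls -> P1.
    if f.cvss >= 9.0 and f.internet_exposed and not f.compensated:
        return "P1"
    score = risk_score(f)
    if score >= 5.0:    # assumed cut-offs, not from the source
        return "P1"
    if score >= 2.0:
        return "P2"
    if score >= 0.5:
        return "P3"
    return "P4"


def due_date(f: Finding, detected_at: datetime) -> Optional[datetime]:
    """Deadline implied by the SLO window, or None for backlog items."""
    window = SLO[priority(f)]
    return detected_at + window if window is not None else None
```

Keeping the explicit P1 rule separate from the numeric score avoids a low EPSS value hiding an exposed critical finding.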
Vulnerability Management Lifecycle
1. Asset inventory: services, images, clusters, OS, packages, dependencies, versions.
2. Scheduled and event-driven scanning: commits, builds, deploys, daily/weekly windows.
3. Triage: deduplication, normalization (CVE→Ticket), mapping to owner.
4. Prioritization by context: CVSS/EPSS + exposure/data.
5. Remediation: patch/dependency update/config hardening/virtual patch (WAF).
6. Verification: rescanning, tests, canary.
7. Reporting: closure metrics, age of vulnerabilities, SLO compliance.
8. Lessons learned: bake the fix into templates (base image, Helm chart) and update policy for the future.
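The triage step (deduplication, normalization, owner mapping) can be sketched as follows. The finding fields (`asset`, `cve`, `scanner`) and the `owners` lookup are hypothetical; real scanner output will need its own normalization.

```python
def triage(raw_findings, owners):
    """Collapse raw scanner findings into one ticket per (asset, CVE).

    raw_findings: iterable of dicts with 'asset', 'cve' and 'scanner' keys.
    owners: dict mapping asset name -> owning team.
    """
    tickets = {}
    for finding in raw_findings:
        # Normalize the CVE id so 'cve-2024-0001' and 'CVE-2024-0001' merge.
        key = (finding["asset"], finding["cve"].strip().upper())
        ticket = tickets.setdefault(key, {
            "asset": key[0],
            "cve": key[1],
            "owner": owners.get(key[0], "unassigned"),
            "scanners": set(),
        })
        ticket["scanners"].add(finding["scanner"])
    return list(tickets.values())
```

Findings that land on "unassigned" are a useful signal that the asset inventory from step 1 is incomplete.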
Integration into CI/CD
At the PR stage: SAST + SCA + secret scan; break the build on P1/P2 findings or as the application requires.
At the build stage: image scan, SBOM generation (CycloneDX/SPDX), artifact signing (cosign).
At the deploy stage: admission policy rejects images with critical/high vulnerabilities, without a signature, or without an SBOM.
Post-deploy: DAST/IAST against staging and, partially, production (safe profiles).
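A "break the build" gate at the PR stage can be as small as the sketch below. The report layout (a `findings` list whose items have `priority`, `id` and `component`) is a hypothetical normalized format, not the output of any particular scanner.

```python
import sys

# Priorities that fail the pipeline; adjust per application policy.
BLOCKING = {"P1", "P2"}


def gate(report: dict) -> int:
    """Return a process exit code: 1 if any blocking finding exists, else 0."""
    blockers = [
        f for f in report.get("findings", [])
        if f.get("priority") in BLOCKING
    ]
    for f in blockers:
        print(f"BLOCKING {f['priority']}: {f['id']} in {f['component']}",
              file=sys.stderr)
    return 1 if blockers else 0
```

In a pipeline step the result would typically be passed to `sys.exit()` after loading the report with `json.load`, so a non-zero code stops the merge.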
Example: Renovate/Dependabot (fragment)
json
{
  "extends": ["config:recommended"],
  "vulnerabilityAlerts": { "enabled": true },
  "packageRules": [
    { "matchUpdateTypes": ["minor", "patch"], "automerge": true },
    { "matchManagers": ["dockerfile"], "enabled": true }
  ]
}
Admission policy (Kubernetes, OPA/Gatekeeper - simplified)
rego
package policy.vuln

# Reject images that have at least one critical vulnerability.
deny[msg] {
    input.image.vuln.critical > 0
    msg := sprintf("Image %s has critical vulns", [input.image.name])
}

# Reject images that ship without an SBOM.
deny[msg] {
    input.image.sbom == false
    msg := sprintf("Image %s without SBOM", [input.image.name])
}
Patch management (OS, containers, K8s)
OS (Linux/Windows)
Patch windows: regular windows plus emergency out-of-band windows for P1.
Strategy: canary on 5-10% of nodes first, then in waves.
Automated deployment: Ansible/WSUS/Intune/SSM, with constraint checks and rollbacks.
Kernel live patching (where possible) to minimize downtime.
Service restarts: managed cordon/drain for K8s nodes, graceful shutdown.
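The canary-then-waves strategy can be planned with a few lines of code. This sketch only splits a node list into batches; `canary_percent` and `wave_count` are illustrative parameters, and health checks between waves are left to the deployment tool.

```python
def plan_waves(nodes, canary_percent=10, wave_count=3):
    """Split nodes into a canary batch followed by roughly equal waves."""
    if not nodes:
        return []
    # Integer ceiling avoids float rounding surprises; always at least 1 node.
    canary_size = max(1, -(-len(nodes) * canary_percent // 100))
    canary, rest = nodes[:canary_size], nodes[canary_size:]
    waves = [canary]
    if rest:
        size = -(-len(rest) // wave_count)  # ceil(len(rest) / wave_count)
        waves += [rest[i:i + size] for i in range(0, len(rest), size)]
    return waves
```

For 20 nodes with the defaults this gives a 2-node canary followed by three waves of 6; the rollout should stop and roll back if the canary batch shows regressions.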
Containers
Immutable approach: no "apt upgrade" at runtime; rebuild the image on top of the updated base.
Base images: regularly update golden images (Alpine/Debian/Distroless) and pin versions by digest.
Multi-stage builds: minimize the attack surface (drop build tools from the final image).
Pre-deploy scan: block images with critical CVEs.
Kubernetes/Service Mesh
Control plane: timely minor-release upgrades that close CVEs in K8s/etcd/containerd.
Node OS/container runtime: scheduled updates, version compatibility.
Mesh/Ingress: Envoy/Istio/NGINX versions are critical (CVEs often appear in parsers and HTTP/3 handling).
Admission policies: ban the ':latest' tag, require signatures, enforce vulnerability limits.
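The ':latest' ban, together with digest pinning from the base-image guidance, can be checked on the image reference itself. The sketch below is a plain string check written for illustration; it does not verify signatures or vulnerability counts, and the function name is an assumption.

```python
def image_violations(image: str) -> list:
    """Return policy violations for an image reference such as
    'registry.local:5000/app:1.2.3' or 'app@sha256:<hex>'."""
    problems = []
    name, _, digest = image.partition("@")
    # Look for the tag only in the last path segment, so a registry
    # port ('registry.local:5000/...') is not mistaken for a tag.
    last_segment = name.rsplit("/", 1)[-1]
    tag = last_segment.partition(":")[2]
    if tag == "latest" or (not tag and not digest):
        problems.append("uses ':latest' (explicitly or by default)")
    if not digest.startswith("sha256:"):
        problems.append("not pinned by sha256 digest")
    return problems
```

An untagged reference is treated as ':latest' because that is what container runtimes resolve it to by default.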
Virtual patches and compensating controls
When a patch cannot be applied quickly:
- WAF/WAAP: signature or positive-security model for the specific endpoint.
- Feature flags: disable the vulnerable functionality.
- Network ACL/mTLS/IP allow-list: restrict access to the vulnerable service.
- Config hardening: reduced privileges, sandboxing, read-only FS, disabling dangerous modules.
- Shorter TTLs for tokens/keys, rotation of secrets.
Risk Acceptance
An exception is recorded in a ticket with: justification, compensating controls, remediation SLA, and review date.
Report as "temporary risk acceptance" and include in monthly review.
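A risk-acceptance record and its monthly review can be modelled as a small data structure. The field names below are assumptions that mirror the ticket contents listed above; they do not reflect any specific ticketing system.

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass
class RiskAcceptance:
    ticket: str
    cve: str
    justification: str
    compensating_controls: List[str]
    fix_due: date      # remediation SLA
    review_on: date    # next review date

    def needs_review(self, today: date) -> bool:
        return today >= self.review_on

    def is_overdue(self, today: date) -> bool:
        return today > self.fix_due


def monthly_review(exceptions, today: date):
    """Exceptions that must be re-examined: due for review or past their SLA."""
    return [e for e in exceptions
            if e.needs_review(today) or e.is_overdue(today)]
```

Treating acceptance as time-boxed keeps "temporary" exceptions from silently becoming permanent.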
Observability and metrics
Technical:
- Mean Time To Patch (MTTP) for P1/P2/P3.
- Share of assets covered by scanning (%).
- Age of open vulnerabilities (p50/p90), backlog burn-down.
- Percentage of images with SBOM and signature.
- SLO attainment by closure time (e.g. ≥ 95% of P1 closed within 48 hours).
- Impact on uptime (number of patch-related incidents).
- Repeated detection of the same CVE (fix quality in templates).
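Several of these metrics reduce to simple arithmetic over open/close timestamps. The sketch below assumes findings are available as `(opened_at, closed_at)` pairs and uses the nearest-rank method for percentiles; both choices are assumptions for illustration.

```python
from datetime import datetime, timedelta


def mttp_hours(closed):
    """Mean time to patch, in hours, over (opened_at, closed_at) pairs."""
    durations = [(c - o).total_seconds() / 3600 for o, c in closed]
    return sum(durations) / len(durations)


def slo_compliance(closed, window: timedelta) -> float:
    """Fraction of findings closed within the SLO window."""
    within = sum(1 for o, c in closed if c - o <= window)
    return within / len(closed)


def percentile(values, p: int):
    """Nearest-rank percentile for p in 1..100 (e.g. age p50/p90 in days)."""
    ordered = sorted(values)
    rank = -(-p * len(ordered) // 100)  # ceil(p * n / 100)
    return ordered[max(rank, 1) - 1]
```

Computing these per priority class (P1/P2/P3) rather than globally keeps a large P4 backlog from masking slow P1 fixes.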
Playbooks (abbreviated)
P1: Critical RCE in public service
1. Activate a WAF rule/virtual patch.
2. Block access from untrusted sources (if applicable).
3. Urgent image rebuild/OS patch → canary → waves.
4. Repeat DAST/telemetry checks, monitor errors.
5. Post-incident: bake the fix into the base image/Helm chart, add a test to CI.
Secret leak (keys/tokens)
1. Immediate rotation of secrets/keys, revocation of tokens.
2. Search for traces of use; restrict the affected endpoints.
3. Scan repos/images for secrets, introduce a pre-commit scanner.
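A pre-commit secret scanner is, at its core, a set of patterns applied to each line. The sketch below uses two widely recognized shapes (an AWS access key ID prefix and a PEM private-key header); the pattern set is a small illustrative sample, and dedicated scanners cover far more formats plus entropy checks.

```python
import re

# Illustrative patterns only; production scanners ship many more rules.
PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key_block": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
}


def scan_text(text: str):
    """Return (line_number, pattern_name) for every suspicious line."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                hits.append((lineno, name))
    return hits
```

A hook would run `scan_text` over staged file contents and reject the commit when the result is non-empty.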
Examples of artifacts
1) SQL report on hot vulnerabilities
sql
-- PRIORITY() is assumed to be a user-defined ranking function; it is not built into SQL.
SELECT service, cve, cvss, epss, exposed, has_exploit, created_at,
       PRIORITY(exposed, has_exploit, cvss, epss) AS prio
FROM vuln_findings
WHERE status = 'open' AND (cvss >= 8.0 OR has_exploit = true)
ORDER BY prio DESC, created_at ASC;
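The report relies on a PRIORITY function that SQL does not provide. One way to try the query locally is to register a user-defined function in SQLite, as sketched below. The scoring inside `priority` (CVSS × EPSS with 1.5× multipliers) is a hypothetical choice, the table rows are made-up sample data, and `has_exploit = 1` is used instead of `true` for compatibility with older SQLite versions.

```python
import sqlite3


def priority(exposed, has_exploit, cvss, epss):
    """Hypothetical PRIORITY(): CVSS x EPSS x context; higher = fix sooner."""
    factor = (1.5 if exposed else 1.0) * (1.5 if has_exploit else 1.0)
    return cvss * epss * factor


conn = sqlite3.connect(":memory:")
conn.create_function("PRIORITY", 4, priority)
conn.execute(
    "CREATE TABLE vuln_findings (service TEXT, cve TEXT, cvss REAL, epss REAL,"
    " exposed INTEGER, has_exploit INTEGER, status TEXT, created_at TEXT)"
)
# Made-up sample rows with placeholder CVE ids.
conn.executemany(
    "INSERT INTO vuln_findings VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
    [
        ("payments", "CVE-0000-0001", 9.8, 0.90, 1, 1, "open", "2024-01-03"),
        ("intranet", "CVE-0000-0002", 8.1, 0.05, 0, 0, "open", "2024-01-01"),
        ("blog", "CVE-0000-0003", 5.0, 0.30, 1, 1, "open", "2024-01-02"),
        ("legacy", "CVE-0000-0004", 9.0, 0.50, 1, 0, "closed", "2023-12-01"),
    ],
)
hot = conn.execute(
    "SELECT service, cve, PRIORITY(exposed, has_exploit, cvss, epss) AS prio"
    " FROM vuln_findings"
    " WHERE status = 'open' AND (cvss >= 8.0 OR has_exploit = 1)"
    " ORDER BY prio DESC, created_at ASC"
).fetchall()
```

With this sample data the rows come back in the order payments, blog, intranet; the closed finding is filtered out.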
2) Admission policy (Kyverno, critical vulnerability block)
yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: block-critical-vulns
spec:
  validationFailureAction: Enforce
  rules:
    - name: image-must-have-no-critical
      match: { resources: { kinds: ["Pod"] } }
      validate:
        message: "Image contains critical vulnerabilities"
        pattern:
          metadata:
            annotations:
              vuln.scanner/critical: "0"
3) SBOM generation and signature (Makefile fragment)
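make
.PHONY: sbom sign

# Generate a CycloneDX SBOM (the exact command depends on the generator in use).
sbom:
	cyclonedx create --output sbom.json

# Sign the image by digest with a local cosign key.
sign:
	cosign sign --key cosign.key $(IMAGE_DIGEST)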
Specificity for iGaming/fintech
High-risk areas: payment gateways, payment back office, anti-fraud, PII/PAN processing; patch these at P1/P2 priority.
Service windows: coordinate with tournaments/promotions, pre-warm caches, run canaries in low-traffic regions.
Regulatory (PCI DSS/GDPR): remediation deadlines, evidence (screenshots/reports), CHD zone segmentation, encryption.
Partner integrations: require versioned SDK/clients, mandatory SCA and HMAC/mTLS on webhooks.
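HMAC protection on webhooks means the receiver recomputes a keyed digest of the request body and compares it with the signature the partner sent. A minimal sketch with the standard library follows; the use of SHA-256 and a hex-encoded signature is an assumption, since partners differ in algorithm and encoding.

```python
import hashlib
import hmac


def sign(secret: bytes, body: bytes) -> str:
    """Hex-encoded HMAC-SHA256 of the raw request body."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def verify(secret: bytes, body: bytes, signature_hex: str) -> bool:
    """Constant-time comparison of the expected and received signatures."""
    expected = sign(secret, body)
    # Compare as bytes so non-ASCII input cannot raise inside compare_digest.
    return hmac.compare_digest(expected.encode(), signature_hex.encode())
```

The signature must be computed over the raw bytes as received; re-serializing parsed JSON before verification usually changes the payload and breaks the check.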
Common mistakes
"Scan everything - fix nothing": no owners and SLO.
Focus only on CVSS without context (exposure, EPSS, data).
Patching running containers instead of rebuilding the image.
Lack of canaries/rollback plans.
Ignoring cloud/K8s misconfigs (often more critical than CVEs).
No SBOM/signatures: the supply chain is left unverified.
Implementation Roadmap
1. Inventory of assets and owners; unified registry of services/images.
2. Scanner stack: SCA/SAST/DAST/Container/Cloud + secret-scan; integration into CI/CD.
3. SLO and prioritization policies: CVSS + EPSS + context; ticket templates.
4. Admission/Policy-as-Code: prohibition of critical vulnerabilities, SBOM/signatures requirement.
5. Patch processes: windows, canaries, rollbacks; automerge for minor/patch versions.
6. Reporting and metrics: MTTP, coverage, age; weekly risk review.
7. Regular exercises: simulation of critical CVE, verification of playbooks and rollback.
Result
Mature vulnerability management is a process, not a one-time "cleanup": automated detection, context-based prioritization, low-disruption patching through canaries and rollbacks, policy-as-code at the gate to production, and transparent delivery metrics. By baking fixes into base images and templates, you reduce the risk of recurrence and keep the attack surface under steady control.