Manage configurations and secrets

1) Why you need it

Configurations and secrets are the "blood" of a production platform. A config error shows up directly in p95 latency; a leaked secret is a P1 incident. The goal is to make every config and secret:

Predictable (schemas, validation, versions).
Secure (encryption, minimum rights, rotation).
Managed (GitOps, audit, rollbacks).
Dynamic where it is justified (feature flags, parameterization of limits).

2) Classification of artifacts

Public configs: features, thresholds, timeouts, feature flags (no secrets).
Sensitive configs: parameters that change the behavior of critical paths (for example, payment limits).
Secrets: passwords/keys/tokens/certificates/key material.
Trust artifacts: root/intermediate certificates, PKI policies, KMS keys.

Principle: each class gets separate storage and separate access rights (public ≠ sensitive ≠ secrets).

3) Configuration hierarchy

Build a "pyramid" of layers:

1. Global defaults (org-wide).

2. Environment (`prod/stage/dev`).

3. Region (`eu-central-1`, `us-east-1`).

4. Tenant/Brand (for multi-tenant setups).

5. Service (specific microservice).

6. Override (runtime) - temporary switches.

Merge rules: the lower (more specific) layer wins; conflicts are resolved only through MR/approval.

Example (YAML)

```yaml
defaults:
  http:
    timeout_ms: 800
    retry: 2
prod:
  http:
    timeout_ms: 1200
service: payments-api
overrides:
  eu-central-1:
    http:
      timeout_ms: 1500
```
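The "lower layer wins" rule can be sketched as a recursive merge applied from the most general layer to the most specific one. This is a minimal illustration, not a reference implementation: the function names `deep_merge` and `resolve` are hypothetical, and the layer contents simply mirror the YAML example.

```python
from copy import deepcopy
from typing import Any


def deep_merge(base: dict[str, Any], override: dict[str, Any]) -> dict[str, Any]:
    """Return a new dict where values from `override` win over `base`."""
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are mappings: merge them key by key.
            merged[key] = deep_merge(merged[key], value)
        else:
            # Scalars and lists are replaced wholesale by the lower layer.
            merged[key] = deepcopy(value)
    return merged


def resolve(layers: list[dict[str, Any]]) -> dict[str, Any]:
    """Merge layers ordered from most general to most specific."""
    result: dict[str, Any] = {}
    for layer in layers:
        result = deep_merge(result, layer)
    return result


# Layers mirror the YAML example: defaults -> environment -> region override.
defaults = {"http": {"timeout_ms": 800, "retry": 2}}
prod = {"http": {"timeout_ms": 1200}}
eu_central_1 = {"http": {"timeout_ms": 1500}}

effective = resolve([defaults, prod, eu_central_1])
print(effective)  # {'http': {'timeout_ms': 1500, 'retry': 2}}
```

Note that `retry` is inherited from the defaults because no lower layer redefines it, while `timeout_ms` takes the regional value.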

4) Schemas and validation

Each config is a contract with a schema (JSON Schema/OPA/validators in CI).

Types, ranges, required fields, default values.
"Guard rules" (for example, reject `retry > 5` or `p95_target < 50ms`).
Automatic checks in CI and at apply time (admission webhook/KRM).

JSON Schema Fragment

```json
{
  "type": "object",
  "properties": {
    "http": {
      "type": "object",
      "properties": {
        "timeout_ms": {"type": "integer", "minimum": 100, "maximum": 10000},
        "retry": {"type": "integer", "minimum": 0, "maximum": 5}
      },
      "required": ["timeout_ms"]
    },
    "feature_flags": {"type": "object", "additionalProperties": {"type": "boolean"}}
  },
  "required": ["http"]
}
```
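To show what such a contract enforces, here is a hand-rolled sketch of the `http` part of the schema using only the standard library. It is an illustration under assumptions: `HTTP_BOUNDS` and `validate_http` are hypothetical names, and a real pipeline would more likely run a full JSON Schema validator in CI rather than code like this.

```python
from typing import Any

# Bounds copied from the JSON Schema fragment; the table layout is hypothetical.
HTTP_BOUNDS = {"timeout_ms": (100, 10000), "retry": (0, 5)}


def validate_http(config: dict[str, Any]) -> list[str]:
    """Return a list of human-readable violations (empty means valid)."""
    http = config.get("http")
    if not isinstance(http, dict):
        return ["'http' is required and must be an object"]
    errors = []
    if "timeout_ms" not in http:
        errors.append("'http.timeout_ms' is required")
    for field, (low, high) in HTTP_BOUNDS.items():
        if field not in http:
            continue
        value = http[field]
        # bool is a subclass of int in Python, so reject it explicitly.
        if not isinstance(value, int) or isinstance(value, bool):
            errors.append(f"'http.{field}' must be an integer")
        elif not low <= value <= high:
            errors.append(f"'http.{field}'={value} is outside [{low}, {high}]")
    return errors


print(validate_http({"http": {"timeout_ms": 1200, "retry": 7}}))
# ["'http.retry'=7 is outside [0, 5]"]
```

Returning a list of violations instead of raising on the first one lets a CI job report every problem in a single run.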

5) Config Delivery Models

Static (image-baked): reliable, but requires restarts.
Push/Watch: agents/sidecars receive updates (stream/poll) and signal the application.
Pull on startup: the service fetches a snapshot at startup (keeps the hot path simple).
Edge cache/proxy for geo-distributed loads.

The main thing: atomicity and versioning of snapshots, compatibility control and fast rollback.
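The atomicity, versioning, and rollback requirements can be illustrated with a small in-process holder. This is only a sketch: `SnapshotStore` and its methods are hypothetical names, and the version strings are made up for the example.

```python
import threading
from types import MappingProxyType
from typing import Any, Mapping


class SnapshotStore:
    """Holds versioned config snapshots and swaps them as a whole."""

    def __init__(self, version: str, data: Mapping[str, Any]) -> None:
        self._lock = threading.Lock()
        self._history: list[tuple[str, Mapping[str, Any]]] = []
        # MappingProxyType makes the top level read-only (nested dicts are not).
        self._current = (version, MappingProxyType(dict(data)))

    def current(self) -> tuple[str, Mapping[str, Any]]:
        # One attribute read: callers get either the old or the new snapshot,
        # never a mix of the two.
        return self._current

    def apply(self, version: str, data: Mapping[str, Any]) -> None:
        """Publish a new snapshot and remember the previous one."""
        with self._lock:
            self._history.append(self._current)
            self._current = (version, MappingProxyType(dict(data)))

    def rollback(self) -> str:
        """Restore the previous snapshot and return its version."""
        with self._lock:
            if not self._history:
                raise RuntimeError("no earlier snapshot to roll back to")
            self._current = self._history.pop()
            return self._current[0]


store = SnapshotStore("1.11.0", {"timeout_ms": 800})
store.apply("1.12.0", {"timeout_ms": 1200})
print(store.current()[0])  # 1.12.0
print(store.rollback())    # 1.11.0
```

The design choice here is that a snapshot is never edited in place: an update replaces the whole `(version, data)` pair, so a rollback is just another swap.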

6) Tools and roles

Config stores: Git (source of truth) + GitOps (Argo/Flux), Parameter Store/Config Service.
Secret stores: Vault, AWS Secrets Manager/SSM, GCP Secret Manager, Azure Key Vault.
Encryption: KMS/HSM, SOPS (age/GPG/KMS), Sealed Secrets, Transit encryption (Vault).
Delivery: CSI Secrets Store, Vault Agent Injector/Sidecar, init-containers.
Flags/dynamic config: feature flag platform (incl. emergency kill-switch).

7) Encryption: Models and Practices

At rest: per-project/per-environment KMS keys, envelope encryption.
In transit: TLS, or mTLS where mutual authentication is required.
In use: decrypt as late as possible, preferably in process memory or a sidecar (without writing to disk).
Key hierarchy: root (HSM) → KMS CMK → data keys (DEK).
Rotation: scheduled (90/180 days) plus event-driven (compromise, employee departure).

8) Secret Management: Patterns

8.1 GitOps + SOPS (static snapshot)

Git stores only ciphertext.
Decryption happens in CI/CD or in the cluster (KMS/age).
Applied via a controller (Flux/Argo) → Kubernetes Secret.

```yaml
apiVersion: v1
kind: Secret
metadata: { name: psp-keys, namespace: payments }
type: Opaque
data:
  apiKey: ENC[AES256_GCM,data:...,sops]
```

8.2 Vault Agent Injector

The service account (JWT/SA) authenticates to Vault.
The sidecar writes credentials to tmpfs and refreshes them when the TTL expires.
Supports dynamic credentials (DB, cloud): isolated and short-lived.

```yaml
annotations:
  vault.hashicorp.com/agent-inject: "true"
  vault.hashicorp.com/role: "payments-api"
  vault.hashicorp.com/agent-inject-secret-db: "database/creds/payments"
```

8.3 CSI Secrets Store

Mount the secret as a volume; rotation is transparent to the application.
For PKI: automatic renewal of certificates/keys.

9) Kubernetes: practicalities

ConfigMap: public/non-sensitive data only.
Secret: sensitive data (base64 is encoding, not encryption; enable Encryption at Rest for etcd).
Checksum annotations: roll the Deployment when the config changes.
Admission control: forbid mounting secrets that are not on the allowlist; forbid plaintext passwords in manifests.
NetworkPolicy: restrict access to secret providers (Vault/CSI).

Checksum example (Helm)

```yaml
annotations:
  checksum/config: {{ include (print $.Template.BasePath "/configmap.yaml") . | sha256sum }}
```
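The reason the checksum annotation triggers a rollout is that any change in the rendered config yields a different digest, which changes the pod template. A minimal sketch of that idea outside Helm (the function name `config_checksum` and the sample config strings are hypothetical):

```python
import hashlib


def config_checksum(rendered_config: str) -> str:
    """SHA-256 hex digest of the rendered config text."""
    return hashlib.sha256(rendered_config.encode("utf-8")).hexdigest()


old = config_checksum("timeout_ms: 1200\n")
new = config_checksum("timeout_ms: 1500\n")

# Different content gives a different annotation value, so the pod template
# changes and the Deployment performs a rolling restart.
print(old != new)  # True
```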

10) Access Policies (RBAC/ABAC)

Least privilege: the service only sees its secrets; access by namespace/label/prefix.
Separation of duties: creating a secret ≠ reading its content; audit every read.
Temporary credentials: dynamic logins (DB, cloud) with TTL and automatic rotation.
Segmentation: prod/stage in different projects/accounts/KMS keys.

11) Audit, logging, observability

Logs of secret reads/issuance: who/when/what/where; correlation with releases and incidents.
Metrics: request rate, expired secrets, expired certificates, share of dynamic credentials.
Security events: quota exceeded, IP/time anomalies, multiple failed authentications.

12) Rotation of secrets and certificates

Standardize lifetimes: API keys - 90 days, DB passwords - 30 days, TLS certs - 60-90 days.
Rotation flow: generate → test → dual publication (grace period) → switch → revoke the old one → verify.
Reliability: dual-write of configs/secrets, client compatibility (accept new + old).
PKI: own CA or integration with an external one; automatically renew mTLS material through CSI/Vault.
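The "accept new + old" step of the rotation flow can be sketched with HMAC signatures: during the grace period the verifier trusts both keys, and after revocation only the new one. The key values, message, and function names below are hypothetical and exist only for the illustration.

```python
import hashlib
import hmac


def sign(key: bytes, message: bytes) -> str:
    """HMAC-SHA256 signature of the message as a hex string."""
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def verify(message: bytes, signature: str, accepted_keys: list[bytes]) -> bool:
    """Accept a signature made with any currently accepted key."""
    return any(
        hmac.compare_digest(sign(key, message), signature)
        for key in accepted_keys
    )


old_key, new_key = b"old-demo-key", b"new-demo-key"
message = b"withdraw:42"

# Grace period: both keys are published, so old and new clients keep working.
grace_keys = [new_key, old_key]
print(verify(message, sign(old_key, message), grace_keys))  # True
print(verify(message, sign(new_key, message), grace_keys))  # True

# After revocation only the new key remains; old signatures are rejected.
print(verify(message, sign(old_key, message), [new_key]))   # False
```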

13) Dynamic configs and feature flags

Serve "hot" parameters (limits, timeouts) from the config service/flag platform.
Local cache and stickiness (the variant is computed from a hash), short TTL.
SLO guards on changes to sensitive parameters (auto-rollback and kill-switch).
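Stickiness by hash means the same subject always lands in the same bucket without any stored state. One possible sketch, assuming a percentage rollout over 100 buckets; `in_rollout`, the flag name, and the user IDs are hypothetical.

```python
import hashlib


def in_rollout(flag: str, subject_id: str, percent: int) -> bool:
    """Deterministically place a subject into one of 100 buckets per flag."""
    digest = hashlib.sha256(f"{flag}:{subject_id}".encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100
    # Buckets 0..percent-1 are enabled, so raising `percent` only adds subjects.
    return bucket < percent


# The same user gets the same answer on every call and on every instance.
first = in_rollout("new_withdrawal_flow", "user-1001", 25)
second = in_rollout("new_withdrawal_flow", "user-1001", 25)
print(first == second)  # True
```

Including the flag name in the hashed string keeps bucket assignments independent between flags, so the same users are not always the first to receive every experiment.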

14) Integration with CI/CD and GitOps

Pre-commit/CI: schema linters, SOPS checks, a ban on plaintext secrets (scanners: gitleaks/trufflehog).
Policy Gate: OPA/Conftest - disallow configs without a schema, without owner annotations, or without environment labels.
Progressive delivery: promotion of configs as artifacts (semver), canary for parameter changes.
Release annotations: who changed which config/secret; fast correlation with p95/5xx.
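A SOPS check in CI can be as simple as refusing any Secret manifest whose values are not ciphertext. The sketch below assumes manifests are already parsed into dictionaries and that encrypted values carry the `ENC[` prefix shown in the SOPS example; `plaintext_secret_keys` is a hypothetical helper and does not replace scanners such as gitleaks or trufflehog.

```python
from typing import Any


def plaintext_secret_keys(manifest: dict[str, Any]) -> list[str]:
    """Return keys under data/stringData whose values are not SOPS ciphertext."""
    if manifest.get("kind") != "Secret":
        return []
    offenders = []
    for section in ("data", "stringData"):
        for key, value in (manifest.get(section) or {}).items():
            if not (isinstance(value, str) and value.startswith("ENC[")):
                offenders.append(f"{section}.{key}")
    return offenders


manifest = {
    "kind": "Secret",
    "data": {"apiKey": "ENC[AES256_GCM,data:...,sops]", "dbPassword": "aHVudGVyMg=="},
}
print(plaintext_secret_keys(manifest))  # ['data.dbPassword']
```

A CI step could fail the build whenever the returned list is non-empty.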

15) Examples

15.1 OPA policy: banning open security groups in config

```rego
package policy.config

# Deny any security group rule that exposes Postgres (5432) to the whole internet.
deny[msg] {
  input.kind == "SecurityGroupRule"
  input.cidr == "0.0.0.0/0"
  input.port == 5432
  msg := "Postgres must not be open to the internet"
}
```

15.2 Example of a config snapshot (versioned)

```yaml
version: 1.12.0
owner: payments-team
appliesTo: ["payments-api@prod"]
http:
  timeout_ms: 1200
  retry: 2
withdraw:
  limits:
    per_txn_eur: 5000
    per_day_eur: 20000
flags:
  new_withdrawal_flow: false
```

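15.3 Vault: dynamic database credentials

```hcl
# Policy: the service may only read (that is, request) credentials for its own role.
path "database/creds/payments" {
  capabilities = ["read"]
}
# The role issues a user/password pair with TTL = 1h and automatic rotation.
```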

16) Anti-patterns

Secrets in Git in clear text, or in Helm/Ansible variables without encryption.
A single "mega-secret" for all services/environments.
Long-lived tokens without TTL/rotation; "immortal" certificates.
Dynamic configs without schemas/validation and without a change audit.
No Encryption at Rest for etcd (KMS) and a network without mTLS.
Manual editing of configs in production (bypassing GitOps).
Developer access to production secrets "just in case."

17) Implementation checklist (0-60 days)

0-15 days

Introduce schemas/validators for configs; create a "configs" repo and a GitOps pipeline.
Set up KMS and encryption: SOPS/Sealed Secrets/Encryption at Rest in etcd.
Ban plaintext secrets in CI (scanners); introduce owners/approvals.

16-30 days

Separate the stores: public configs vs sensitive configs vs secrets.
Implement Vault/Secrets Manager; choose the delivery path (Agent/CSI/SOPS).
Set up rotation of TLS/DB/PSP credentials; add dashboards for secret age and upcoming expiry.

31-60 days

Dynamic configs and flags with SLO-gating and auto-rollback.
OPA/Conftest policies; zero-trust (namespace/label-scoped access).
Game day: simulate a secret leak and a forced rotation.

18) Maturity metrics

Share of secrets that are encrypted and not directly readable from Git = 100%.
Config coverage by schemas/validation ≥ 95%.
Average time to rotate critical secrets (target: hours, not days).
Share of dynamic credentials (DB/cloud) ≥ 80%.
0 incidents caused by plaintext secrets or expired certificates.
MTTR for a config error, with rollback, < 5 minutes.

19) Team roles and processes

Config Owner: owns the domain, schema, and policies.
Security: policies, key hierarchy, access audit.
Platform/SRE: GitOps, delivery/injection, telemetry.
App Teams: config/secret consumption, compatibility tests.

20) Conclusion

A reliable setup for configurations and secrets is schemas + GitOps + encryption + rotation + policies. Separate public data from secrets, encrypt everything, apply configs atomically and with versioning, minimize the rights and lifetime of credentials, and automate rotation and audit. Changes then become fast and safe, and the risk of leaks and outages stays minimal.
