Release Approval Process
1) Purpose and area of responsibility
The release approval process ensures predictable and secure platform changes without harming SLOs, revenue, or compliance. It covers the whole path: from pull request to full promotion to production and post-release monitoring.
2) Principles
1. SLO-first: a release is allowed only when SLIs are green and the error budget is not burning at an elevated rate.
2. Small batches and reversibility: canary/progressive delivery, fast rollback.
3. Policy-as-Code: gates, SoD, freeze windows and risk classes are automatically verified.
4. A single source of truth: artifacts/configs/flags live in Git; the environment state is set by the GitOps reconciler.
5. Audit and provability: WORM logs, decision trail, clear owners.
6. Security by default: secrets stored separately, minimal privileges, geo-gates.
7. Communications without surprises: prepared templates for internal/external updates.
3) Roles and RACI
Release Manager (RM) - pipeline owner, calendar, gates. A/R
Service Owner (SO) - domain owner, accepts risk, prepares artifacts. A/R
SRE/Platform - SLO gates, rollouts, auto rollbacks. R
QA Lead - test strategy, test results. R
Security/Compliance - scans, SoD, regulatory requirements. C/A
CAB (Change Advisory Board) - decisions on Normal-class changes. A
On-call IC/CL - incident and communications readiness. R/C
Stakeholders (Biz/Support/Partners) - informing. I
4) Change classes and approval paths
Class escalation - when a change crosses risk boundaries (payments, RG, PII, limits).
5) Release pipeline and gates (end-to-end flow)
Stage 0. Scheduling and Calendar
Freeze windows (holidays/matches), on-call and CL slots, readiness of status templates.
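A freeze-window check can be sketched as below. This is a minimal illustration, not a definitive implementation: the window dates, the `FREEZE_WINDOWS` list, and the function names are hypothetical, and the emergency bypass mirrors the freeze rule from the policy-engine section.

```python
from datetime import datetime, timezone

# Hypothetical freeze calendar: (start, end, reason), end exclusive, all in UTC.
FREEZE_WINDOWS = [
    (datetime(2025, 12, 24, tzinfo=timezone.utc),
     datetime(2025, 12, 27, tzinfo=timezone.utc), "holidays"),
    (datetime(2025, 11, 1, 18, tzinfo=timezone.utc),
     datetime(2025, 11, 1, 23, tzinfo=timezone.utc), "match day"),
]


def active_freeze(now, windows=FREEZE_WINDOWS):
    """Return the reason of the freeze window covering `now`, or None."""
    for start, end, reason in windows:
        if start <= now < end:
            return reason
    return None


def may_deploy(now, emergency=False):
    """Emergency releases bypass the freeze; everything else waits."""
    return emergency or active_freeze(now) is None
```

Keeping the calendar as data means the same list can feed both the scheduling view and the automated gate.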
Stage 1. PR → Build
Linters/licenses, SBOM, unit/contract tests, secret scan.
Stage 2. Integration/Security
E2E (virtualized PSP/KYC providers), SAST/DAST, dependency review.
Stage 3. Staging/Rehearsal
Parity with production, reversible migrations, feature flags at 5-25%, "release drill" checklist.
Gate A - Quality and safety (required)
+ all tests/scans green
+ schemas/configs are valid, no "red" SLIs on staging
+ SoD/4-eyes for high-risk changes
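The three Gate A conditions can be evaluated mechanically. The sketch below is one possible shape, assuming the pipeline already collects check results; the `GateAInput` fields and the reason strings are illustrative names, not part of any real tool.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class GateAInput:
    checks: dict[str, bool]                 # test/scan name -> passed
    configs_valid: bool                     # schema/config validation result
    staging_red_slis: list[str] = field(default_factory=list)
    high_risk: bool = False
    author: str = ""
    approvers: tuple[str, ...] = ()


def evaluate_gate_a(g: GateAInput) -> list[str]:
    """Return blocking reasons; an empty list means the gate passes."""
    reasons = [f"check failed: {name}"
               for name, ok in sorted(g.checks.items()) if not ok]
    if not g.configs_valid:
        reasons.append("schema/config validation failed")
    reasons += [f"red SLI on staging: {sli}" for sli in g.staging_red_slis]
    if g.high_risk:
        # 4-eyes: at least one approver who is not the author.
        independent = {a for a in g.approvers if a != g.author}
        if not independent:
            reasons.append("SoD: high-risk change needs an approver other than the author")
    return reasons
```

Returning reasons instead of a bare boolean gives the decision trail something concrete to record.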
Stage 4. Pre-production (canary delivery)
1-5% traffic by segment (tenant/geo/bank), runtime validators, guardrails.
Gate B - SLO/Business Gate
+ no SLO/KRI degradation (latency/errors/payments)
+ no SRM/anomalies in experiment metrics
+ Comms ready: draft status/partners
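The SRM (sample ratio mismatch) condition in Gate B can be checked with a chi-square goodness-of-fit test on the canary split. The sketch below assumes a two-arm split and uses only the standard library; the function names and the `alpha=0.001` cut-off are assumptions for illustration, not values from this process.

```python
import math


def srm_p_value(control: int, treatment: int, expected_treatment_share: float) -> float:
    """Chi-square goodness-of-fit p-value (1 degree of freedom) for a two-arm split."""
    total = control + treatment
    exp_t = total * expected_treatment_share
    exp_c = total - exp_t
    chi2 = (treatment - exp_t) ** 2 / exp_t + (control - exp_c) ** 2 / exp_c
    # With 1 degree of freedom the chi-square survival function is erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(chi2 / 2))


def has_srm(control: int, treatment: int, expected_treatment_share: float,
            alpha: float = 0.001) -> bool:
    """Flag a mismatch when the observed split is very unlikely under the plan."""
    return srm_p_value(control, treatment, expected_treatment_share) < alpha
```

For a planned 5% canary, `has_srm(9500, 500, 0.05)` stays quiet, while a split of 9300/700 is flagged and should block promotion until the routing is explained.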
Stage 5. Ramp-up → 25% → 100% (region/tenant)
Step-by-step promotion with post-monitoring timers between steps.
Stage 6. Post-monitoring (30-60 min)
Release dashboard, burn-rate, complaints/tickets, auto-closing/rollback in case of violations.
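Burn rate is the observed error rate divided by the error budget (1 minus the SLO target). A minimal sketch of a two-window check follows; the function names are illustrative and the default threshold of 14.4 is only a commonly cited paging value for a 30-day SLO, so treat it as a placeholder to tune.

```python
def burn_rate(bad: int, total: int, slo_target: float) -> float:
    """Observed error rate divided by the error budget (1 - SLO target)."""
    if total == 0:
        return 0.0
    return (bad / total) / (1.0 - slo_target)


def should_rollback(short_window, long_window, slo_target, threshold=14.4):
    """Trip only when both windows burn faster than `threshold`.

    Each window is a (bad_events, total_events) pair. Requiring both the
    short and the long window reduces flapping on brief spikes.
    """
    return all(burn_rate(bad, total, slo_target) > threshold
               for bad, total in (short_window, long_window))
```

With a 99.9% SLO, 10 failures out of 1000 requests burn the budget about ten times faster than planned, which stays under the placeholder threshold.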
6) Automated decisions (policy engine)
Pseudo-rules:
- SLO gate: `deny promote if slo_red in {auth_success, bet_settle_p99}`
- PII-export: `require dual_control if config.affects == "PII_EXPORT"`
- Freeze: `deny deploy if calendar.freeze && not emergency`
- Rollback: `auto if auth_success_drop > 10% for 10m in geo=TR`
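The pseudo-rules above can be expressed as plain functions. This is a sketch under stated assumptions: the `Context` fields and `decide` are hypothetical names, "dual control" is read as at least two approvals, and the rollback rule is read as a relative drop against a baseline, sampled once per minute.

```python
from dataclasses import dataclass, field

CRITICAL_SLOS = {"auth_success", "bet_settle_p99"}


@dataclass
class Context:
    red_slos: set = field(default_factory=set)   # SLOs currently red
    config_affects: str = ""                     # e.g. "PII_EXPORT"
    approvals: int = 0                           # distinct approvers so far
    freeze_active: bool = False
    emergency: bool = False


def decide(action: str, ctx: Context):
    """Return (allowed, reason) for a 'deploy' or 'promote' action."""
    red = ctx.red_slos & CRITICAL_SLOS
    if action == "promote" and red:
        return False, "SLO gate: red " + ", ".join(sorted(red))
    if ctx.config_affects == "PII_EXPORT" and ctx.approvals < 2:
        return False, "dual control required for PII_EXPORT"
    if action == "deploy" and ctx.freeze_active and not ctx.emergency:
        return False, "freeze window active"
    return True, "allowed"


def sustained_drop(baseline: float, samples: list, drop: float = 0.10,
                   minutes: int = 10) -> bool:
    """True if the last `minutes` per-minute samples all sit more than
    `drop` (relative) below `baseline`."""
    recent = samples[-minutes:]
    return len(recent) == minutes and all(s < baseline * (1 - drop) for s in recent)
```

In practice the samples passed to `sustained_drop` would be filtered to one segment (for example geo=TR) before the call.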
7) Release artifacts
Release Manifest (required): goal, risk class, scope (tenant/region), flags, migrations, rollout plan, rollback plan, owner, on-call contacts.
Evidence Pack: test results/scans, screenshots of staging dashboards, dry-run migrations.
Comms Kit: status templates (internal/external/partners), ETA/ETR.
Backout Plan - the exact steps of the rollback and the criteria under which it is triggered.
Example release manifest (YAML):

    release:
      id: "2025.11.01-payments-v42"
      owner: "Payments SO"
      risk_class: "normal"
      scope: { tenants: ["brandA","brandB"], regions: ["EU"] }
      rollout:
        steps:
          - { coverage: "5%", duration: "20m" }
          - { coverage: "25%", duration: "40m" }
          - { coverage: "100%" }
      migrations:
        - id: "ledger_ddl_0042"
          reversible: true
      flags:
        - id: "deposit.flow.v3"
          guardrails: ["api_error_rate<1.5%","latency_p99<2s"]
      rollback:
        autoIf:
          - metric: "auth_success_rate"
            where: "geo=TR"
            condition: "drop>10% for 10m"
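The rollout steps of such a manifest can be sanity-checked before the release is scheduled. The sketch below works on an already parsed manifest (a plain list of dicts) to stay independent of any YAML library; the helper names and the specific checks are assumptions, not a schema defined by this process.

```python
import re

# Rollout steps as they would look after parsing the manifest.
steps = [
    {"coverage": "5%", "duration": "20m"},
    {"coverage": "25%", "duration": "40m"},
    {"coverage": "100%"},
]


def parse_duration(text: str) -> int:
    """Convert '30s', '20m' or '2h' into seconds."""
    match = re.fullmatch(r"(\d+)([smh])", text)
    if not match:
        raise ValueError(f"bad duration: {text!r}")
    return int(match.group(1)) * {"s": 1, "m": 60, "h": 3600}[match.group(2)]


def validate_rollout(steps: list) -> list:
    """Return a list of problems; empty means the plan looks sane."""
    if not steps:
        return ["no rollout steps"]
    errors = []
    coverages = [int(s["coverage"].rstrip("%")) for s in steps]
    if any(b <= a for a, b in zip(coverages, coverages[1:])):
        errors.append("coverage must strictly increase")
    if coverages[-1] != 100:
        errors.append("last step must reach 100%")
    for s in steps[:-1]:
        if "duration" not in s:
            errors.append(f"step {s['coverage']} has no monitoring duration")
    return errors


def min_rollout_seconds(steps: list) -> int:
    """Total monitoring time before full coverage is reached."""
    return sum(parse_duration(s["duration"]) for s in steps if "duration" in s)
```

For the example steps, the plan validates cleanly and needs at least one hour of monitoring (20m + 40m) before 100% coverage.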
8) Canary/Blue-Green/Feature-Flag rollouts
Canary - safe default: small coverage, segmentation by GEO/tenant/BIN.
Blue-Green - for heavy changes: route switching, quick rollback.
Flags - for behavioral features: TTL, kill-switch, guardrails, SoD.
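A flag with a TTL, a kill-switch, and percentage rollout might look like the sketch below. The `Flag` class is hypothetical, and it assumes that an expired flag evaluates to off; a team could equally treat TTL expiry as a reminder to remove the flag from code.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class Flag:
    flag_id: str
    rollout_percent: int        # 0..100
    expires_at: datetime        # TTL: after this moment the flag is off
    killed: bool = False        # kill-switch, overrides everything

    def is_enabled(self, user_id: str, now: datetime) -> bool:
        if self.killed or now >= self.expires_at:
            return False
        # Stable bucketing: the same user always lands in the same bucket,
        # so raising rollout_percent only ever adds users.
        digest = hashlib.sha256(f"{self.flag_id}:{user_id}".encode()).digest()
        bucket = int.from_bytes(digest[:4], "big") % 100
        return bucket < self.rollout_percent
```

Hashing the flag id together with the user id keeps buckets independent across flags, so the same users are not always the first to receive every change.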
9) Management of configs and secrets
Configs as data, with schemas and validators; GitOps promotion with a drift detector.
Secrets - in KMS/Secret Manager, JIT access, auditing and masking.
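Drift detection compares the state declared in Git with what is actually running. The sketch below does this for flat key/value configs and masks secret-looking values in the report; the key-name hints and the `(missing)` marker are assumptions for illustration.

```python
MISSING = "(missing)"
SECRET_HINTS = ("secret", "token", "password", "key")


def mask(key: str, value):
    """Hide values whose key name suggests a secret."""
    return "***" if any(hint in key.lower() for hint in SECRET_HINTS) else value


def detect_drift(desired: dict, live: dict) -> dict:
    """Map each drifting key to a (desired, live) pair, with secrets masked."""
    drift = {}
    for key in sorted(desired.keys() | live.keys()):
        want = desired.get(key, MISSING)
        have = live.get(key, MISSING)
        if want != have:
            drift[key] = (mask(key, want), mask(key, have))
    return drift
```

A non-empty result would normally either trigger reconciliation back to the Git state or open an alert, depending on the environment.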
10) Communications and status pages
Internal: war room/chat, on-call notification, update templates.
External: publications only through CL, pre-prepared drafts.
Partners (PSP/KYC/studios): targeted notifications when integrations are affected.
Status: The release is not an incident, but has a monitoring window with metrics.
11) Emergency releases
Triggers: P1 degradation, vulnerability, PII/RG risks.
Path: IC + RM decision → minimum set of gates (linters/build) → canary 1-2% → monitoring → promotion.
Mandatory: after-the-fact CAB review, post-mortem ≤ D + 5, documented trade-offs.
12) Audit, SoD and Compliance
SoD/4-eyes: changes in PSP routing, bonus limits, data exports.
WORM log: who/what/when/why; policy versions; diffs of releases/flags/configs.
Geo/Privacy: data and logs kept in the required jurisdiction; no PII in artifacts.
13) Observability and post-release control
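True WORM guarantees come from the storage layer, but the who/what/when/why trail can additionally be made tamper-evident with a hash chain. The sketch below is an in-memory illustration only; the `AuditLog` class and its field names are hypothetical.

```python
import hashlib
import json


class AuditLog:
    """Append-only log where each entry commits to the previous one."""

    GENESIS = "0" * 64

    def __init__(self):
        self._entries = []

    @staticmethod
    def _digest(body: dict) -> str:
        return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()

    def append(self, who: str, what: str, when: str, why: str) -> str:
        prev = self._entries[-1]["hash"] if self._entries else self.GENESIS
        body = {"who": who, "what": what, "when": when, "why": why, "prev": prev}
        entry = dict(body, hash=self._digest(body))
        self._entries.append(entry)
        return entry["hash"]

    def verify(self) -> bool:
        """Recompute every hash; any edited or reordered entry breaks the chain."""
        prev = self.GENESIS
        for entry in self._entries:
            body = {k: v for k, v in entry.items() if k != "hash"}
            if entry["prev"] != prev or self._digest(body) != entry["hash"]:
                return False
            prev = entry["hash"]
        return True
```

Publishing the latest hash to a separate system makes silent rewrites of the whole chain detectable as well.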
Release dashboard: SLI (auth-success, bet→settle p99), error-rate, complaints, conversion, queue lags.
Alerts: burn-rate, SRM, 5xx growth, PSP degradation by banks/GEO.
Reports: CFR, MTTR for release incidents, average post-monitoring time, auto-rollback rate.
14) Process KPI/KRI
Lead Time for Change (PR→prod), Change Failure Rate, MTTR for release incidents.
SLO-gates pass rate, Auto-rollback rate, Freeze compliance.
Release drill coverage (staging rehearsals), SoD violations (goal - 0).
Comms SLA (availability of drafts, adherence to timings).
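The first three KPIs can be derived from release records. The sketch below assumes each record carries PR-open, production, and restore timestamps; the `Release` fields are illustrative, and using the median for lead time and the mean for MTTR is one common choice rather than a rule from this process.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median
from typing import Optional


@dataclass
class Release:
    pr_opened: datetime
    in_prod: datetime
    failed: bool = False                  # caused a release incident
    restored: Optional[datetime] = None   # when service was restored


def change_failure_rate(releases) -> float:
    """Share of releases that caused an incident."""
    return sum(r.failed for r in releases) / len(releases)


def mttr(releases) -> timedelta:
    """Mean time from production deploy to restoration, over failed releases."""
    spans = [r.restored - r.in_prod for r in releases if r.failed and r.restored]
    return sum(spans, timedelta()) / len(spans)


def lead_time(releases) -> timedelta:
    """Median time from PR opened to running in production."""
    seconds = [(r.in_prod - r.pr_opened).total_seconds() for r in releases]
    return timedelta(seconds=median(seconds))
```

Note that `change_failure_rate` and `mttr` expect at least one release and at least one restored failure respectively; a reporting job would guard against empty periods.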
15) Implementation Roadmap (6-10 weeks)
Weeks 1-2: define change classes, gates, and artifacts; enable linters, SBOM, secret scan; set up the release calendar and freeze windows.
Weeks 3-4: GitOps for configs, canary/blue-green, SLO gates, Comms templates and war room.
Weeks 5-6: policy engine (SoD/4-eyes, risk rules), auto-rollback by metrics; release dashboard.
Weeks 7-8: rehearsals (staging drills), integration with feature flags/incident bot, KPI/KRI reports.
Weeks 9-10: WORM audit, release DR drills, CFR optimization, role training (RM/SO/CL/IC).
16) Antipatterns
Releases without reversibility and canary → mass incidents.
Ignoring SLO gates "for the sake of the deadline."
Configs/flags without schemas and TTL → "frozen" states.
Manual clicks in production without Git/audit.
Public updates without CL role and templates.
Secrets in the repository; access without JIT and logging.
CAB as a brake without data: decisions must be backed by release metrics.
Summary
The release approval process is an engineering and management framework that connects quality, safety and speed: policy as code, SLO gates, progressive delivery, transparent communications and provable auditing. This approach reduces CFR and MTTR, protects revenue and compliance, and allows teams to release value frequently and safely.