Cost architecture

1) Principles and roles

Cost as a feature. Cost is part of the product, its UX, and its architectural decisions.
Shared responsibility. Engineers, platform/DevEx, finance, and product form a single feedback loop.
A single source of truth. One tag/label catalog, one cost dictionary, and one agreed set of data sources.
Watch → Optimize → Manage loop. Built-in dashboards, automatic gates and policies.

Roles: Cost Architect, FinOps Analyst, Product Owner, Platform Team.

2) Cost Data Model

Unit economics:

For APIs: '$/1000 requests', '$/CPU-millisecond', '$/GB egress'.
For data: '$/GB-month of storage', '$/database request', '$/million messages'.
Per user: 'CAC', 'ARPU/ARPPU', 'Gross Margin', 'LTV:CAC'.
Per flow: '$/transaction', '$/deposit', '$/test run'.

Attribution schema (simplified):


cost_record {
  ts, provider, account, region, service, usage_qty, usage_unit,
  list_price, net_price, discounts,
  tags: { env, team, product, feature, tenant, cost_center, pii, tier },
  resource_id, allocation_keys: { req_id?, tenant_id?, dataset? }
}

Gold tags (required): 'env', 'team', 'product', 'feature', 'cost_center', 'owner', 'pii', 'tier' (hot/warm/cold), 'region'.

3) Attribution: showback/chargeback

Showback: transparent per-team/per-feature reports, with no internal transfers charged.
Chargeback: costs are distributed by rules: direct costs go to the owner; shared resources are split by allocation keys: RPS, CPU-seconds, GB-hours, event volume.

Shared cluster distribution pseudocode:


cluster_cost = sum(provider_cost where resource matches "k8s-node:*")
weights = { service: cpu_seconds(service) / total_cpu_seconds }
for service in services:
    charge[service] = direct_cost(service) + cluster_cost * weights[service]
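The pseudocode above could be made concrete roughly as follows. This is a minimal sketch that assumes CPU-seconds is the only allocation key; the function name `allocate_cluster_cost` and all figures are illustrative, not taken from a real billing export.

```python
def allocate_cluster_cost(cluster_cost, cpu_seconds, direct_cost):
    """Split a shared cluster bill across services in proportion to CPU-seconds."""
    total = sum(cpu_seconds.values())
    if total <= 0:
        raise ValueError("no CPU usage recorded; pick another allocation key")
    charge = {}
    for service, used in cpu_seconds.items():
        weight = used / total  # this service's share of the shared bill
        charge[service] = direct_cost.get(service, 0.0) + cluster_cost * weight
    return charge


# Hypothetical month: $1000 of shared node cost, three services.
charges = allocate_cluster_cost(
    cluster_cost=1000.0,
    cpu_seconds={"api": 600.0, "worker": 300.0, "cron": 100.0},
    direct_cost={"api": 50.0, "worker": 20.0},
)
print(charges)
```

The weights sum to 1, so the charges always add up to the shared cluster cost plus all direct costs; that invariant is worth checking in any real allocation job.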

4) Policy as Code

Budget rules: limits per 'env/team/feature'; automatic alert or deploy block when an overrun is forecast.
Label requirements: resources without mandatory tags are denied by the admission controller.
Profile limits: no large machines in 'dev', TTL on ephemeral resources, minimum reservations.

YAML sketch (admission policy):

policy: require-tags-and-limits
deny_if_missing_tags: [team, product, env, cost_center, owner]
constraints:
  env==dev:
    max_instance_type: "c6i.large"
    ttl_hours: 72
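To show how a policy like this might be evaluated, here is a small sketch of an admission check. The resource dictionary shape and the function name `admission_violations` are assumptions; the "max instance type" rule is approximated with an explicit allow-list, because comparing instance sizes would need a size ordering that the policy sketch does not define.

```python
REQUIRED_TAGS = ["team", "product", "env", "cost_center", "owner"]
DEV_MAX_TTL_HOURS = 72
# Assumption: an allow-list stands in for "max_instance_type".
DEV_ALLOWED_TYPES = {"c6i.large"}


def admission_violations(resource: dict) -> list:
    """Return human-readable reasons to deny the resource (empty list = admit)."""
    tags = resource.get("tags", {})
    violations = [f"missing tag: {t}" for t in REQUIRED_TAGS if not tags.get(t)]
    if tags.get("env") == "dev":
        if resource.get("instance_type") not in DEV_ALLOWED_TYPES:
            violations.append("instance type not allowed in dev")
        ttl = resource.get("ttl_hours")
        if ttl is None or ttl > DEV_MAX_TTL_HOURS:
            violations.append("dev resources need ttl_hours <= 72")
    return violations


# Hypothetical resources submitted for admission.
good = {
    "tags": {"team": "payments", "product": "wallet", "env": "dev",
             "cost_center": "cc-12", "owner": "alice"},
    "instance_type": "c6i.large",
    "ttl_hours": 48,
}
bad = {"tags": {"env": "dev", "team": "payments"}, "instance_type": "c6i.4xlarge"}
print(admission_violations(good))
print(admission_violations(bad))
```

In a real cluster the same rules would more likely live in OPA or Kyverno; the Python form is only meant to make the decision logic explicit.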

5) Compute: Cost Reduction Patterns

Rightsizing: automatically match vCPU/RAM to p95/p99 usage, seasonality, and headroom.
Autoscaling: target-based (CPU/RPS/lag) or step functions; hysteresis protects against thrashing.
Pricing model selection: on-demand vs. spot/preemptible, Reserved Instances/Savings Plans; a mix for critical and background workloads.
Batch pipelines: "cheap" off-peak load windows, batch compression, priority queues.
Caching and request coalescing: fewer reads from expensive sources.
Edge/network optimization: HTTP/2/3, keep-alive, compression, CDN.
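As an illustration of the rightsizing item, the sketch below derives a vCPU recommendation from a high percentile of observed usage plus headroom. The nearest-rank percentile, the 30% default headroom, and the function names are assumptions chosen for brevity; seasonality is not modeled.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of samples."""
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def recommend_vcpu(cpu_samples, headroom=0.3, pct=95):
    """Round (percentile usage * (1 + headroom)) up to whole vCPUs."""
    return math.ceil(percentile(cpu_samples, pct) * (1 + headroom))


# Hypothetical usage in vCPUs: mostly idle with occasional bursts.
mostly_idle = [1.0] * 95 + [4.0] * 5
bursty = [1.0] * 90 + [4.0] * 10
print(recommend_vcpu(mostly_idle), recommend_vcpu(bursty))
```

Choosing p95 over p99 trades cost for tail risk: rare bursts above the percentile are absorbed by headroom or by autoscaling rather than by permanently larger instances.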

Example of a "step-up" autoscaling rule (pseudocode):


if rps > target * 1.2 for 3m:
    replicas += ceil(rps / target); cool_down 5m
if rps < target * 0.6 for 10m:
    replicas = max(min_replicas, replicas - 1)
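The rule above can be sketched as a small state machine. The assumptions here are loud ones: `rps` is read as the average load per replica, samples arrive with a timestamp in seconds, and the cooldown blocks scaling in both directions. The class name `Autoscaler` and its fields are illustrative.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass
class Autoscaler:
    target_rps: float                    # desired load per replica
    min_replicas: int = 1
    replicas: int = 1
    high_since: Optional[float] = None   # start of the current "too hot" streak
    low_since: Optional[float] = None    # start of the current "too cold" streak
    cooldown_until: float = 0.0

    def observe(self, now: float, rps_per_replica: float) -> int:
        """Feed one load sample; return the replica count after the decision."""
        high = rps_per_replica > self.target_rps * 1.2
        low = rps_per_replica < self.target_rps * 0.6
        # Track how long each condition has held without interruption.
        self.high_since = (now if self.high_since is None else self.high_since) if high else None
        self.low_since = (now if self.low_since is None else self.low_since) if low else None
        if now < self.cooldown_until:
            return self.replicas
        if high and now - self.high_since >= 180:      # sustained for 3 minutes
            self.replicas += math.ceil(rps_per_replica / self.target_rps)
            self.cooldown_until = now + 300            # 5-minute cooldown
            self.high_since = None
        elif low and now - self.low_since >= 600:      # sustained for 10 minutes
            self.replicas = max(self.min_replicas, self.replicas - 1)
            self.low_since = now                       # one step down per window
        return self.replicas


scaler = Autoscaler(target_rps=100.0, replicas=2)
for t in range(0, 241, 60):  # one sample per minute at 1.5x the target load
    print(t, scaler.observe(t, rps_per_replica=150.0))
```

The gap between the 1.2x and 0.6x thresholds is the hysteresis band: load that hovers around the target triggers neither branch, which is what prevents thrashing.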

6) Storage and data: hot/warm/cold

Tiering: hot data (instant access), warm (infrequent requests), cold/archive.
Formats: columnar (Parquet/ORC) for analytics; compression and partitioning by date/key.
TTL/ILM: set a lifecycle policy: 'hot 7d → warm 90d → cold 365d → delete'.
Cache layer: Redis/Memcached with request coalescing and miss-storm protection.
Quotas and query budgets: predictable limits on expensive joins/scans.
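Request coalescing, mentioned for the cache layer, can be sketched with `asyncio`: concurrent misses for the same key share one in-flight backend call instead of each hitting the expensive source. The `Coalescer` class and the `slow_load` stand-in are hypothetical; a production version would also need error handling and a real cache in front.

```python
import asyncio


class Coalescer:
    """Collapse concurrent loads of the same key into a single backend call."""

    def __init__(self, loader):
        self._loader = loader   # async function: key -> value
        self._inflight = {}     # key -> asyncio.Task currently loading it

    async def get(self, key):
        task = self._inflight.get(key)
        if task is None:
            task = asyncio.ensure_future(self._loader(key))
            self._inflight[key] = task
            # Forget the task once it finishes so later calls trigger a fresh load.
            task.add_done_callback(lambda _t: self._inflight.pop(key, None))
        return await task


async def demo():
    calls = 0

    async def slow_load(key):
        nonlocal calls
        calls += 1
        await asyncio.sleep(0.01)  # stand-in for an expensive query
        return f"value-for-{key}"

    cache = Coalescer(slow_load)
    results = await asyncio.gather(*(cache.get("user:42") for _ in range(10)))
    return calls, results


calls, results = asyncio.run(demo())
print(calls, len(results))
```

Ten concurrent requests for the same key result in a single backend call, which is the property that protects an expensive source during a miss storm.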

Example of an ILM profile (sketch):

dataset: events_main
lifecycle:
  - { phase: hot, duration: 7d, storage: nvme }
  - { phase: warm, duration: 90d, storage: ssd, compress: zstd }
  - { phase: cold, duration: 365d, storage: object, glacier: true }
  - { phase: purge, duration: 0d }
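A profile like this ultimately answers one question: which phase should data of a given age be in? The sketch below assumes each duration is the time spent in that phase (so warm ends at day 97, cold at day 462); if the durations were meant as absolute age thresholds, the boundaries would differ. `LIFECYCLE` and `phase_for_age` are illustrative names.

```python
# (phase, days spent in that phase), mirroring the profile above.
LIFECYCLE = [("hot", 7), ("warm", 90), ("cold", 365)]


def phase_for_age(age_days: float) -> str:
    """Return the lifecycle phase for data that is `age_days` old."""
    boundary = 0
    for phase, duration in LIFECYCLE:
        boundary += duration  # cumulative age at which this phase ends
        if age_days < boundary:
            return phase
    return "purge"


for age in (3, 30, 200, 500):
    print(age, phase_for_age(age))
```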

7) Network and egress

Minimize inter-region traffic: local copies and edge aggregation.
CDN and caches: origin shield, reasonable TTLs, revalidation/invalidation.
Protocols: binary (gRPC) for chatty services; compression only where it pays off.

Event deduplication and filtering at the producer: "do not ship garbage."

8) Observability and the cost of SRE

Telemetry cost cards: '$/log-GB', '$/metric-series', '$/trace'.
Sampling and aggregation: tail-based sampling, metric downsampling, retention by importance (SLO metrics get higher priority).
Log deduplication and log hygiene: no personal data in logs, fewer unused fields, limits on event size.

9) CI/CD and test environments

Ephemeral environments with auto-TTL, one environment per PR.

Perf-smoke in PR: short runs for an early estimate of the cost per request.

Caches/artifacts: reuse of containers and build outputs.
Gates: the build/deploy is rejected if the "price of latency"/RPS has regressed by more than X% relative to the baseline.
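A cost regression gate of this kind reduces to a percentage comparison against a stored baseline. In the sketch below the metric is taken to be cost per 1000 requests and the 10% threshold is a placeholder for X; both, along with the function name `cost_gate`, are assumptions.

```python
def cost_gate(baseline: float, candidate: float, max_regression_pct: float) -> bool:
    """Return True if the candidate build may proceed.

    `baseline` and `candidate` are the same unit metric (for example
    $ per 1000 requests) measured on the baseline and on the PR build.
    """
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    regression_pct = (candidate - baseline) / baseline * 100
    return regression_pct <= max_regression_pct


# Hypothetical perf-smoke results, in $ per 1000 requests.
print(cost_gate(baseline=0.020, candidate=0.021, max_regression_pct=10))  # 5% worse
print(cost_gate(baseline=0.020, candidate=0.025, max_regression_pct=10))  # 25% worse
```

Short perf-smoke runs are noisy, so in practice the threshold usually needs enough slack (or repeated runs) to avoid blocking merges on measurement jitter.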

10) Forecasting, budgets and anomalies

Forecasts: seasonality/trend, events (campaigns, releases), feature → cost correlation.
Budgets by level: team/product/feature/tenant; escalation at 80/90/100%.
Anomalies: sudden spikes by service/region/account; automatic "bisect" and feature-flag rollback.

Budget alert (pseudocode):


if forecast(month_end_cost) > budget * 0.9 and variance ↑:
    alert(team_owner)
    suggest: rightsizing + RI/SP coverage + ILM tighten
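One way to read the alert rule is sketched below. The forecast is a naive run-rate extrapolation and "variance ↑" is interpreted as the variance of the last 7 days exceeding that of the 7 days before; both are simplifying assumptions (no seasonality or event calendar), and the function names are illustrative.

```python
from statistics import pvariance


def forecast_month_end(daily_costs, days_in_month):
    """Naive run-rate forecast: average daily spend so far times days in the month."""
    return sum(daily_costs) / len(daily_costs) * days_in_month


def variance_rising(daily_costs, window=7):
    """True if spend in the latest window is more volatile than in the previous one."""
    if len(daily_costs) < 2 * window:
        return False
    recent = daily_costs[-window:]
    previous = daily_costs[-2 * window:-window]
    return pvariance(recent) > pvariance(previous)


def should_alert(daily_costs, budget, days_in_month=30):
    forecast = forecast_month_end(daily_costs, days_in_month)
    return forecast > budget * 0.9 and variance_rising(daily_costs)


# Hypothetical spend: a flat first week, then a noisy and more expensive second week.
daily = [100] * 7 + [100, 120, 90, 150, 80, 160, 140]
print(forecast_month_end(daily, 30), should_alert(daily, budget=3500))
```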

11) Procurement and Commerce

RI/Savings Plans/Committed Use: cover the stable baseline; monitor coverage and the share of unused commitments.
Spot/Preemptible: background tasks and fault-tolerant workloads; checkpointing and fast restart.

Licenses and SaaS: ROI matrix, benchmarking of alternatives, periodic "vendor fitness review."

12) Multi-tenancy and billing

Partitioning by tenant: logical/physical separation, limits and quotas.

Tenant-aware limiters/rate caps: prevent the "noisy neighbor" effect.

Usage-based models: billing by events, RPS, data volume; transparent metrics for customers.

13) Security and compliance as a cost factor

Cryptography and storage: FPE and key management carry KMS/HSM costs; optimize the frequency of operations.
Regulatory copies: separate "legal" retention from operational retention; an archive is cheaper than "forever-warm" storage.
Data minimization: less data means lower bills and fewer risks.

14) Engineering anti-patterns (expensive!)

Chatty APIs without batching or caching.
Unbounded queues and unbounded parallelism: both latency and the bill grow.
Zero TTLs and hot keys without coalescing.
"All-seeing" dashboards with millions of metric series.
Resources without tags → "gray" spending without an owner.
No ILM/TTL → storage grows forever.

15) Tools and artifacts (vendor-neutral)

Tag catalog (schema + linter in CI).
Cost extractor (aggregation of usage/billing data, normalization into a single format).
Unit-economics dashboards (API cost, dataset cost, tenant cost).
Auto-remediation (rightsizer, RI/SP recommendations, ILM enforcer).
Cost policies (admission/OPA/Kyverno) and budget red lines.

16) Mini recipes

"Request price" formula (HTTP)


request_cost = (cpu_ms * $/cpu_ms) +
               (mem_mb_s * $/mb_s) +
               (egress_mb * $/mb) +
               (db_calls * $/call) +
               (cache_ops * $/op * miss_penalty)
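The formula translates directly into code. In this sketch every rate is a made-up number, `miss_penalty` is treated as a plain multiplier on the cache term, and the `Rates` and `request_cost` names are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rates:
    per_cpu_ms: float      # $ per CPU-millisecond
    per_mb_s: float        # $ per MB-second of memory
    per_egress_mb: float   # $ per MB of egress
    per_db_call: float     # $ per database call
    per_cache_op: float    # $ per cache operation
    miss_penalty: float    # multiplier applied to the cache term


def request_cost(cpu_ms, mem_mb_s, egress_mb, db_calls, cache_ops, rates: Rates) -> float:
    """Estimated cost in dollars of serving one HTTP request."""
    return (
        cpu_ms * rates.per_cpu_ms
        + mem_mb_s * rates.per_mb_s
        + egress_mb * rates.per_egress_mb
        + db_calls * rates.per_db_call
        + cache_ops * rates.per_cache_op * rates.miss_penalty
    )


# Hypothetical rates and a hypothetical request profile.
rates = Rates(per_cpu_ms=1e-6, per_mb_s=1e-8, per_egress_mb=1e-4,
              per_db_call=1e-5, per_cache_op=1e-7, miss_penalty=1.5)
cost = request_cost(cpu_ms=20, mem_mb_s=100, egress_mb=0.5,
                    db_calls=3, cache_ops=10, rates=rates)
print(f"${cost * 1000:.4f} per 1000 requests")
```

Printing the per-term contributions separately is often more useful than the total, since it shows which resource dominates an endpoint's cost.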

Quick Service Audit

Top 3 most expensive endpoints by $/1000 req.
Cache hit/miss ratio and storm-prone keys.
List of untagged resources.
ILM and dataset retention.
RI/SP coverage (%).

Economical retry policy


retry = min(3, floor(budget_ms / (base_timeout_ms * 1.5^attempt)))
jitter = uniform(0.5..1.5)
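A possible reading of the retry recipe: the number of retries is capped at 3 and further limited by how many backed-off timeouts still fit into the latency budget, while jitter scales each delay by a random factor. How the jitter is applied is an assumption here, as are the function names.

```python
import math
import random

MAX_RETRIES = 3
BACKOFF = 1.5


def retries_allowed(budget_ms: float, base_timeout_ms: float, attempt: int) -> int:
    """How many attempts at the current backed-off timeout fit in the budget (max 3)."""
    timeout_ms = base_timeout_ms * BACKOFF ** attempt
    return min(MAX_RETRIES, math.floor(budget_ms / timeout_ms))


def next_delay_ms(base_timeout_ms: float, attempt: int, rng=random) -> float:
    """Backed-off delay scaled by a jitter factor drawn from uniform(0.5, 1.5)."""
    return base_timeout_ms * BACKOFF ** attempt * rng.uniform(0.5, 1.5)


# Hypothetical call with a 1000 ms latency budget and a 200 ms base timeout.
for attempt in range(6):
    print(attempt, retries_allowed(1000, 200, attempt))
```

Tying retries to a latency budget keeps a slow dependency from multiplying both latency and spend: once the backed-off timeout no longer fits, the caller stops retrying.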

17) Cost Architect Checklist

1. Defined unit metrics ('$/req', '$/GB-month', '$/txn') and owners?
2. Tag policy enforced? Are untagged resources blocked?
3. Showback/chargeback and product/feature reports implemented?
4. Autoscale and rightsizing configured, headroom defined?
5. Data tiered (hot/warm/cold), ILM/TTL applied?
6. Egress and interregional flows minimized? CDN/caches enabled?
7. Observability optimized (sampling, retention, downsampling)?
8. Are CI/CD regression gates and policy-checks active?
9. Are forecasts/budgets/anomaly analysis automated?
10. RI/SP/Spot mix covers base loads?
11. Are there quotas, limits and transparent usage metrics for multi-tenant?
12. FinOps runbook and monthly cost-review plan documented?

Conclusion

Cost architecture is not "saving at all costs" but value management: knowing how much each millisecond costs and how much revenue it generates. By embedding cost into architecture, processes, and tools (tags, policies, gates, dashboards, ILM, autoscaling), you get a platform where decisions rest on metrics and economics rather than intuition. This speeds up product delivery, reduces risk, and makes the business more predictably profitable.
