Diversification of providers and rails
TL;DR
One provider = one SPOF. The working model is a portfolio of rails and providers with smart routing: a primary and a backup provider for each critical method, automatic failover within ≤ 10 minutes, SLA control, and treasury limits. Goal: AR↑, TtW/TtR↓, Cost/GGR↓, concentration risk↓, while keeping UX predictable and license requirements met.
1) Why diversify
Conversion (AR/Capture): different acquirers/PSPs show different uplift by BIN/country/ECI.
Reliability: failover when API/webhooks/settlement degrade.
Coverage of methods: local APMs/wallets/vouchers/bank rails.
Cost: competition on commissions/FX/fees, Cost/GGR optimization.
Compliance/sanctions: alternatives when regional blocks/restrictions apply.
Treasury: prefunded balances spread across rails, liquidity flexibility.
2) Rail map (portfolio by layer)
Cards (Visa/Mastercard/Local) - a high share of turnover, sensitive to BIN/3DS2/issuers.
A2A/Open Banking/PIX/UPI/Sofort - low cost, quick sweep, different UX.
RTP/Instant/SEPA/ACH/SWIFT - withdrawals and large amounts, T+N schedules.
Wallets (Skrill/Neteller/...)/Super-apps - fast UX, limits/regionality.
Vouchers - offline/cash-to-digital, increased risk of abuse.
Crypto on/off-ramp - global reach, but hedging and AML policies are required.
Rule: for each critical rail, at least 2 providers (Primary/Secondary), and on Cards, 2+ acquirers per region.
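The rule above can be checked mechanically against a provider registry. The following is a minimal sketch, assuming a hypothetical in-memory registry of (rail, region, provider, role) tuples; the registry contents and the `coverage_gaps` name are illustrative, not part of any real system.

```python
from collections import defaultdict

# Hypothetical provider registry: (rail, region, provider, role).
PROVIDERS = [
    ("card", "EU", "Acquirer_A", "primary"),
    ("card", "EU", "Acquirer_B", "secondary"),
    ("a2a", "EU", "A2A_B", "primary"),
    ("wallet", "EU", "Wallet_X", "primary"),
    ("wallet", "EU", "Wallet_Y", "secondary"),
]

def coverage_gaps(providers, critical_rails, min_providers=2):
    """Return (rail, region) pairs with fewer than min_providers distinct providers.

    Only pairs that appear in the registry are checked; a rail with no
    providers at all has to be detected separately.
    """
    by_key = defaultdict(set)
    for rail, region, provider, _role in providers:
        by_key[(rail, region)].add(provider)
    return sorted(
        key for key, names in by_key.items()
        if key[0] in critical_rails and len(names) < min_providers
    )

gaps = coverage_gaps(PROVIDERS, critical_rails={"card", "a2a", "wallet"})
print(gaps)  # [('a2a', 'EU')]
```

A check like this could run in CI or as a daily job so that a contract expiry or a disabled provider does not silently leave a rail with a single counterparty.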
3) Architecture: what a multi-provider setup looks like
Recon Layer: unified registers, settlement↔bank mapping.
Payment Orchestrator/Router: Decides where to send the attempt (based on the rules matrix and online metrics).
Feature flags: instant toggles for failover/degradation.
Idempotency & Replay-bus: a single key per attempt, safe retries.
Webhook Hub: dedup/retries/polling backup.
Treasury Layer: Rail prefund limits, stress reserves, FX.
SLA Monitor: comparing provider metrics to our telemetry.
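Two of the components above, the idempotency key and the Webhook Hub dedup, can be illustrated in a few lines. This is a sketch under stated assumptions: the fields used to build the key (`user_id`, `operation`, `attempt_ref`) and the `WebhookHub` class are hypothetical, and the in-memory set stands in for a persistent store with a TTL.

```python
import hashlib

def idempotency_key(user_id: str, operation: str, attempt_ref: str) -> str:
    """Derive one stable key per logical attempt (field choice is an assumption)."""
    raw = f"{user_id}:{operation}:{attempt_ref}".encode()
    return hashlib.sha256(raw).hexdigest()

class WebhookHub:
    """In-memory dedup sketch; a real hub would use a persistent store."""

    def __init__(self):
        self._seen = set()
        self.processed = []

    def handle(self, provider: str, event_id: str, payload: dict) -> bool:
        """Process an event once; return False for a repeated delivery."""
        key = (provider, event_id)
        if key in self._seen:
            return False  # duplicate delivery, ignore
        self._seen.add(key)
        self.processed.append(payload)
        return True

hub = WebhookHub()
first = hub.handle("PSP_A", "evt-1", {"status": "captured"})
repeat = hub.handle("PSP_A", "evt-1", {"status": "captured"})
print(first, repeat, len(hub.processed))  # True False 1
```

Because the key depends only on the logical attempt, a retry after a timeout sends the same key, and the same dedup path can absorb events recovered by polling.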
4) Smart-routing: strategy and signals
4.1 Signals for provider selection
AR/Soft-decline by BIN×issuer×country×device.
Latency p95/p99, share of timeouts.
3DS friction (challenge share, abandon).
Cost (fee %/fixed, FX, spread).
Fraud/disputes (chargeback/friendly-fraud share).
Time windows (night/holidays), incidents/maintenance.
4.2 Routing policies (example)
Performance-first: maximize AR subject to a Cost/GGR cap.
Cost-aware: at equal AR, prefer the cheaper provider.
Risk-aware: high-ticket/new users → stricter provider/flow.
Geo/BIN-affinity: whitelists of "strong" acquirers by issuer/country.
Fair-share: do not allow concentration on one counterparty (> X% of the daily turnover).
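A few of these policies can be combined into one selection function. The sketch below is illustrative only: `ProviderStats`, the thresholds (`max_cost_pct`, `max_share`, `ar_tie`) and the sample numbers are assumptions, and a production router would read them from the rules matrix and online metrics.

```python
from dataclasses import dataclass

@dataclass
class ProviderStats:
    name: str
    ar: float           # approval rate for this BIN/country segment, 0..1
    cost_pct: float     # all-in cost as a fraction of the amount
    daily_share: float  # current share of today's turnover, 0..1

def choose_provider(candidates, max_cost_pct=0.03, max_share=0.6, ar_tie=0.005):
    """Performance-first under a cost cap, cost-aware on near-ties, fair-share guard."""
    eligible = [
        p for p in candidates
        if p.cost_pct <= max_cost_pct and p.daily_share < max_share
    ]
    if not eligible:
        # Nothing passes the guards: fall back to the best AR overall.
        return max(candidates, key=lambda p: p.ar)
    best_ar = max(p.ar for p in eligible)
    # Providers within ar_tie of the best AR are treated as equal on AR.
    near_best = [p for p in eligible if best_ar - p.ar <= ar_tie]
    return min(near_best, key=lambda p: p.cost_pct)

candidates = [
    ProviderStats("PSP_A", ar=0.87, cost_pct=0.025, daily_share=0.65),
    ProviderStats("PSP_B", ar=0.85, cost_pct=0.022, daily_share=0.25),
    ProviderStats("PSP_C", ar=0.848, cost_pct=0.018, daily_share=0.10),
]
chosen = choose_provider(candidates)
print(chosen.name)  # PSP_C
```

In this example PSP_A is skipped by the fair-share guard, PSP_B and PSP_C are within the AR tie band, and the cheaper of the two wins.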
5) Failover: rules and SLOs
Triggers: `AR_gross↓ > 3 p.p. vs p7`, `Auth p95 > 1.5s`, `Webhook p95 > 5s`, `Payout Success↓`, `Settlement on-time < 99%`.
Actions: switch to Secondary, limit retries, pause auto-refunds/risky auto-payouts.
SLO: automatic failover ≤ 10 min; return the traffic share in stages (25%→50%→100%) once metrics have stayed stable for N intervals.
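The triggers and the staged return can be expressed as two small functions. This is a sketch, not a definitive implementation: the metric names in the dictionary, the reading of the AR trigger as a drop against a prior-period baseline, and the default of 3 stable intervals are assumptions. The payout-success trigger is omitted because the text gives no threshold for it.

```python
def should_failover(m, baseline_ar):
    """Return the list of fired triggers; metric names are assumptions."""
    reasons = []
    if baseline_ar - m["ar_gross"] > 0.03:      # AR drop of more than 3 p.p.
        reasons.append("ar_drop")
    if m["auth_p95_s"] > 1.5:
        reasons.append("auth_latency")
    if m["webhook_p95_s"] > 5:
        reasons.append("webhook_latency")
    if m["settlement_on_time"] < 0.99:
        reasons.append("settlement_late")
    return reasons

RAMP = [0.25, 0.50, 1.00]  # staged return of traffic to Primary

def next_primary_share(current, stable_intervals, required=3):
    """Step traffic back to Primary only after `required` stable intervals."""
    if stable_intervals < required:
        return current
    for step in RAMP:
        if step > current:
            return step
    return 1.0

metrics = {"ar_gross": 0.80, "auth_p95_s": 1.2,
           "webhook_p95_s": 6.0, "settlement_on_time": 0.995}
fired = should_failover(metrics, baseline_ar=0.85)
print(fired)  # ['ar_drop', 'webhook_latency']
```

The caller would reset `stable_intervals` to zero after each step so that every stage has to prove itself before the next increase.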
6) Treasury and liquidity in diversification
Prefund payout rails at both providers (rolling p95 + 20%).
StressRes (stress reserve) in case of settlement delays at Primary.
FX/Cost: account for hidden fees/spreads when routing.
Counterparty limits: daily/weekly on balance/turnover; intraday sweeps.
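The "rolling p95 + 20%" prefund rule is simple arithmetic over recent daily payout volumes. A minimal sketch, assuming a 30-day window and a nearest-rank percentile (both are assumptions; the text does not fix the window or the percentile method):

```python
import math

def percentile(values, q):
    """Nearest-rank percentile, q in (0, 1]."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q * len(ordered)))
    return ordered[rank - 1]

def prefund_target(daily_payouts, window=30, q=0.95, buffer=0.20):
    """Rolling p95 of daily payout volume plus a safety buffer."""
    recent = daily_payouts[-window:]
    return percentile(recent, q) * (1 + buffer)

# Hypothetical daily payout totals for one rail, oldest first.
daily = list(range(1000, 21000, 1000))  # 1000, 2000, ..., 20000
target = prefund_target(daily)
print(round(target))  # 22800
```

Here the 95th percentile of the 20 sample days is 19000, and the 20% buffer brings the target to 22800.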
7) SLAs and contracts
API Uptime/Latency, Webhook SLA, Settlement Timeliness, Report Delivery.
Service Credits for violations; termination right for systematic violations.
Change notice ≥ 30 days for changes to schemas/registers; sandbox pilots and a rollback plan.
KYC/AML/Sanctions capabilities, DPA/PCI/SOC, breach notification ≤ 24h.
8) Provider scorecard (score 0-5)
Decision: traffic share and routing priority follow the weighted total score (for example, 40% conversion, 30% reliability, 20% finance, 10% other).
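The weighted total is a plain dot product of category scores and weights. The sketch below uses the example weights from the text; the category keys and the two sample providers' scores are hypothetical.

```python
# Example weights from the text; they must sum to 1.
WEIGHTS = {"conversion": 0.40, "reliability": 0.30, "finance": 0.20, "other": 0.10}

def total_score(scores):
    """Weighted sum of 0-5 category scores."""
    return sum(WEIGHTS[k] * scores[k] for k in WEIGHTS)

# Hypothetical scorecards.
scorecards = {
    "PSP_A": {"conversion": 4.5, "reliability": 3.0, "finance": 4.0, "other": 3.5},
    "PSP_B": {"conversion": 4.0, "reliability": 4.5, "finance": 3.5, "other": 4.0},
}

ranking = sorted(scorecards, key=lambda name: total_score(scorecards[name]), reverse=True)
for name in ranking:
    print(name, round(total_score(scorecards[name]), 2))
# PSP_B 4.05
# PSP_A 3.85
```

PSP_A wins on conversion alone, but the reliability weight moves PSP_B to the top, which is the kind of trade-off the scorecard is meant to make explicit.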
9) Portfolio KPIs
AR_net ↑, Capture_Success ↑.
Payout Success %, TtW p95 ↓, Refund TtR p95 ↓.
Cost/GGR ↓ (rail and overall).
Concentration Risk ↓ (max provider share).
Failover Time (median/p95), Incidents/Month, Service Credits/Month.
10) Data model (data mart for routing/evaluation)
ts_utc, country, provider, rail (card/a2a/rtp/wallet/voucher/crypto),
bin, issuer_country, device_os, ticket_bucket,
auth_attempted, auth_approved, captured_tx,
latency_auth_ms_p95, webhook_delivery_sec_p95,
fees_fixed, fee_pct, fx_spread_bps,
payout_attempted, payout_success, ttw_p95_sec,
settlement_date, settlement_on_time_flag
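For application code that reads this data mart, the field list above could be mirrored as a typed record. This is a sketch: the field names come from the list, while the Python types, the `ProviderMetricRow` name and the sample values are assumptions.

```python
from dataclasses import dataclass
from datetime import date, datetime
from typing import Optional

@dataclass
class ProviderMetricRow:
    ts_utc: datetime
    country: str
    provider: str
    rail: str                    # card/a2a/rtp/wallet/voucher/crypto
    bin: str
    issuer_country: str
    device_os: str
    ticket_bucket: str
    auth_attempted: int
    auth_approved: int
    captured_tx: int
    latency_auth_ms_p95: float
    webhook_delivery_sec_p95: float
    fees_fixed: float
    fee_pct: float
    fx_spread_bps: float
    payout_attempted: int
    payout_success: int
    ttw_p95_sec: float
    settlement_date: date
    settlement_on_time_flag: bool

    def ar_net(self) -> Optional[float]:
        """Captured transactions over attempted auths; None when no attempts."""
        if self.auth_attempted == 0:
            return None
        return self.captured_tx / self.auth_attempted

# Hypothetical sample row.
row = ProviderMetricRow(
    ts_utc=datetime(2024, 1, 1), country="DE", provider="PSP_A", rail="card",
    bin="411111", issuer_country="DE", device_os="ios", ticket_bucket="low",
    auth_attempted=1000, auth_approved=900, captured_tx=850,
    latency_auth_ms_p95=820.0, webhook_delivery_sec_p95=2.1,
    fees_fixed=0.10, fee_pct=0.021, fx_spread_bps=35.0,
    payout_attempted=200, payout_success=196, ttw_p95_sec=540.0,
    settlement_date=date(2024, 1, 3), settlement_on_time_flag=True,
)
print(row.ar_net())  # 0.85
```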
11) SQL slices (examples)
11.1 Scorecard by provider
```sql
-- Scorecard inputs: one row per provider and rail (PostgreSQL syntax).
WITH base AS (
  SELECT provider, rail,
         AVG(captured_tx::decimal / NULLIF(auth_attempted, 0)) AS ar_net,
         PERCENTILE_CONT(0.95) WITHIN GROUP (ORDER BY latency_auth_ms_p95) AS p95_latency,
         AVG(payout_success::decimal / NULLIF(payout_attempted, 0)) AS payout_succ,
         AVG(ttw_p95_sec) AS ttw_p95,
         AVG(settlement_on_time_flag::int) AS settle_on_time,
         -- Crude index: adds a fixed fee to a percentage fee, so compare only within a rail.
         AVG(fees_fixed + fee_pct) AS avg_cost_idx
  FROM provider_daily_metrics
  GROUP BY 1, 2
)
SELECT * FROM base ORDER BY rail, ar_net DESC;
```
11.2 A/B routing uplift (PSP_A→PSP_B)
```sql
-- AR per route and the uplift of route B over route A, by rail/country/BIN.
SELECT rail, country, bin,
       AVG(CASE WHEN route = 'A' THEN captured_tx::decimal / NULLIF(auth_attempted, 0) END) AS ar_A,
       AVG(CASE WHEN route = 'B' THEN captured_tx::decimal / NULLIF(auth_attempted, 0) END) AS ar_B,
       (AVG(CASE WHEN route = 'B' THEN captured_tx::decimal / NULLIF(auth_attempted, 0) END)
        - AVG(CASE WHEN route = 'A' THEN captured_tx::decimal / NULLIF(auth_attempted, 0) END)) AS uplift
FROM routing_experiments
GROUP BY 1, 2, 3
ORDER BY uplift DESC;
```
11.3 Concentration by provider
```sql
-- Daily share of captured amount per provider (window over the grouped sums).
SELECT date, provider,
       SUM(captured_amount) AS amt,
       SUM(SUM(captured_amount)) OVER (PARTITION BY date) AS amt_total,
       SUM(captured_amount)::decimal
         / NULLIF(SUM(SUM(captured_amount)) OVER (PARTITION BY date), 0) AS share
FROM provider_settled
GROUP BY 1, 2
ORDER BY date DESC, share DESC;
```
12) Playbooks
P0: AR drop on Cards (DE/FR BIN cluster)
Actions: fail over to Acquirer_B, raise the 3DS challenge rate for the BIN cluster, limit retries, enable an alternative-method hint.
P1: Wallet_X delayed payouts
Actions: route to Wallet_Y/RTP, top up the payout pool, prioritize VIPs, post a status message to players.
P1: Webhook flapping at PSP_A
Actions: switch to polling, freeze auto-refunds, tighten idempotency, reconcile against reports.
P2: Cost/GGR growth in A2A_B
Actions: move low-ticket traffic to A2A_C, request a discount/credit memo under the SLA, check FX/spreads.
13) Risks and how to control them
Concentration: max limit of turnover/balance share per counterparty (daily/weekly).
Operational: webhooks as a SPOF with no polling backup - run both.
Regulatory: local bans/limits - alternate rails by country.
Treasury: underfunding payout pools - rolling p95 + buffer.
FX/Cost: hidden fees/market impact - slippage monitoring.
Security: sanctions/AML - unified screening at entry and at payout.
14) Implementation: Roadmap
1. Audit of current rails and providers: metrics, incidents, cost.
2. RFP/contracts: target SLOs/service credits, reporting, sandbox/rollback.
3. Orchestrator/routing: rules, online signals, feature flags.
4. Treasury: prefund/StressRes limits, sweeps and FX policy.
5. Monitoring/dashboards: AR/Latency/Webhook/Settlement/Cost.
6. Failover drills: monthly (Cards/A2A/Wallet/Payout).
7. QBR with scorecard: reprioritize and rebalance traffic share.
15) UAT Case Pack
Failover ≤ 10 min: artificially take PSP_A down and verify that AR stays stable on PSP_B.
Idempotency: retries after a timeout → exactly 1 charge/1 refund.
Webhook outage: switch to polling without duplicates/losses.
Payout reroute: Wallet_X down → RTP/SEPA success p95 ≤ SLO.
Settlement mismatch: "Suspense" process and correct reconciliation.
Routing A/B: statistically significant uplift by BIN × GEO.
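For the Routing A/B case, "statistically significant" needs a concrete test. One common choice is a two-sided two-proportion z-test on approvals per route; the sketch below uses it as an assumption (the text does not name a test), and the sample counts are invented. It should be run per BIN × GEO segment, with a multiple-comparison correction when many segments are tested.

```python
import math

def two_proportion_z(approved_a, attempts_a, approved_b, attempts_b):
    """Return (uplift, z, two-sided p-value) for AR on route B vs route A."""
    p_a = approved_a / attempts_a
    p_b = approved_b / attempts_b
    pooled = (approved_a + approved_b) / (attempts_a + attempts_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / attempts_a + 1 / attempts_b))
    z = (p_b - p_a) / se
    # Two-sided p-value under the standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return p_b - p_a, z, p_value

# Hypothetical segment: 10,000 attempts per route.
uplift, z, p_value = two_proportion_z(8000, 10000, 8200, 10000)
print(f"uplift={uplift:.3f} z={z:.2f} significant={p_value < 0.05}")
```

The normal approximation is reasonable only when each route has enough attempts and approvals in the segment; small BIN clusters may need an exact test instead.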
16) Common mistakes
A single provider on a critical rail means no failover.
Routing "by gut feel" - without online signals and A/B checks.
No concentration limits or prefund - cash gaps on withdrawals.
Webhooks without a polling backup - lost/duplicate events.
Mixing metric bases (denominators) - incorrect conclusions about AR/cost.
No SLA/service credits - the provider has little incentive to fix issues.
Summary
Diversification is a portfolio strategy: a mix of rails and providers + smart routing + automatic failover + treasury discipline + strict SLAs. Such a setup increases conversion, reduces cost, provides resilience to incidents and regulatory shocks, and makes payment monetization predictable and manageable.