Network topology and routes
Brief Summary
The network is built around three pillars: topology, segmentation, and routing. The modern fabric is Leaf-Spine (fat-tree/Clos) with ECMP, a VXLAN/EVPN overlay for L2 extension, and BGP as a "universal glue." Properly specified latency/loss SLOs, QoS, and fast failover make behavior predictable under peak RPS.
Basic topology models
Core/Distribution/Access (Classic)
Pros: clear, good for small networks/offices.
Cons: bottleneck on Core, worse horizontal scaling.
Leaf-Spine (fat-tree, CLOS)
Spine — backbone; Leaf — top-of-rack (ToR) switches for servers.
Every Leaf is connected to every Spine → ECMP and predictable latency.
Scaling — add Leaf/Spine switches without reworking the address plan.
Ring/Mesh/Star
Used at points of presence (PoP) and campuses. Limited applicability in the DC.
Recommendation: for data centers and large sites — Leaf-Spine. For branches/offices — simplified Core/Access plus SD-WAN.
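To make the Leaf-Spine sizing concrete, here is a minimal sketch of the oversubscription math (the port counts and speeds are illustrative assumptions, not values from this document):

```python
# Oversubscription = server-facing bandwidth / spine-facing bandwidth per leaf.

def oversubscription(server_ports: int, server_speed_gbps: int,
                     uplinks: int, uplink_speed_gbps: int) -> float:
    """Ratio of a leaf's downlink capacity to its uplink capacity."""
    down = server_ports * server_speed_gbps
    up = uplinks * uplink_speed_gbps
    return down / up

# 48 x 25G server ports, 6 x 100G uplinks -> 1200/600 = 2:1
print(oversubscription(48, 25, 6, 100))  # 2.0
```

A 1:1 (non-blocking) design doubles the uplink budget; many fabrics accept 2:1 or 3:1 as a cost/performance trade-off.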
Segmentation and address space
VLAN — L2 segmentation (broadcast domains).
VRF — L3 segmentation (multi-tenancy, dev/stg/prod).
IPAM/summarization: plan in `/24` blocks per service/zone; aggregate to `/20` and shorter for simple routing policies.
Dual-stack: IPv4 + IPv6, SLAAC/DHCPv6, RA guard, prefix policies.
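The summarization advice above can be sketched with Python's standard `ipaddress` module (the 10.20.x.0 blocks are hypothetical example ranges):

```python
import ipaddress

# Sixteen contiguous per-service /24 blocks: 10.20.0.0/24 ... 10.20.15.0/24
blocks = [ipaddress.ip_network(f"10.20.{i}.0/24") for i in range(16)]

# Collapse them into the smallest covering set of aggregates.
aggregates = list(ipaddress.collapse_addresses(blocks))
print(aggregates)  # [IPv4Network('10.20.0.0/20')] - one /20 covers sixteen /24s
```

Announcing the single `/20` instead of sixteen `/24`s is exactly what keeps routing policies (and TCAM usage) simple.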
Overlay/Underlay: VXLAN/EVPN
Underlay: IP fabric (Leaf-Spine) with iBGP/OSPF/IS-IS.
Overlay: VXLAN carries L2 over L3; EVPN (BGP) is the control plane for MAC/IP routing, with multi-tenancy via VNI/VRF.
Advantages: L2 stretching without STP, fast convergence, centralized policies.
- Leaf — VTEP, with a loopback as the VTEP IP.
- Spine — route reflector for EVPN.
- EVPN route types (Type 2 MAC/IP, Type 3 IMET, Type 5 IP prefix) provide ARP suppression and scale.
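For intuition on what VXLAN actually adds on the wire, here is a sketch that packs the 8-byte VXLAN header from RFC 7348 (flags byte with the I-bit, then the 24-bit VNI):

```python
import struct

def vxlan_header(vni: int) -> bytes:
    """Build the 8-byte VXLAN header (RFC 7348): I-flag set, 24-bit VNI."""
    if not 0 <= vni < 2**24:
        raise ValueError("VNI is a 24-bit field")
    # Byte 0: flags (0x08 = valid-VNI bit), bytes 1-3 reserved,
    # bytes 4-6: VNI, byte 7 reserved.
    return struct.pack("!I", 0x08000000) + struct.pack("!I", vni << 8)

hdr = vxlan_header(10042)
print(hdr.hex())  # 0800000000273a00
```

This header sits inside an outer IP/UDP envelope, which is why VXLAN adds roughly 50 bytes of overhead on top of the inner Ethernet frame.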
Routing Protocols and Roles
IGP (within domain)
OSPF/IS-IS: fast convergence, simple metrics. Good for the underlay.
iBGP: on top of an IGP, or without one (BGP-only fabric), with route reflectors.
EGP (cross-domain)
eBGP: peering with providers/PSPs/CDNs; policy via communities/LocalPref/AS-Path.
Anycast: the same IP announced from several PoPs, routing "to the nearest" (BGP + health checks gating the announcements).
ECMP and fast failover
ECMP distributes flows across equal-cost paths.
Mind the flow hash (5-tuple); avoid asymmetry across stateful middleboxes.
BFD/fast hellos for fast switchover (<1 s).
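The per-flow behavior of ECMP can be sketched in a few lines: hashing the 5-tuple keeps every packet of one flow on one path (no reordering) while different flows spread across all equal-cost paths. The path names here are hypothetical:

```python
import hashlib

def ecmp_pick(paths, src_ip, dst_ip, proto, src_port, dst_port):
    """Deterministically map a 5-tuple to one of the equal-cost paths."""
    key = f"{src_ip}|{dst_ip}|{proto}|{src_port}|{dst_port}".encode()
    h = int.from_bytes(hashlib.sha256(key).digest()[:8], "big")
    return paths[h % len(paths)]

paths = ["spine1", "spine2", "spine3", "spine4"]
# The same flow always lands on the same path:
a = ecmp_pick(paths, "10.0.1.5", "10.0.2.9", 6, 44321, 443)
b = ecmp_pick(paths, "10.0.1.5", "10.0.2.9", 6, 44321, 443)
assert a == b
```

This is also why stateful middleboxes are a hazard: the return flow has a swapped 5-tuple, and unless the hash is symmetric (or state is shared), the reply can take a different device.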
Routing Policies (TE)
LocalPref/MED/AS-Path — uplink selection.
Communities — mark traffic (prod/stg, payment PSP, CDN) for differentiated decisions.
Blackhole/Sinkhole — fast /32 blackholing under attack.
uRPF/RTBH — anti-spoofing and remotely triggered blackholing at the provider.
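How LocalPref, AS-Path, and MED interact can be sketched as a simplified best-path comparison (real BGP has more tie-breakers; the uplink names and AS numbers are made up):

```python
from dataclasses import dataclass

@dataclass
class Route:
    via: str
    local_pref: int = 100
    as_path: tuple = ()
    med: int = 0

def best_path(routes):
    """First steps of BGP best-path selection: highest LocalPref,
    then shortest AS-path, then lowest MED (simplified)."""
    return min(routes, key=lambda r: (-r.local_pref, len(r.as_path), r.med))

routes = [
    Route(via="uplink-A", local_pref=200, as_path=(64512, 3356)),
    Route(via="uplink-B", local_pref=100, as_path=(64513,)),
]
print(best_path(routes).via)  # uplink-A wins on LocalPref despite a longer AS-path
```

This ordering is why "lower the LocalPref on the primary" (see the playbooks below in the document) is the standard lever for draining an uplink.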
Connectivity offices ↔ DC/Cloud
SD-WAN: dynamic path selection (MPLS/Internet/LTE), encryption, per-app policies.
MPLS L3VPN: isolated VRFs between sites, deterministic latency.
IPsec/GRE over IPsec/WireGuard: fast to start, but plan for MTU/fragmentation and QoS.
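The MTU planning mentioned above reduces to subtracting tunnel overhead from the link MTU. A sketch with typical IPv4 header sizes (vendor options such as GRE keys or NAT-T change these numbers):

```python
# Typical per-tunnel overhead in bytes over IPv4 (illustrative assumptions).
OVERHEAD = {
    "gre": 24,        # outer IPv4 (20) + GRE (4)
    "vxlan": 50,      # outer IPv4 (20) + UDP (8) + VXLAN (8) + inner Ethernet (14)
    "wireguard": 60,  # outer IPv4 (20) + UDP (8) + WireGuard data header (32)
}

def inner_mtu(link_mtu: int, tunnel: str) -> int:
    """Largest inner packet that fits without fragmentation."""
    return link_mtu - OVERHEAD[tunnel]

print(inner_mtu(1500, "vxlan"))  # 1450
print(inner_mtu(1500, "gre"))    # 1476
```

Either clamp TCP MSS at the tunnel edge or raise the underlay MTU (jumbo frames) so the inner 1500 still fits.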
NAT, CGNAT and Internet access
NAT44/NAT66 (rare) and NPTv6. For payment integrations, keep fixed source IP pools and whitelists.
Egress balancing: several NAT gateways behind ECMP, sticky by hash.
Hairpin/policy-based routing — for specific DMZ/inspection cases.
QoS and traffic classes
Classes: real-time (VoIP/exchange feeds), interactive (API), bulk (backups/ETL).
Marking (DSCP), policing/shaping, LLQ/WRR queuing.
API/payments — a dedicated class with guaranteed minimal latency; cap the bulk class during spikes.
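The class-to-marking step can be sketched as a DSCP table. The code points follow common RFC 4594 practice (EF for real-time, AF for assured classes); the class names are this document's, and the exact AF choices are assumptions:

```python
# Map the document's traffic classes to standard DSCP code points.
DSCP = {
    "real-time": 46,    # EF   - VoIP, exchange feeds
    "interactive": 26,  # AF31 - API traffic
    "bulk": 10,         # AF11 - backups/ETL
    "default": 0,       # BE
}

def tos_byte(cls: str) -> int:
    """DSCP occupies the upper 6 bits of the IP TOS/Traffic Class byte."""
    return DSCP[cls] << 2

print(hex(tos_byte("real-time")))  # 0xb8 - the classic EF TOS value
```

Consistent marking at the edge is what lets every hop apply LLQ/WRR to the same classes.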
Routing Security
BGP: TTL security, max-prefix, RPKI (route-origin validation), prefix-filters at the provider.
IGP: neighbor authentication (HMAC), management-plane isolation (OOB).
Segmentation: VRFs for the "payment," "operator," and "public" zones; ACLs between VRFs open only the required ports.
Anycast services: health checks → withdraw the announcement under degradation.
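Route-origin validation (RPKI) from the list above can be sketched as a lookup against ROAs, simplified from RFC 6811; the ROA data here is invented for illustration:

```python
import ipaddress

# (prefix, max_length, authorized origin AS) - hypothetical ROA set.
ROAS = [
    (ipaddress.ip_network("203.0.113.0/24"), 24, 64512),
]

def rov(prefix: str, origin_as: int) -> str:
    """Classify an announcement as valid / invalid / not-found."""
    net = ipaddress.ip_network(prefix)
    covered = False
    for roa_net, max_len, roa_as in ROAS:
        if net.subnet_of(roa_net):
            covered = True
            if origin_as == roa_as and net.prefixlen <= max_len:
                return "valid"
    return "invalid" if covered else "not-found"

print(rov("203.0.113.0/24", 64512))  # valid
print(rov("203.0.113.0/24", 64666))  # invalid (wrong origin -> possible hijack)
```

A typical policy drops "invalid" announcements and accepts "not-found," which is why prefix filters remain necessary alongside RPKI.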
Observability and SLO
SLO (examples)
Inside the data center: RTT p95 ≤ 200–300 µs, loss ≤ 0.01%.
Between sites (L3VPN/SD-WAN): RTT p95 ≤ X ms (per your profile), loss ≤ 0.1%.
Failover convergence: ≤ 1 s (IGP/BFD), ≤ 5 s (eBGP).
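Checking measured samples against the intra-DC SLO above is a percentile computation; a minimal nearest-rank sketch (the RTT samples are synthetic):

```python
def p95(samples):
    """Nearest-rank 95th percentile."""
    s = sorted(samples)
    idx = max(0, int(round(0.95 * len(s))) - 1)
    return s[idx]

# 20 synthetic intra-DC RTT samples in microseconds, one outlier.
rtt_us = [200 + i for i in range(19)] + [900]
print(p95(rtt_us), "us; SLO (<=300 us) met:", p95(rtt_us) <= 300)
```

Note that p95 deliberately tolerates a small fraction of outliers; the loss SLO (≤ 0.01%) has to be tracked separately, since a percentile on RTT never sees dropped packets.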
Metrics
`RTT`, `loss`, `jitter`, `ECMP entropy`, `BFD state`, `BGP prefixes/changes`, `CPU/TCAM` on switches, QoS queue occupancy.
Active probing: IP SLA/SmokePing, probes per QoS class.
Flow telemetry: sFlow/NetFlow/IPFIX for traffic profiles and DDoS.
Typical configs (fragments)
FRR (BGP underlay + EVPN)
```conf
router bgp 65000
 bgp router-id 10.0.0.1
 neighbor SPINE peer-group
 neighbor SPINE remote-as 65000
 neighbor 10.0.0.11 peer-group SPINE
 neighbor 10.0.0.12 peer-group SPINE
 !
 address-family l2vpn evpn
  neighbor SPINE activate
  advertise-all-vni
 exit-address-family
!
interface lo
 ip address 10.0.0.1/32
```
Linux (ECMP egress)
```bash
ip route add 0.0.0.0/0 \
  nexthop via 203.0.113.1 weight 1 \
  nexthop via 203.0.113.2 weight 1
```
BFD to neighbor (Cisco-style, concept)
```conf
interface Po1
 bfd interval 50 min_rx 50 multiplier 5
 bfd echo
 ip ospf network point-to-point
```
Operations and DR
Change control: phased rollout (one Leaf/Spine at a time), canary on a single VNI/VRF.
Auto-withdraw: when a service degrades, withdraw its Anycast /32 announcement.
Runbooks: Spine loss, EVPN loops, ECMP path failure, uplink degradation, blackhole insertion.
IPAM documentation: who owns each subnet/AS, where it is announced, where NAT happens.
Implementation checklist
- Leaf-Spine selected; oversubscription and fat-tree width calculated.
- IPAM: summarization, reserve for growth, dedicated blocks for loopbacks and overlay.
- Underlay: IGP/iBGP, BFD; overlay: EVPN/VXLAN, RRs on Spine.
- VRFs/ACLs per zone, east-west and north-south policies.
- Egress design: NAT pools, PSP/CDN whitelists, Anycast where needed.
- QoS classes and SLOs (RTT/loss/jitter), per-class monitoring.
- Protection: RPKI, prefix filters, uRPF, RTBH.
- Observability: BGP changes, BFD, IP SLA, sFlow; dashboards/alerts.
- DR plans: Spine/link/uplink failure, Anycast withdrawal, traffic migration.
Common errors
L2 stretch without EVPN/VXLAN → STP storms and unpredictable failover.
No BFD/fast hellos → long switchovers and application timeouts.
Manual IP plan without summarization → route-table explosion.
Poorly tuned ECMP hash → asymmetry and stateful-filter problems.
No RPKI/prefix filters on eBGP → hijack risk.
Default QoS → API traffic competes with backups.
Anycast without health-driven withdrawal → black holes during partial failures.
iGaming/fintech specifics
Low p95 for API/payments: a dedicated QoS class, Anycast endpoints, latency-based routing at DNS/GSLB.
PSP/provider whitelists: fixed egress IPs, redundant pools, fast failover.
Peak events: headroom ≥ 30% on Spine↔Leaf links, a lever to throttle the bulk class.
Regulatory/PII: VRF isolation, e2e encryption, strict ACLs between zones.
Mini playbooks
1) Rapid Anycast withdrawal on degradation
1. Health check fails → 2) withdraw the /32 announcement → 3) verify traffic shifts to healthy PoPs → 4) re-announce after recovery.
2) Shifting traffic to the backup uplink
1. Lower LocalPref on the primary → 2) raise it on the standby → 3) watch loss/RTT → 4) commit the changes.
3) "Hot" fabric expansion
1. Add a Spine, connect all Leafs → 2) add Leaf pairs in racks → 3) bring up iBGP/OSPF adjacencies, check ECMP entropy → 4) shift load.
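Playbook 1 can be sketched as a small state machine: withdraw the Anycast /32 after N consecutive failed health checks, re-announce after recovery. The thresholds are illustrative, and the announce/withdraw actions are logged placeholders (in practice they would drive ExaBGP, BIRD, or FRR):

```python
class AnycastGuard:
    """Hysteresis between health checks and the /32 announcement."""

    def __init__(self, fail_threshold=3, ok_threshold=2):
        self.fail_threshold, self.ok_threshold = fail_threshold, ok_threshold
        self.fails = self.oks = 0
        self.announced = True
        self.log = []

    def on_health(self, healthy: bool):
        if healthy:
            self.oks, self.fails = self.oks + 1, 0
            if not self.announced and self.oks >= self.ok_threshold:
                self.announced = True
                self.log.append("announce /32")  # placeholder for a BGP action
        else:
            self.fails, self.oks = self.fails + 1, 0
            if self.announced and self.fails >= self.fail_threshold:
                self.announced = False
                self.log.append("withdraw /32")  # placeholder for a BGP action

g = AnycastGuard()
for h in [True, False, False, False, True, True]:
    g.on_health(h)
print(g.log)  # ['withdraw /32', 'announce /32']
```

The two thresholds provide hysteresis so a flapping health check does not translate into BGP churn across all PoPs.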
Result
A stable network is Leaf-Spine + ECMP, EVPN/VXLAN for flexible L2/L3 multi-tenancy, and BGP policies with fast failover under metric control. Add competent IPAM, QoS, RPKI/filters, automated health-to-routing feedback, and living runbooks, and your platform will deliver traffic predictably even at peak hour.