VPC Peering and Routing
1) Why peering, and when it is appropriate
VPC/VNet peering connects two of a provider's private networks point-to-point, so traffic between them stays private (no Internet and no NAT between peers). Typical cases:
- separating environments and domains (prod/stage/dev) while keeping shared private connectivity;
- placing shared platforms (logging, KMS/Vault, artifacts) in a common network;
- giving applications access to managed PaaS over private paths (via hubs/endpoints).
When a hub is a better fit than peering: more than 10-20 networks, a need for transit routing, centralized egress, or inter-cloud connectivity → use Transit Gateway/Virtual WAN/Cloud Router.
2) Models and constraints
2.1 Types of peering
Intra-region peering - within one region; lowest latency and cost.
Inter-region peering - between regions; inter-region traffic is usually billed.
Cross-project/account peering - between different accounts/projects (with delegation).
2.2 Transit and NAT
Classic VPC/VNet peering is not transitive: peerings A↔B and B↔C do not give A↔C connectivity.
NAT through an intermediate network for transit is an anti-pattern (it hides the source IP and complicates auditing).
For transit, use a hub: AWS Transit Gateway (TGW), Azure Virtual WAN/Hub, GCP Cloud Router/HA VPN/Peering Router.
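The non-transitive rule can be illustrated with a toy model. This is a minimal sketch, not provider code: the network names A, B, C and the `can_reach` helper are hypothetical, and reachability is modelled purely as "a direct peering exists".

```python
# Hypothetical peering set: each frozenset is one peering connection.
PEERINGS = {frozenset({"A", "B"}), frozenset({"B", "C"})}

def can_reach(src, dst, peerings=PEERINGS):
    """Peering is non-transitive: only a direct peering gives reachability."""
    return frozenset({src, dst}) in peerings

print(can_reach("A", "B"))  # True: direct peering
print(can_reach("A", "C"))  # False: A↔B plus B↔C does not give A↔C
```

To connect A and C, either add a direct A↔C peering or attach all three networks to a hub.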
2.3 Overlapping CIDR
Peering does not support overlapping prefixes. If overlaps are unavoidable, use one of:
- re-addressing (the best option);
- NAT domains/a proxy VPC with one-way schemes (keeping auditing and logging in mind);
- for specific PaaS - PrivateLink/PSC without L3 access.
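Overlaps are easiest to catch before a peering is requested. A minimal sketch using Python's standard `ipaddress` module; the VPC names and CIDRs are made up for illustration, and all prefixes are assumed to be IPv4.

```python
import ipaddress
from itertools import combinations

def find_overlaps(cidrs):
    """Return pairs of network names whose prefixes overlap."""
    nets = {name: ipaddress.ip_network(cidr) for name, cidr in cidrs.items()}
    return [
        (a, b)
        for a, b in combinations(sorted(nets), 2)
        if nets[a].overlaps(nets[b])
    ]

# Hypothetical inventory: "legacy" sits inside the "prod" range.
vpcs = {"prod": "10.0.0.0/16", "stage": "10.1.0.0/16", "legacy": "10.0.128.0/20"}
print(find_overlaps(vpcs))  # [('legacy', 'prod')]
```

A check like this fits naturally into a CI step that runs before the IaC plan is applied.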
3) Addressing and route design
3.1 CIDR planning
A single supernet (for example, '10.0.0.0/8') → subdivide by 'region/env/vpc'.
Reserve ranges for future VPCs and growth buffers.
IPv6 - plan ahead: '/56' per VPC, '/64' per subnet.
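One way to sketch such a plan is to carve the supernet mechanically. The region names, the /12-per-region and /16-per-VPC sizes, and the `build_plan` helper are assumptions chosen for illustration, not a recommendation.

```python
import ipaddress

def build_plan(supernet, regions, envs, region_prefix=12, vpc_prefix=16):
    """Assign one block per region, then one VPC block per environment."""
    plan = {}
    region_blocks = ipaddress.ip_network(supernet).subnets(new_prefix=region_prefix)
    for region, block in zip(regions, region_blocks):
        vpc_blocks = block.subnets(new_prefix=vpc_prefix)
        for env, vpc in zip(envs, vpc_blocks):
            plan[(region, env)] = vpc
    return plan

plan = build_plan("10.0.0.0/8", ["eu-west", "us-east"], ["prod", "stage", "dev"])
for key, net in plan.items():
    print(key, net)
```

Blocks that are not assigned (the remaining /16s in each region and the unused /12s) act as the growth reserve.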
3.2 Routing
Route tables: explicit routes to the peer/hub in each VPC/subnet.
Priorities: the most specific prefix wins; avoid catch-all routes through peering.
Blackhole protection: flag and remove duplicate/obsolete routes.
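The "most specific prefix wins" rule is longest-prefix matching. A minimal sketch of the lookup, with a hypothetical route table and target names (`tgw-hub`, `pcx-a-b`, `nat-egress`); real provider route tables add their own rules, such as local routes, that are not modelled here.

```python
import ipaddress

def lookup(route_table, dst):
    """Return the target of the longest matching prefix, or None."""
    addr = ipaddress.ip_address(dst)
    matches = []
    for cidr, target in route_table:
        net = ipaddress.ip_network(cidr)
        if addr in net:
            matches.append((net.prefixlen, target))
    if not matches:
        return None
    return max(matches)[1]  # highest prefix length wins

# Hypothetical route table of (destination, target) pairs.
routes = [
    ("10.0.0.0/8", "tgw-hub"),
    ("10.1.0.0/16", "pcx-a-b"),
    ("0.0.0.0/0", "nat-egress"),
]
print(lookup(routes, "10.1.2.3"))  # pcx-a-b
print(lookup(routes, "10.9.9.9"))  # tgw-hub
print(lookup(routes, "8.8.8.8"))   # nat-egress
```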
3.3 Domains and roles
Spoke (applications) ↔ Hub (common services, egress, inspection).
Peerings only spoke↔hub; spoke↔spoke traffic goes through the hub (segmentation and control).
4) Topology patterns
4.1 "Simple" mesh (≤5 VPC)
Direct point-to-point peerings (A↔B, A↔C...). Pros: minimum components; cons: O(N²) links and rules.
4.2 Hub-and-Spoke
Every spoke peers with the hub VPC/VNet; the hub hosts TGW/Virtual WAN/Cloud Router, NAT/egress, and inspection. Scalable and easy to manage.
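The difference in growth between the two topologies is easy to quantify. A small sketch, assuming N application networks and counting the hub separately (so hub-and-spoke needs one peering per spoke):

```python
def mesh_links(n):
    """Full mesh: every pair of networks needs its own peering."""
    return n * (n - 1) // 2

def hub_links(n):
    """Hub-and-spoke: one peering (or attachment) per spoke."""
    return n

for n in (5, 20, 50):
    print(n, mesh_links(n), hub_links(n))
# 5 10 5
# 20 190 20
# 50 1225 50
```

Each mesh link also needs routes and security rules on both sides, so the operational cost grows at least as fast as the link count.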
4.3 Multi-region
Local hubs in each region; between hubs - inter-region peering or backbone (TGW-to-TGW/VWAN-to-VWAN).
5) Security and segmentation
Stateful filtering at the host: SG/NSG is the main barrier; NACLs/subnet ACLs serve as a coarse guard and deny list.
L7 policies in mesh/proxy (Istio/Envoy/NGINX) - authorization by mTLS/JWT/claims.
Egress control: spoke should not "see" the Internet directly - only through the egress gateway/PrivateLink.
Flow Logs and Hub Inspection (GWLB, IDS/IPS) for inter-VPC traffic.
6) DNS and split-horizon
Associate each private zone with the VPCs that need it (Private Hosted Zones/Private DNS Zones).
For PaaS via PrivateLink/PSC - private records pointing to the endpoints' private IPs.
Conditional forwarding between on-prem ↔ cloud and region ↔ region.
Naming: 'svc.env.region.internal.corp', without PII; keep TTLs short (30-120s) to support failover.
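The naming and TTL conventions can be checked automatically. A minimal sketch: the regular expression, the allowed environment names (prod/stage/dev), and the `check_record` helper are assumptions for illustration, not an established schema.

```python
import re

# Assumed pattern: <svc>.<env>.<region>.internal.corp
NAME_RE = re.compile(
    r"^(?P<svc>[a-z0-9-]+)\.(?P<env>prod|stage|dev)\.(?P<region>[a-z0-9-]+)"
    r"\.internal\.corp$"
)

def check_record(name, ttl):
    """Return a list of convention violations for one DNS record."""
    problems = []
    if not NAME_RE.match(name):
        problems.append("name does not follow svc.env.region.internal.corp")
    if not 30 <= ttl <= 120:
        problems.append("TTL outside the 30-120s range")
    return problems

print(check_record("payments.prod.eu-west-1.internal.corp", 60))  # []
print(check_record("john.doe.internal.corp", 3600))  # two problems
```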
7) Observability and testing
Metrics: accepted/denied on SG/NSG, bytes per peer, RTT/jitter between regions, top-talkers.
Logs: VPC Flow Logs/NSG Flow Logs shipped to a SIEM; traces carrying a 'trace_id' for L7↔L3 correlation.
Reachability tests: synthetic checks on TCP/443 and DB ports from different subnets/AZs/regions; reachability analyzer.
Network chaos testing: inject delay/loss between peer and hub; verify timeouts, retries, and idempotency.
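A synthetic reachability check can be as small as a timed TCP connect. This is a sketch using only the standard library; the `tcp_probe` name and the 2-second default timeout are assumptions, and a real probe would be run from several subnets/AZs and export its results as metrics.

```python
import socket
import time

def tcp_probe(host, port, timeout=2.0):
    """Try a TCP connect; return (reachable, connect_time_ms or None)."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True, (time.monotonic() - start) * 1000.0
    except OSError:
        # Covers refused connections, timeouts, and unreachable networks.
        return False, None
```

For example, `tcp_probe("db.prod.eu-west-1.internal.corp", 5432)` (a hypothetical name) would report whether the database port is reachable over the peering and how long the handshake took. A timeout usually points to a silent drop (routes, SG/NACL), while an immediate refusal means the path works but nothing is listening.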
8) Performance and cost
Inter-region traffic is almost always billed; estimate egress in advance (logs and backups make it more expensive).
MTU/PMTUD: the provider's standard MTU applies inside its network, but at boundaries (VPN, FW, NAT-T) consider MSS clamping.
Scale inspection out (GWLB/scale sets) so it does not become a bottleneck; use ECMP for hubs.
Caching/edge delivery and stale-while-revalidate (SWR) reduce inter-region traffic.
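For the MSS clamping point above, the value to clamp to follows from the path MTU minus header overhead. A minimal sketch assuming fixed 20-byte TCP and 20/40-byte IPv4/IPv6 headers without options; the tunnel overhead is a parameter because it depends on the encapsulation and cipher in use, and the 80 bytes in the example is only an illustrative figure.

```python
def clamp_mss(path_mtu, tunnel_overhead=0, ipv6=False):
    """MSS = path MTU - tunnel overhead - IP header - TCP header."""
    ip_header = 40 if ipv6 else 20
    tcp_header = 20
    return path_mtu - tunnel_overhead - ip_header - tcp_header

print(clamp_mss(1500))                      # 1460
print(clamp_mss(1500, tunnel_overhead=80))  # 1380 (assumed tunnel overhead)
print(clamp_mss(1500, ipv6=True))           # 1440
```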
9) Cloud features and examples
9.1 AWS (VPC Peering / Transit Gateway)
VPC Peering: create peering connection, add routes in subnet tables.
There is no transit through regular peering. For transit and centralized model - Transit Gateway.
```hcl
# Peering request from VPC A to VPC B (B may live in another account).
resource "aws_vpc_peering_connection" "a_b" {
  vpc_id        = aws_vpc.a.id
  peer_vpc_id   = aws_vpc.b.id
  peer_owner_id = var.peer_account_id
  auto_accept   = false # cross-account peering is accepted on the peer side
  tags          = { Name = "a-b", env = var.env }
}

# Route from A's route table to B's CIDR via the peering connection.
resource "aws_route" "a_to_b" {
  route_table_id            = aws_route_table.a_rt.id
  destination_cidr_block    = aws_vpc.b.cidr_block
  vpc_peering_connection_id = aws_vpc_peering_connection.a_b.id
}
```
9.2 Azure (VNet Peering / Virtual WAN)
VNet Peering (including global): use the "Allow forwarded traffic" and "Use remote gateway" flags for hub schemes.
For hubs and transit - Virtual WAN/Hub with Route Tables and Policies.
```bash
# Peer spokeA with the hub and accept traffic forwarded by the hub.
az network vnet peering create \
  --name spokeA-to-hub --vnet-name spokeA --remote-vnet hub \
  --resource-group rg --allow-vnet-access --allow-forwarded-traffic
```
9.3 GCP (VPC Peering / Cloud Router)
VPC Peering has no transit; for a hub, use Cloud Router + HA VPN/Peering Router.
Hierarchical firewall policies for org-level guardrails.
10) Kubernetes in peered networks
Cluster in spoke, common services (logging/storage/artifacts) - in hub; access to private addresses.
NetworkPolicy "deny-all" by default, with explicit egress to the hub/PrivateLink.
Do not route Pod CIDRs between VPCs; route the Node CIDR and expose services via Ingress/Gateway.
11) Troubleshooting (cheat sheet)
1. Do the CIDRs overlap? Check supernets and old subnets.
2. Route tables: is there a path in both directions? Is a more specific route intercepting the traffic?
3. SG/NSG/NACL: do the inbound and outbound rules match? Is a subnet ACL blocking return traffic?
4. DNS: are the private records/forwarders correct? Check 'dig +short' from both networks.
5. MTU/MSS/PMTUD: is there fragmentation or silent timeouts?
6. Flow logs: do you see SYN/SYN-ACK/ACK? Which side drops?
7. Inter-region: check peering quotas/limits, organization policies, and routing tags.
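For the flow-log step, a small filter over exported records helps answer "who drops?". A sketch that assumes the AWS VPC Flow Logs default (version 2) space-separated format; the sample lines, account ID, and interface IDs are invented for illustration.

```python
# Field order of the AWS VPC Flow Logs default (version 2) format.
FIELDS = (
    "version account_id interface_id srcaddr dstaddr srcport dstport "
    "protocol packets bytes start end action log_status"
).split()

def parse(line):
    """Split one flow-log line into a dict; None if the shape is unexpected."""
    parts = line.split()
    if len(parts) != len(FIELDS):
        return None
    return dict(zip(FIELDS, parts))

def rejected_flows(lines, src=None, dst=None):
    """Return REJECT records, optionally filtered by source/destination IP."""
    result = []
    for line in lines:
        rec = parse(line)
        if rec is None or rec["action"] != "REJECT":
            continue
        if src and rec["srcaddr"] != src:
            continue
        if dst and rec["dstaddr"] != dst:
            continue
        result.append(rec)
    return result

# Invented sample records.
sample = [
    "2 123456789010 eni-0aaa 10.0.1.10 10.1.2.20 49152 443 6 10 840 1700000000 1700000060 ACCEPT OK",
    "2 123456789010 eni-0bbb 10.0.1.10 10.1.2.20 49153 5432 6 3 180 1700000000 1700000060 REJECT OK",
]
for rec in rejected_flows(sample, dst="10.1.2.20"):
    print(rec["interface_id"], rec["dstport"])  # eni-0bbb 5432
```

The interface that logs the REJECT shows on which side the SG/NACL drops the traffic.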
12) Antipatterns
An ad hoc mesh of dozens of peerings without a hub → an explosion of complexity and gaps in ACLs.
Overlapping CIDRs "papered over somehow with NAT" → auditing and end-to-end identification break.
Public egress in each spoke → uncontrolled surface and cost.
Lack of split-horizon DNS → name leaks/broken resolutions.
Broad '0.0.0.0/0' routes over a peer → unexpected traffic asymmetry.
Manual edits in the console without IaC and review.
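Some of these anti-patterns can be caught by a simple lint over exported route tables. A minimal sketch: it assumes routes are given as (destination, target) pairs, that peering targets use the AWS-style `pcx-` ID prefix, and that anything broader than a /8 over a peering is suspicious; the threshold and the `risky_peer_routes` helper are illustrative choices.

```python
import ipaddress

def risky_peer_routes(routes, min_prefixlen=8):
    """Flag default or very broad routes that point at a peering connection."""
    flagged = []
    for cidr, target in routes:
        net = ipaddress.ip_network(cidr)
        if target.startswith("pcx-") and net.prefixlen < min_prefixlen:
            flagged.append((cidr, target))
    return flagged

# Hypothetical route table export.
table = [
    ("0.0.0.0/0", "pcx-0abc"),    # catch-all over a peer: flagged
    ("10.1.0.0/16", "pcx-0abc"),  # specific prefix: fine
    ("0.0.0.0/0", "nat-0def"),    # default route via NAT: not a peering
]
print(risky_peer_routes(table))  # [('0.0.0.0/0', 'pcx-0abc')]
```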
13) Specifics of iGaming/Finance
PCI CDE and payment segments - only through the hub with inspection; no spoke↔spoke bypass.
Data residency: PII/transaction logs stay within their jurisdiction; only aggregates/anonymized data cross regions.
Multi-PSP: PrivateLink/private channels to each PSP, a centralized egress proxy with an FQDN allowlist and mTLS/HMAC.
Audit/WORM: flow logs and route changes go to immutable storage, with retention per the applicable standards.
SLO breakdowns: per region/VPC/tenant; alerts on "egress leakage" and degradation of inter-region RTT.
14) Prod Readiness Checklist
- Non-overlapping CIDR plan (IPv4/IPv6), growth pools reserved.
- Hub-and-spoke topology; peerings only spoke↔hub; transit via TGW/VWAN/Cloud Router.
- Route tables: explicit paths, no catch-all via peer, blackhole control.
- SG/NSG/NACL applied; L7 policies in mesh; egress only through the hub/PrivateLink.
- Private DNS/PHZ configured; conditional forwarders between on-prem/cloud/regions.
- Flow Logs enabled; dashboards by peer/region; reachability synthetics and PMTUD tests.
- IaC (Terraform/CLI) and Policy-as-Code (OPA/Conftest) for rules/routes/DNS.
- Documented runbooks (add peer, roll out routes, disable spoke).
- Exercises: disabling the hub/a peering, measuring the actual RTO/RPO of network paths.
- For iGaming/Finance: PCI isolation, PrivateLink to PSP, WORM audit, SLO/alerts by jurisdiction.
15) TL;DR
Use VPC/VNet peering for simple point-to-point private connectivity, but don't rely on it for transit: transit needs a hub (TGW/VWAN/Cloud Router). Plan CIDRs without overlaps, keep routes explicit and specific, apply stateful SG/NSG and L7 policies in the mesh, and use split-horizon DNS. Enable flow logs, synthetics, and PMTUD checks. For iGaming/finance - PCI isolation, private channels to PSPs, and immutable audit.