VPN tunnels and channel encryption
Brief Summary
A VPN (Virtual Private Network) is a set of technologies for building a secure channel on top of an insecure network (usually the Internet). Its key objectives are confidentiality (encryption), integrity (message authentication), authenticity (mutual authentication of nodes/users), and availability (resilience to failures and blocking). In a corporate infrastructure, VPNs cover site-to-site, remote access, cloud connectivity, and machine-to-machine scenarios. Modern practice is to minimize "flat" L3 networks, apply segmentation and the principle of least privilege, and move gradually toward Zero Trust.
Basic concepts
Tunneling - encapsulation of one protocol's packets inside another (for example, IP inside UDP), which lets you carry a private address plan and policies across a public network.
Encryption - protection of traffic content (AES-GCM, ChaCha20-Poly1305).
Authentication - verification of the identity of nodes/users (X.509 certificates, PSK, SSH keys).
Integrity - protection against tampering in transit (HMAC, AEAD).
PFS (Perfect Forward Secrecy) - session keys are not derived from long-term keys, so compromising a long-term key does not expose past sessions.
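The integrity concept above can be illustrated with a minimal sketch using HMAC-SHA256 from the Python standard library. The key and packet contents are hypothetical, and real VPN protocols typically use an AEAD cipher that combines encryption and authentication; this shows only the integrity half.

```python
import hashlib
import hmac


def sign(key: bytes, message: bytes) -> bytes:
    """Return an HMAC-SHA256 tag for the message."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify(key: bytes, message: bytes, tag: bytes) -> bool:
    """Recompute the tag and compare in constant time."""
    return hmac.compare_digest(sign(key, message), tag)


key = b"demo-key-not-for-production"  # hypothetical shared key
packet = b"payload: transfer 100"
tag = sign(key, packet)

print(verify(key, packet, tag))                    # True: packet untouched
print(verify(key, b"payload: transfer 900", tag))  # False: modified in transit
```

A receiver that drops every packet failing `verify` cannot be fed altered traffic by someone who does not hold the key.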
Typical scenarios
1. Site-to-Site (L3): office ↔ data center/cloud; typically IPsec/IKEv2 with static or dynamic routing.
2. Remote Access (User-to-Site): employees from laptops/mobiles; OpenVPN/WireGuard/IKEv2, MFA, split/full-tunnel.
3. Hub-and-Spoke: all branches to the central hub (on-prem or Cloud Transit).
4. Mesh: fully connected network of branches/micro data centers (dynamic routing + IPsec).
5. Cloud-to-Cloud: inter-cloud links (IPsec tunnels, Cloud VPN/Transit Gateway, SD-WAN).
6. Service-to-Service: machine connections between clusters/namespaces (WireGuard, IPsec in CNI/SD-WAN, mTLS at the service level).
VPN protocols and where they are strong
IPsec (ESP/IKEv2) - Site-to-Site Gold Standard
Layers: IKEv2 (key exchange), ESP (traffic encryption/authentication).
Modes: tunnel (usually), transport (rarely, host-to-host).
Pros: hardware offload, maturity, cross-vendor compatibility, well suited to backbone links and cloud gateways.
Cons: configuration complexity, sensitivity to NAT (solved by NAT-T over UDP port 4500), more ceremony when aligning policies between peers.
Usage: branch offices, data centers, clouds, high performance requirements.
OpenVPN (TLS 1.2/1.3)
Layers: L4/L7, traffic over UDP or TCP; over UDP the TLS control channel runs on OpenVPN's own reliability layer (a DTLS-like scheme).
Pros: flexible, traverses NAT and DPI well when disguised as HTTPS (TCP/443), rich ecosystem.
Cons: higher overhead than IPsec/WireGuard; needs a carefully chosen crypto configuration.
Use: remote access, mixed environments, cases where getting through restrictive networks matters.
WireGuard (Noise IK)
Layers: L3 over UDP; minimalistic code base, modern crypto primitives (Curve25519, ChaCha20-Poly1305).
Pros: high performance (especially on mobiles/ARM), simplicity of configs, fast roaming.
Cons: no built-in PKI; key and identity management needs separate processes built around it.
Use: remote access, inter-cluster connectivity, S2S in the modern stack, DevOps.
SSH tunnels (L7)
Types: Local/Remote/Dynamic (SOCKS).
Pros: a handy tool for targeted access to a single service or admin panel.
Cons: does not scale as a corporate VPN; key management and auditing are harder.
Use: targeted access to services, a "periscope" into a closed network, jump-host.
GRE/L2TP/… (encapsulation without encryption)
Purpose: creates an L2/L3 tunnel but does not encrypt it. Typically combined with IPsec (L2TP over IPsec, GRE over IPsec).
Usage: rare cases when the L2 nature of the channel is needed (old protocols/isolated VLANs over L3).
Cryptography and Settings
Ciphers: AES-GCM-128/256 (hardware acceleration, AES-NI), ChaCha20-Poly1305 (mobile/without AES-NI).
Key exchange/groups: ECDH (Curve25519, secp256r1), DH groups ≥ 2048 bits; enable PFS.
Signatures/PKI: ECDSA/Ed25519 preferred; automate issuance/rotation, use OCSP/CRL.
Key lifetimes: short IKE SA/Child SA lifetimes with regular rekeying (e.g. every 8-24 h, triggered by time or traffic volume).
MFA: for user VPNs - TOTP/WebAuthn/Push.
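As an illustration of the TOTP factor mentioned above, here is a minimal sketch of RFC 4226 (HOTP) and RFC 6238 (TOTP) using only the standard library. The helper names and the drift window are assumptions for illustration; a production deployment would normally rely on an identity provider or a vetted library rather than hand-rolled code.

```python
import hashlib
import hmac
import struct
import time
from typing import Optional


def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """RFC 4226 HOTP: HMAC-SHA1 over an 8-byte big-endian counter."""
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F  # low 4 bits of the last byte select the window
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def totp(secret: bytes, now: Optional[float] = None,
         step: int = 30, digits: int = 6) -> str:
    """RFC 6238 TOTP: HOTP where the counter is the current time step."""
    if now is None:
        now = time.time()
    return hotp(secret, int(now // step), digits)


def verify_totp(secret: bytes, candidate: str,
                now: Optional[float] = None, window: int = 1) -> bool:
    """Accept codes from adjacent 30-second steps to tolerate clock drift."""
    if now is None:
        now = time.time()
    return any(
        hmac.compare_digest(totp(secret, now + drift * 30), candidate)
        for drift in range(-window, window + 1)
    )


secret = b"12345678901234567890"  # RFC test secret, never use in production
print(totp(secret, now=59, digits=8))  # RFC 6238 test vector: 94287082
```

The VPN gateway would call something like `verify_totp` after the certificate or password check, so a stolen credential alone is not enough to bring up a tunnel.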
Performance and reliability
MTU/MSS: configure PMTU correctly (typically an MTU of 1380-1420 for UDP tunnels) and clamp MSS on edge nodes.
DPD/MOBIKE/Keepalive: prompt detection of dead peers and seamless roaming (IKEv2 MOBIKE, WireGuard PersistentKeepalive).
Routing: ECMP/Multipath, BGP over tunnels for dynamics.
Offload: hardware crypto accelerators, SmartNIC/DPU, Linux kernel (xfrm, WireGuard kernel).
Bypassing blocking: changing ports/transports, obfuscating the handshake (where legally permissible).
QoS: traffic classification and priority, jitter control for real-time flows.
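The MTU/MSS item above comes down to simple arithmetic. The sketch below works it through for WireGuard, whose data packets add a 16-byte header and a 16-byte Poly1305 tag on top of the outer IP and UDP headers. The function names are illustrative, and IPsec overhead differs (it depends on the cipher, padding, and NAT-T), so treat the numbers as WireGuard-specific.

```python
# Per-packet overhead in bytes added by the tunnel.
IP_HEADER = {"ipv4": 20, "ipv6": 40}
UDP_HEADER = 8
TCP_HEADER = 20           # without TCP options
WIREGUARD_OVERHEAD = 32   # 16-byte data header + 16-byte Poly1305 tag


def tunnel_mtu(link_mtu: int, outer: str = "ipv4") -> int:
    """Largest inner packet that fits in one outer packet without fragmentation."""
    return link_mtu - IP_HEADER[outer] - UDP_HEADER - WIREGUARD_OVERHEAD


def tcp_mss(mtu: int, inner: str = "ipv4") -> int:
    """MSS to clamp to: tunnel MTU minus inner IP and TCP headers."""
    return mtu - IP_HEADER[inner] - TCP_HEADER


print(tunnel_mtu(1500, "ipv4"))  # 1440
print(tunnel_mtu(1500, "ipv6"))  # 1420
print(tcp_mss(1420, "ipv4"))     # 1380
```

Choosing 1420 covers both IPv4 and IPv6 underlays on a 1500-byte link, which is consistent with the 1380-1420 range quoted above.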
Topologies and design
Full-tunnel vs split-tunnel:
- Full: all traffic goes through the VPN (more control and security, but higher load).
- Split: only the required subnets go through the VPN (saves bandwidth and latency, but raises the protection requirements for traffic that bypasses the tunnel).
Segmentation: separate tunnels/VRFs/policies per environment (Prod/Stage), data domain (PII/financial), and provider.
Clouds: Cloud VPN/Transit Gateways (AWS/GCP/Azure), IPsec S2S, routing through a centralized transit hub.
SD-WAN/SASE: overlays with automatic path selection, built-in telemetry and security policies.
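The difference between full-tunnel and split-tunnel is, in effect, which destination prefixes are routed into the tunnel. A minimal sketch, assuming the route lists below (the split list mirrors the sample client configuration later in this article; `via_tunnel` is a hypothetical helper):

```python
import ipaddress
from typing import Sequence

SPLIT = ["10.20.0.0/24", "10.10.0.0/16"]  # only corporate subnets
FULL = ["0.0.0.0/0", "::/0"]              # everything, IPv4 and IPv6


def via_tunnel(destination: str, allowed_ips: Sequence[str]) -> bool:
    """True if the destination falls inside any prefix routed into the VPN."""
    addr = ipaddress.ip_address(destination)
    return any(addr in ipaddress.ip_network(net) for net in allowed_ips)


print(via_tunnel("10.10.5.7", SPLIT))  # True: corporate subnet
print(via_tunnel("8.8.8.8", SPLIT))    # False: goes out directly
print(via_tunnel("8.8.8.8", FULL))     # True: everything is tunneled
```

Anything that returns `False` under the split profile leaves the device outside the tunnel, which is exactly the traffic that needs its own protection.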
Channel and Environment Security
Firewall/ACL: explicit allow-lists by port/subnet, deny by default.
DNS security: forced corporate DNS through the tunnel, protection against leaks (IPv6, WebRTC).
Client policies: kill-switch (block traffic when the tunnel drops), no split-DNS where compliance requires it.
Logs and Audits: centralize logs of handshakes, authentication, rekeys, and rejected SAs.
Secrets: HSM/vendor KMS, rotation, PSK minimization (preferably certificates or WG keys).
Devices: compliance check (OS, patches, disk encryption, EDR), NAC/MDM.
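The deny-by-default allow-list from the Firewall/ACL item can be sketched as follows. The rules, subnets, and ports are hypothetical; a real deployment would express the same logic in the firewall or VPN gateway policy rather than in application code.

```python
import ipaddress
from typing import NamedTuple, Sequence


class Rule(NamedTuple):
    source: str        # source subnet (CIDR)
    destination: str   # destination subnet (CIDR)
    port: int          # allowed destination port


# Hypothetical allow-list: VPN clients may reach HTTPS and PostgreSQL only.
ALLOW = [
    Rule("10.20.0.0/24", "10.10.1.0/24", 443),
    Rule("10.20.0.0/24", "10.10.2.0/24", 5432),
]


def is_allowed(src: str, dst: str, port: int,
               rules: Sequence[Rule] = ALLOW) -> bool:
    """Permit only flows that match an explicit rule; everything else is denied."""
    s, d = ipaddress.ip_address(src), ipaddress.ip_address(dst)
    for rule in rules:
        if (s in ipaddress.ip_network(rule.source)
                and d in ipaddress.ip_network(rule.destination)
                and port == rule.port):
            return True
    return False  # deny by default


print(is_allowed("10.20.0.10", "10.10.1.5", 443))  # True
print(is_allowed("10.20.0.10", "10.10.1.5", 22))   # False
```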
Observability, SLO/SLA and alerting
Key metrics:
- Tunnel availability (% uptime).
- Latency, jitter, packet loss on key routes.
- Bandwidth (p95/p99), CPU/IRQ load on crypto nodes.
- Rate of rekey/DPD events, authentication failures.
- Fragmentation/PMTU errors.
SLO examples:
- "VPN hub availability ≥ 99.95% per month."
- "p95 latency between DC-A and DC-B ≤ 35 ms."
- "< 0.1% failed IKE SAs per hour."
Alerts:
- Tunnel down > X sec; DPD surge; growth in handshake errors; p95 degradation beyond threshold; CRL/OCSP errors.
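To make availability and p95 targets like these tangible, the sketch below converts an availability percentage into a monthly downtime budget and computes a nearest-rank p95 over latency samples. The 30-day month and the nearest-rank method are assumptions; monitoring systems may define both differently.

```python
import math
from typing import Sequence


def downtime_budget_minutes(slo_percent: float, days: int = 30) -> float:
    """Minutes of allowed downtime in the period for a given availability SLO."""
    total_minutes = days * 24 * 60
    return total_minutes * (100 - slo_percent) / 100


def p95(samples: Sequence[float]) -> float:
    """Nearest-rank 95th percentile of the samples."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = math.ceil(0.95 * len(ordered))
    return ordered[rank - 1]


print(round(downtime_budget_minutes(99.95), 1))  # 21.6 minutes per 30 days
print(p95(list(range(1, 101))))                  # 95
```

A 99.95% target therefore leaves a little over 21 minutes per month for all hub outages combined, which is a useful sanity check when sizing maintenance windows.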
Operations and Life Cycle
PKI/certificates: automatic issuance/renewal, short TTL, immediate revocation on compromise.
Key rotation: regular, with phased switching of peers.
Changes: change plans with rollback (old/new SA in parallel), maintenance windows.
Break-glass: spare accounts/keys, documented manual access via jump-host.
Incidents: in case of suspicion of compromise - revocation of certificates, PSK rotation, force-rekey, change of ports/addresses, audit of logs.
Compliance and Legal
GDPR/PII: encryption in transit (in practice a baseline requirement), access minimization, segmentation.
PCI DSS: strong ciphers, MFA, access logs, segmentation of the cardholder data environment.
Local traffic/crypto restrictions: comply with jurisdictional requirements (export of crypto, DPI, blocking).
Logs: storage according to policy (retention, integrity, access).
Zero Trust, SDP/ZTNA vs classic VPN
Classic VPN: grants network-level access (often broad).
ZTNA/SDP: grants access to a specific application/service after contextual verification (identity, device status, risk).
Hybrid model: keep VPN for backbone/S2S links and give users ZTNA access to the specific applications they need; gradually retire "flat" networks.
How to choose a protocol (short matrix)
Between branches/clouds: IPsec/IKEv2.
Remote access for users: WireGuard (if you need a light and fast client) or OpenVPN/IKEv2 (if you need mature PKI/policies).
High traversal through proxies/DPI: OpenVPN over TCP/443 (keeping the overhead in mind) or obfuscation (where allowed).
Mobile/roaming: WireGuard or MOBIKE IKEv2.
L2 over L3: GRE/L2TP with IPsec (encryption required).
Implementation checklist
1. Define access domains (Prod/Stage/Back-office) and apply the principle of least privilege.
2. Select the protocol/topology (hub-and-spoke vs mesh), plan addressing and routing.
3. Approve crypto profile (AES-GCM/ChaCha20, ECDH, PFS, short TTL).
4. Set up PKI, MFA, and certificate validity and issuance policies.
5. Configure MTU/MSS, DPD/MOBIKE, keepalive.
6. Enable logging, dashboards, SLO metrics and alerts.
7. Carry out load/failover testing (hub failure, rekey bursts, link change).
8. Document break-glass and rotation procedure.
9. Conduct user onboarding and training (clients, policies).
10. Regularly review access and audit reports.
Common mistakes and how to avoid them
L2TP/GRE without IPsec: no encryption → always add IPsec.
Incorrect MTU: fragmentation/drops → configure MSS-clamp, check PMTU.
PSK "forever": stale keys → rotate them, move to certificates/Ed25519.
Wide networks in split-tunnel: traffic leaks → explicit routes/policies, DNS only via VPN.
Single "super hub" without redundancy: SPOF → active-active, ECMP, several regions.
No handshake monitoring: "silent" failures → DPD/alerts/dashboards.
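The "wide networks in split-tunnel" mistake can be caught with a simple lint over the client configuration. The sketch below is an assumption-laden illustration: `wide_routes` is a hypothetical helper, the /16 threshold is arbitrary and meant for IPv4, and it handles a configuration with a single `[Peer]` section only (Python's `configparser` rejects duplicate section names by default).

```python
import configparser
import ipaddress
from typing import List


def wide_routes(conf_text: str, min_prefix: int = 16) -> List[str]:
    """Return AllowedIPs entries broader than the given prefix length."""
    parser = configparser.ConfigParser(delimiters=("=",), interpolation=None)
    parser.read_string(conf_text)
    allowed = parser["Peer"]["AllowedIPs"]
    networks = [ipaddress.ip_network(item.strip()) for item in allowed.split(",")]
    return [str(net) for net in networks if net.prefixlen < min_prefix]


CLIENT_CONF = """
[Interface]
Address = 10.20.0.10/32

[Peer]
Endpoint = vpn.example.com:51820
AllowedIPs = 10.20.0.0/24, 0.0.0.0/0
"""

print(wide_routes(CLIENT_CONF))  # ['0.0.0.0/0']
```

Running such a check in CI before client profiles are distributed keeps a split-tunnel profile from silently turning into a full tunnel, or the reverse.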
Sample Configurations
WireGuard (Linux) - `wg0.conf`
ini
[Interface]
Address = 10.20.0.1/24
PrivateKey = <server_private_key>
ListenPort = 51820

# Client 1
[Peer]
PublicKey = <client1_public_key>
AllowedIPs = 10.20.0.10/32
PersistentKeepalive = 25
Client:
ini
[Interface]
Address = 10.20.0.10/32
PrivateKey = <client_private_key>
DNS = 10.20.0.2

[Peer]
PublicKey = <server_public_key>
Endpoint = vpn.example.com:51820
AllowedIPs = 10.20.0.0/24, 10.10.0.0/16
PersistentKeepalive = 25
strongSwan (IPsec/IKEv2) - `ipsec.conf`
conf
config setup
    uniqueids=never

conn s2s
    keyexchange=ikev2
    ike=aes256gcm16-prfsha384-ecp256!
    esp=aes256gcm16-ecp256!
    left=%any
    leftid=@siteA
    leftsubnet=10.1.0.0/16
    right=vpn.remote.example
    rightsubnet=10.2.0.0/16
    dpdaction=restart
    dpddelay=30s
    rekey=yes
    auto=start
    # certificate/identity options (e.g. leftcert=, rightid=) are omitted in this sample

`ipsec.secrets`:
conf
: RSA siteA.key
OpenVPN (UDP, TLS 1.3) - `server.conf`
conf
port 1194
proto udp
dev tun
tls-version-min 1.3
cipher AES-256-GCM
data-ciphers AES-256-GCM:CHACHA20-POLY1305
auth SHA256
user nobody
group nogroup
topology subnet
server 10.30.0.0 255.255.255.0
push "redirect-gateway def1"
push "dhcp-option DNS 10.30.0.2"
keepalive 10 60
persist-key
persist-tun
verb 3
# CA/certificate/key directives (ca, cert, key) are omitted in this sample
Practice for iGaming/fintech platforms
Segmentation: separate tunnels for payment integrations, back office, content providers, anti-fraud; isolate PII/payment domains.
Strict access policies: machine-to-machine access restricted to specific ports/subnets (allow-lists per PSP and regulator).
Observability: p95 Time-to-Wallet may degrade due to VPN incidents - monitor connectivity to critical PSP/banks.
Compliance: store access logs and authentications, implement MFA, regular channel penetration tests.
FAQ
Is it possible to do full-mesh between all branches?
Only if you have automation and dynamic routing; otherwise complexity grows quickly. Hub-and-spoke with local exceptions is often the better trade-off.
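A quick sketch of why complexity grows: a full mesh needs a tunnel per pair of sites, while hub-and-spoke with a single hub needs one per branch. The function names are illustrative.

```python
def full_mesh_tunnels(sites: int) -> int:
    """Every pair of sites gets its own tunnel: n * (n - 1) / 2."""
    return sites * (sites - 1) // 2


def hub_and_spoke_tunnels(sites: int) -> int:
    """Each non-hub site has one tunnel to the single hub: n - 1."""
    return max(sites - 1, 0)


for n in (5, 10, 20):
    print(n, full_mesh_tunnels(n), hub_and_spoke_tunnels(n))
# 5 10 4
# 10 45 9
# 20 190 19
```

Each tunnel carries its own keys, policies, and monitoring, so quadratic growth is manageable only when provisioning is automated.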
Do I need to encrypt "internal" traffic between the clouds?
Yes. Public backbones and inter-region links require IPsec/WireGuard and strict ACLs.
Which is faster - AES-GCM or ChaCha20-Poly1305?
On x86 with AES-NI - AES-GCM; ChaCha20-Poly1305 often wins on ARM/mobiles.
When to switch to ZTNA?
When network access via VPN has become too broad and applications can be published individually, with contextual authentication and device verification.
Summary
A reliable VPN architecture is more than "protocol and port." It combines a crypto profile with PFS, deliberate segmentation, observability with firm SLOs, PKI/rotation discipline, and a managed transition to ZTNA where network-level access is excessive. By following the checklist and selection matrix above, you can build robust and manageable connectivity for today's distributed systems.