resist-vpn-infra/docs/TWO_TIER_DEPLOYMENT.md
2026-01-26 21:22:41 -05:00

---
# Two-Tier VPN Architecture Deployment Guide
## Architecture Overview
This Ansible collection is designed for a **two-tier VPN architecture**:
### Tier 1: Admin Control Plane (ValleyForge)
- **WireGuard admin VPN** (10.100.0.0/24)
- **Ansible control node**
- **GitHub Actions runner** (future)
- **2-5 admin users**
### Tier 2: User Data Plane (VPN1/VPN2/VPN3)
- **User-facing VPN endpoints** (Algo/Outline)
- **50-70 users per endpoint** (roughly 200 total across the three endpoints)
- **Separate VPN networks**:
  - VPN1: 10.200.0.0/24
  - VPN2: 10.201.0.0/24
  - VPN3: 10.202.0.0/24
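
As a quick sanity check on the sizing above, the arithmetic can be sketched in shell. The function name `usable_hosts` is made up for this illustration, and the calculation assumes each endpoint's WireGuard server takes one address from its own /24.

```bash
# Usable host addresses in an IPv4 network with the given prefix length
# (network and broadcast addresses excluded, by convention).
usable_hosts() {
    echo $(( (1 << (32 - $1)) - 2 ))
}

# Peer addresses left in one /24 once the server has taken one (assumption).
peers_per_endpoint=$(( $(usable_hosts 24) - 1 ))
echo "Peer addresses per /24 endpoint: $peers_per_endpoint"

# Total users across three endpoints at 50 and at 70 users each.
echo "Planned users: $(( 3 * 50 )) to $(( 3 * 70 ))"
```

A /24 leaves 253 peer addresses per endpoint, well above the 70-user ceiling, and three endpoints at 50-70 users each give 150-210 users, which brackets the 200-user target.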
---
## Prerequisites
### Before You Start
1. **ValleyForge deployed** with WireGuard admin VPN
2. **Ansible installed** on ValleyForge
3. **SSH access** from ValleyForge to VPN1/VPN2/VPN3
4. **ValleyForge public IP** known
---
## Step 1: Configure Inventory
### Edit hosts.yml
On ValleyForge:
```bash
cd /root/ansible/secure_vpn_server
nano inventory/hosts.yml
```
**Set your VPN endpoint IPs**:
```yaml
vpn_servers:
  hosts:
    vpn1:
      ansible_host: 203.0.113.10 # Your VPN1 public IP
    vpn2:
      ansible_host: 203.0.113.11 # Your VPN2 public IP
    vpn3:
      ansible_host: 203.0.113.12 # Your VPN3 public IP
  vars:
    valleyforge_public_ip: "185.112.147.205" # Your ValleyForge public IP
```
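
A typo in an `ansible_host` value only shows up later as a connection failure, so it can help to check the addresses right after editing. The following is a minimal sketch, not part of the collection: `check_inventory_ips` is a hypothetical helper that assumes each `ansible_host` is a literal IPv4 address on its own line, as in the example above.

```bash
# Scan an inventory file for ansible_host entries and verify that each value
# is a dotted-quad IPv4 address with every octet in the range 0-255.
# Exits non-zero if any entry is malformed.
check_inventory_ips() {
    awk '
        /^[ \t]*ansible_host:/ {
            ip = $2
            gsub(/"/, "", ip)                 # tolerate quoted values
            n = split(ip, octet, /\./)
            ok = (n == 4)
            for (i = 1; ok && i <= 4; i++)
                if (octet[i] !~ /^[0-9]+$/ || octet[i] + 0 > 255) ok = 0
            if (!ok) {
                print "Invalid ansible_host: " ip > "/dev/stderr"
                bad = 1
            }
        }
        END { exit bad }
    ' "$1"
}
```

Usage: `check_inventory_ips inventory/hosts.yml && echo "inventory IPs look valid"`. To see how Ansible itself parsed the file, `ansible-inventory -i inventory/hosts.yml --graph` prints the resulting groups and hosts.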
---
## Step 2: Configure Variables
### Edit group_vars/vpn_servers.yml
```bash
nano inventory/group_vars/vpn_servers.yml
```
**CRITICAL: Set `management_allowed_sources`** (after deployment, SSH is allowed only from these addresses, so a wrong value locks you out):
```yaml
# Allow management from ValleyForge
management_allowed_sources:
- "185.112.147.205" # Your ValleyForge public IP
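# Or, if ValleyForge reaches the endpoints through the admin VPN tunnel
# (so SSH arrives from a 10.100.0.x address), allow that network instead: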
# management_allowed_sources:
# - "10.100.0.0/24" # ValleyForge admin VPN network
```
**Configure users**:
```yaml
wg_peers:
- name: user1
- name: user2
- name: user3
# Add up to 70 users per endpoint
```
### Verify host_vars
Check that each VPN endpoint has unique networks:
```bash
cat inventory/host_vars/vpn1.yml
# wg_network: "10.200.0.0/24"
cat inventory/host_vars/vpn2.yml
# wg_network: "10.201.0.0/24"
cat inventory/host_vars/vpn3.yml
# wg_network: "10.202.0.0/24"
```
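
Comparing the three files by eye works for three endpoints; a scripted check scales better. This sketch assumes each host_vars file holds a single top-level `wg_network:` line with an optionally double-quoted value. `check_unique_networks` is a hypothetical helper, and it only catches exact duplicates, not overlapping ranges of different sizes.

```bash
# Print any wg_network value that appears in more than one host_vars file.
# Returns 0 when every value is unique, 1 otherwise.
check_unique_networks() {
    local dir="$1" dupes
    dupes=$(grep -hE '^wg_network:' "$dir"/*.yml \
        | sed -E 's/^wg_network:[[:space:]]*//; s/[[:space:]]*(#.*)?$//; s/^"//; s/"$//' \
        | sort | uniq -d)
    if [ -n "$dupes" ]; then
        echo "Duplicate wg_network values:" >&2
        echo "$dupes" >&2
        return 1
    fi
    echo "All wg_network values are unique"
}
```

Usage: `check_unique_networks inventory/host_vars`.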
---
## Step 3: Validate Configuration
**Run validation playbook**:
```bash
ansible-playbook -i inventory/hosts.yml playbooks/validate.yml
```
**Expected output**:
```
TASK [Validate management_allowed_sources is defined]
ok: [vpn1] => {
    "msg": "✓ management_allowed_sources is configured"
}
TASK [Validate ValleyForge IP is set]
ok: [vpn1] => {
    "msg": "✓ ValleyForge IP is configured: 185.112.147.205"
}
TASK [Display configuration summary]
ok: [vpn1] => {
    "msg": [
        "Host: vpn1",
        "VPN Network: 10.200.0.0/24",
        "Management allowed from: 185.112.147.205",
        "Users configured: 3"
    ]
}
```
**If validation fails**:
- Check that `management_allowed_sources` is set in `inventory/group_vars/vpn_servers.yml`
- Check that `valleyforge_public_ip` is set in `inventory/hosts.yml`
- Check that each host has a unique `wg_network` in `inventory/host_vars/`
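
The first two checks above can be run as a quick pre-flight before invoking the playbook. This is a rough sketch under the assumption that each variable appears as a YAML key at the start of a line (optionally indented); `require_var` is a hypothetical helper, not something the validation playbook provides, and commented-out keys are deliberately not counted.

```bash
# Succeed if the given key is defined (uncommented) in the YAML file.
# $1 = file, $2 = variable name
require_var() {
    if grep -Eq "^[[:space:]]*$2:" "$1"; then
        echo "ok: $2 found in $1"
    else
        echo "MISSING: $2 not found in $1" >&2
        return 1
    fi
}
```

Usage, from the collection directory:
`require_var inventory/group_vars/vpn_servers.yml management_allowed_sources && require_var inventory/hosts.yml valleyforge_public_ip`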
---
## Step 4: Test SSH Access
**From ValleyForge, test SSH to each endpoint**:
```bash
ssh root@203.0.113.10 # VPN1
ssh root@203.0.113.11 # VPN2
ssh root@203.0.113.12 # VPN3
```
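**If SSH fails because key authentication is not set up yet** (`ssh-copy-id` will ask for the root password once per endpoint):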
```bash
# Generate SSH key on ValleyForge
ssh-keygen -t ed25519
# Copy to VPN endpoints
ssh-copy-id root@203.0.113.10
ssh-copy-id root@203.0.113.11
ssh-copy-id root@203.0.113.12
```
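
Because Ansible connects non-interactively, it is worth confirming that key-based login works without any prompt. The sketch below uses the standard OpenSSH options `BatchMode` and `ConnectTimeout`; the function name `check_ssh` and the 5-second timeout are illustrative choices.

```bash
# Try a non-interactive SSH login as root to each address given as an argument.
# BatchMode=yes makes ssh fail instead of prompting for a password.
# Returns non-zero if any endpoint could not be reached.
check_ssh() {
    local ip failed=0
    for ip in "$@"; do
        if ssh -o BatchMode=yes -o ConnectTimeout=5 "root@$ip" true; then
            echo "ok: $ip"
        else
            echo "FAILED: $ip" >&2
            failed=1
        fi
    done
    return "$failed"
}
```

Usage: `check_ssh 203.0.113.10 203.0.113.11 203.0.113.12`. An endpoint that fails here would also fail under Ansible.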
---
## Step 5: Deploy (Dry Run)
**Test deployment without making changes**:
```bash
ansible-playbook -i inventory/hosts.yml playbooks/site.yml --check --diff
```
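**Review the output for errors** (in check mode, tasks that depend on changes made by earlier tasks may fail or be skipped even though a real run would succeed):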
- Syntax errors
- Missing variables
- Connection issues
---
## Step 6: Deploy to Single Endpoint (Test)
**Deploy to VPN1 only**:
```bash
ansible-playbook -i inventory/hosts.yml playbooks/site.yml --limit vpn1
```
**Monitor deployment** (~10-15 minutes):
- System hardening
- WireGuard installation
- Firewall configuration
**Expected final output**:
```
TASK [Display deployment summary]
ok: [vpn1] => {
    "msg": [
        "=========================================",
        "Deployment Complete!",
        "=========================================",
        "Server: vpn1",
        "Public IP: 203.0.113.10",
        "VPN Network: 10.200.0.0/24",
        "Client configs: /root/wireguard-client-configs/",
        "Firewall config: /root/firewall-config.txt"
    ]
}
```
---
## Step 7: Verify VPN1 Deployment
### Check Services
```bash
# SSH to VPN1 (should still work from ValleyForge)
ssh root@203.0.113.10
# Check WireGuard
sudo wg show
# Should show wg0 interface
# Check firewall
sudo ufw status verbose
# Should show:
# - Port 51820/udp ALLOW from Anywhere (user VPN)
# - Port 22/tcp ALLOW from 185.112.147.205 (ValleyForge)
# - Port 22/tcp DENY from Anywhere (default deny)
# Check services
systemctl status wg-quick@wg0
systemctl status fail2ban
systemctl status sshd
# Exit
exit
```
### Test Management Access
**From ValleyForge**:
```bash
# SSH should work (you're from allowed source)
ssh root@203.0.113.10
# Connected!
```
**From your local machine** (NOT ValleyForge):
```bash
# SSH should be BLOCKED
ssh root@203.0.113.10
# Connection refused or timeout
```
**This is correct!** Management is restricted to ValleyForge.
---
## Step 8: Retrieve User Configs
**From ValleyForge**:
```bash
# Download VPN1 user configs
scp -r root@203.0.113.10:/root/wireguard-client-configs/ /root/vpn1-configs/
# Check configs
ls /root/vpn1-configs/
# user1.conf user1_qr.txt
# user2.conf user2_qr.txt
# user3.conf user3_qr.txt
```
---
## Step 9: Deploy to All Endpoints
**If VPN1 test was successful**:
```bash
# Deploy to VPN2 and VPN3
ansible-playbook -i inventory/hosts.yml playbooks/site.yml --limit vpn2,vpn3
# Or deploy to all at once
ansible-playbook -i inventory/hosts.yml playbooks/site.yml
```
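
If you prefer to keep the one-endpoint-at-a-time caution of Step 6 for the remaining hosts, the rollout can be wrapped in a loop that stops at the first failure. This is only a sketch: `staged_rollout` is a hypothetical wrapper around the same `ansible-playbook` command used above, and it assumes it is run from the collection directory.

```bash
# Deploy to each named host in turn and stop at the first failure,
# so a broken change never reaches the hosts later in the list.
staged_rollout() {
    local host
    for host in "$@"; do
        echo "==> Deploying $host"
        if ! ansible-playbook -i inventory/hosts.yml playbooks/site.yml --limit "$host"; then
            echo "Deployment failed on $host; remaining hosts were not touched" >&2
            return 1
        fi
    done
}
```

Usage: `staged_rollout vpn2 vpn3`. Ansible's `serial` play keyword can give a similar effect inside the playbook itself, at the cost of editing `site.yml`.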
---
## Step 10: Retrieve All User Configs
```bash
# Download from all endpoints
scp -r root@203.0.113.10:/root/wireguard-client-configs/ /root/vpn1-configs/
scp -r root@203.0.113.11:/root/wireguard-client-configs/ /root/vpn2-configs/
scp -r root@203.0.113.12:/root/wireguard-client-configs/ /root/vpn3-configs/
```
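
The three commands above can be folded into a loop. WireGuard client configs contain each user's private key, so this sketch also removes group and world permissions on the downloaded copies. `fetch_configs` and the `name=ip` argument format are invented for this example; the `scp` invocation is the same one used above.

```bash
# Download client configs from each endpoint into <dest_root>/<name>-configs/
# and restrict the copies to the owning user.
# $1 = destination root; remaining arguments = name=ip pairs
fetch_configs() {
    local dest_root="$1" pair name ip
    shift
    for pair in "$@"; do
        name=${pair%%=*}   # text before the first "="
        ip=${pair#*=}      # text after the first "="
        scp -r "root@$ip:/root/wireguard-client-configs/" "$dest_root/$name-configs/" || return 1
        chmod -R go-rwx "$dest_root/$name-configs"
    done
}
```

Usage: `fetch_configs /root vpn1=203.0.113.10 vpn2=203.0.113.11 vpn3=203.0.113.12`.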
---
## Firewall Rules Explained
### What Gets Configured
**On each VPN endpoint (VPN1/VPN2/VPN3)**:
```
# Public ports (user VPN)
Port 51820/udp → ALLOW from Anywhere
# Management ports (restricted to ValleyForge)
Port 22/tcp → ALLOW from 185.112.147.205
Port 22/tcp → DENY from Anywhere
# Default policy
Incoming → DENY
Outgoing → ALLOW
```
### Why This Works
1. **User VPN access**: UDP port 51820 is open to the internet for end users
2. **Management access**: SSH is allowed only from the ValleyForge public IP
3. **Security**: All other inbound traffic, including management attempts from the internet, is dropped by the default deny policy
### Access Matrix
| Source | Destination | Port | Result |
|--------|-------------|------|--------|
| Internet | VPN1/2/3 | 51820 (user VPN) | ✅ ALLOWED |
| ValleyForge | VPN1/2/3 | 22 (SSH) | ✅ ALLOWED |
| Internet | VPN1/2/3 | 22 (SSH) | ❌ BLOCKED |
| Internet | VPN1/2/3 | 80/443 | ❌ BLOCKED |
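
The third row of the matrix (SSH from the internet must be blocked) is the one most worth checking automatically. The sketch below reads `ufw status` output on stdin and flags any rule that allows port 22 from `Anywhere`. It assumes the usual column layout of `ufw status` (port, action, source) and rules written by port number; a rule created from the `OpenSSH` application profile would not be caught. `audit_ssh_rules` is a hypothetical name.

```bash
# Read `ufw status` output on stdin.
# Exit 0 if no rule allows port 22 from Anywhere; otherwise print the
# offending rule to stderr and exit 1.
audit_ssh_rules() {
    awk '
        $1 ~ /^22(\/tcp)?$/ && /ALLOW/ && /Anywhere/ {
            print "SSH is open to the internet: " $0 > "/dev/stderr"
            bad = 1
        }
        END { exit bad }
    '
}
```

Usage on an endpoint: `sudo ufw status | audit_ssh_rules && echo "SSH is restricted"`.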
---
## Troubleshooting
### Can't SSH from ValleyForge After Deployment
**Problem**: SSH connection refused from ValleyForge
**Solution**:
```bash
# Use VPS console/VNC to access VPN endpoint
# Check firewall rules
sudo ufw status verbose
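# Check if ValleyForge IP is allowed (-F matches the dots literally)
sudo ufw status | grep -F 185.112.147.205
# If not, add it
sudo ufw allow from 185.112.147.205 to any port 22 proto tcp
# Last resort: temporarily disable the firewall. This exposes SSH to the
# internet, so re-enable it with "sudo ufw enable" once the rule is fixed.
sudo ufw disable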
```
### Wrong ValleyForge IP in Firewall
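**Problem**: The wrong IP was set in `management_allowed_sources`
**Solution** (if the wrong IP has already been deployed, ValleyForge can no longer reach the endpoint over SSH; restore access from the VPS console as described in the previous section first):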
```bash
# On ValleyForge, update group_vars
nano inventory/group_vars/vpn_servers.yml
# Fix the IP so that the file contains:
#   management_allowed_sources:
#     - "CORRECT.IP.ADDRESS.HERE"
# Re-deploy firewall only
ansible-playbook -i inventory/hosts.yml playbooks/firewall.yml
```
### Users Can't Connect to VPN
**Problem**: User VPN port not accessible
**Solution**:
```bash
# Check firewall on the affected VPN endpoint
ssh root@203.0.113.10 # From ValleyForge; substitute the endpoint's IP
sudo ufw status | grep 51820
# Should show:
# 51820/udp ALLOW Anywhere
# If not, add it
sudo ufw allow 51820/udp
# Or re-deploy the firewall (run this on ValleyForge, not on the endpoint)
ansible-playbook -i inventory/hosts.yml playbooks/firewall.yml
```
### Validation Playbook Fails
**Problem**: `management_allowed_sources` not defined
**Solution**:
```bash
# Edit group_vars
nano inventory/group_vars/vpn_servers.yml
# Add this section to the file:
#   management_allowed_sources:
#     - "YOUR_VALLEYFORGE_IP"
# Re-run validation
ansible-playbook -i inventory/hosts.yml playbooks/validate.yml
```
---
## Adding Users
### Add Users to Existing Endpoint
```bash
# On ValleyForge
cd /root/ansible/secure_vpn_server
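# Edit group_vars (applies to every endpoint) or host_vars (one endpoint)
nano inventory/group_vars/vpn_servers.yml
# Append the new user to wg_peers, keeping the existing entries:
#   wg_peers:
#     - name: user1
#     - name: user2
#     - name: user3
#     - name: new_user4 # Add this
# Re-deploy WireGuard only (drop --limit to apply a group_vars change everywhere)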
ansible-playbook -i inventory/hosts.yml playbooks/wireguard.yml --limit vpn1
# Retrieve new config
scp root@203.0.113.10:/root/wireguard-client-configs/new_user4.conf /root/
```
---
## Monitoring
### Check All Endpoints Status
```bash
# From ValleyForge
ansible vpn_servers -i inventory/hosts.yml -m command -a "wg show"
ansible vpn_servers -i inventory/hosts.yml -m command -a "ufw status"
ansible vpn_servers -i inventory/hosts.yml -m command -a "systemctl status wg-quick@wg0"
```
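
To turn the raw `wg show` output into a number you can track, one option is to count peers with a recent handshake. `wg show wg0 latest-handshakes` prints one line per peer with its public key and the Unix time of the last handshake (0 if there has never been one). The sketch below is illustrative: `count_active_peers` is a made-up helper, and the 180-second default is a rule-of-thumb threshold, not a value taken from this collection.

```bash
# Count peers whose latest handshake is recent.
# stdin: output of `wg show wg0 latest-handshakes` (public key, epoch seconds)
# $1: current time in epoch seconds
# $2: maximum handshake age in seconds (default 180, an assumed threshold)
count_active_peers() {
    awk -v now="$1" -v max_age="${2:-180}" '
        $2 > 0 && now - $2 <= max_age { n++ }
        END { print n + 0 }
    '
}
```

Usage on an endpoint: `wg show wg0 latest-handshakes | count_active_peers "$(date +%s)"`.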
---
## Summary
**Deployment complete when**:
1. ✅ All VPN endpoints deployed (VPN1/VPN2/VPN3)
2. ✅ Firewall restricts management to ValleyForge
3. ✅ User VPN ports open to internet
4. ✅ User configs retrieved
5. ✅ Services running (WireGuard, fail2ban, SSH)
**Your infrastructure**:
- **Secure**: Management only from ValleyForge
- **Scalable**: 200 users across 3 endpoints
- **Manageable**: Centralized Ansible control
- **Resilient**: Multiple endpoints for redundancy
**Next steps**:
- Distribute user configs
- Deploy collaboration server
- Set up monitoring
- Configure GitHub Actions (future)