So, upon coming home after my much-needed Thanksgiving vacation, I discovered that one of my requirements for this particular VPN was not met. I don’t think I mentioned it in the previous post, as I got a little lost in the sauce on that one. I’d prefer to have it just set to always-on.
If I’m always on remote networks, cool, that’s mostly fine. But when I get home, suddenly all the traffic headed for my default gateway and my DNS server gets passed along to the VPN. Minor complaint, but I have to go in and switch the VPN off, and anything I’m working on gets interrupted. I’d rather have the same configuration whether I’m on data or Wi-Fi, and whether I’m on my phone or my laptop.
Unfortunately at this point I’ve nuked my old config, so I can’t go back and figure out exactly what changed and why, but oddly the configuration worked fine on my Windows work laptop. I set the `AllowedIPs` to `10.0.20.0/24` and `10.0.0.0/24`, meaning all the traffic bound for my LAN passed through the VPN, and it still seemed to work. On my phone it wasn’t working.
Another thing I wanted to fix was the potential conflict with other private networks - `10.0.0.0/24` is a pretty common one. Luckily, all the LANs I was on during my trip were `192.168.1.0/24`.
So two requirements, and I added a third just to differentiate the `wg1` network more obviously.
- VPN always-on (it should work even when at home on the LAN)
- Change my home network subnet to `10.0.99.0/24`
- Change the `wg1` subnet to `172.16.99.0/24`
For the first requirement, I had an idea, which is what initially sent me down this rabbit hole. If I need to have direct access to the default gateway (`.1` of the subnet) and my pi-hole DNS server (`.2` of the subnet), why not put the devices I’d like to access into a different subnet? Not officially a different subnet on my router, but in my endpoint WireGuard configs only. Basically, just grouping them into a different “subnet” and using that “subnet” to refer to them in the WireGuard `AllowedIPs`.
Therefore, I came up with the following addressing scheme - something about subnetting finally clicked for me when I had this idea.
- Router (default gateway) at `10.0.99.1/24`
- Pi-hole (DNS and WireGuard) at `10.0.99.2/24`
- DHCP starting at `10.0.99.100` (the default on OpenWRT, which leaves the first 100 addresses available)
- All other static IPs that I would like to access via the VPN in the `10.0.99.32/27` subnet: `10.0.99.33`–`10.0.99.62` usable, 30 hosts out of 32 addresses total. Room to grow.
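Python’s `ipaddress` module is a handy way to sanity-check the `/27` math - this is purely illustrative and isn’t part of the actual setup:

```python
import ipaddress

# The /27 carved out for VPN-reachable static IPs
statics = ipaddress.ip_network("10.0.99.32/27")

print(statics.num_addresses)     # 32 addresses total
hosts = list(statics.hosts())
print(hosts[0], "-", hosts[-1])  # 10.0.99.33 - 10.0.99.62 (30 usable)

# The gateway (.1) and pi-hole (.2) sit outside the /27, so traffic
# to them never gets pulled into the tunnel; the NAS (.37) is inside it
for ip in ("10.0.99.1", "10.0.99.2", "10.0.99.37"):
    print(ip, ipaddress.ip_address(ip) in statics)
```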
This way, the `AllowedIPs` line in my WireGuard config will only direct traffic pointed at the `172.16.99.0/24` subnet (`wg1`) and the `10.0.99.32/27` subnet (any static IPs on my LAN that I want accessible via the VPN) into the tunnel.
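Concretely, the client config ends up shaped something like this - a sketch, where the keys, endpoint, and any addresses other than the subnets above are placeholders rather than my real values:

```ini
[Interface]
# This client's address on the wg1 network (placeholder)
Address = 172.16.99.10/24
PrivateKey = <client-private-key>
# DNS over the tunnel - assumes the pi's wg1 address is .2
DNS = 172.16.99.2

[Peer]
PublicKey = <server-public-key>
Endpoint = <vps-public-ip>:51820
# Only the wg1 subnet and the /27 of LAN statics go into the tunnel;
# traffic to the gateway (.1) and the rest of the LAN stays local
AllowedIPs = 172.16.99.0/24, 10.0.99.32/27
```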
I got maybe a little too excited after coming up with this. I did have some level of foresight, though: I went into all my devices and set static IPs and default gateways in the correct `10.0.99.0/24` subnet before rebooting them and then switching the subnet on my router - I didn’t want to have to console into each and every device.
Now, keep in mind, I had a working bounce server up until this point. I was able to access devices on my LAN remotely, by their LAN IP addresses, without them being directly connected to the VPN. I thought everything was hunky dory, and hadn’t dug into the actual routing all that much. It worked, and I was happy.
BUT. Something must have happened when I changed the subnet on the router. The VPN still worked: I could connect to the server, and I could connect to the pi by both its local IP and its VPN IP. Depending on the `AllowedIPs` line in my client config, I could connect to the LAN - but only while actually on the LAN, not through the VPN tunnel. Using my hotspot to simulate an external network, I couldn’t connect to anything except devices connected directly to the VPN.
I verified and re-verified all my configs after I switched everything over. Nothing was broken; it should have all been working, as far as I knew. I tried different `AllowedIPs` on the client, on the server, on the pi, and eventually looped back around to the original config: they should all be correct, and routing at the VPN level had to be occurring as it should, because it worked beforehand.
Eventually I dug around enough to figure out a troubleshooting method. I mapped out how a ping packet should move through the network, and I was going to figure out where it dropped.
Using `tcpdump [-i interface] [-vvv] icmp`, and pinging my NAS (`10.0.99.37`) from my VPS (`172.16.99.1`), I figured out the following.
`172.16.99.1 > 10.0.99.37 ICMP echo request`

- VPS → router, which port forwards to the pi. The traffic doesn’t show up, but the pi receives it, so I know that’s happening.
- pi decrypts and sends it along to `10.0.99.37` → back through the router → to the NAS
- NAS receives it and composes a reply

`10.0.99.37 > 172.16.99.1 ICMP echo reply`

- NAS → router
- Router picks it up in the `tcpdump` output.

However, I know that I am not receiving any replies, and I don’t see anything but the echo request on my pi. So whatever is happening is happening between the router and the pi, inside my LAN. Not a problem with the VPN!
After taking a look at the route table in the router, I figured one of two things was happening (with basically no functional difference between the two cases):
- The router forwards the packet along its default route - to the WAN. WAN drops the packet as it is addressed to an RFC 1918 private address.
- The router sees an RFC 1918 private address, sees that it has no specific route for it, and thus drops the packet. (Unsure whether I would see it in the `tcpdump` output if that were the case? I imagine I would, but I’m curious to know.)
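Either way, the reply to `172.16.99.1` matches nothing more specific than the default route. A toy longest-prefix match over what I assume the router’s table looked like makes this concrete (the table below is my guess, not a dump from the router):

```python
import ipaddress

# Assumed route table before the fix: the connected LAN route plus a
# default route out the WAN - nothing at all for the wg1 subnet
routes = [
    (ipaddress.ip_network("10.0.99.0/24"), "br-lan"),
    (ipaddress.ip_network("0.0.0.0/0"), "wan"),
]

def lookup(dest: str) -> str:
    """Longest-prefix match: the most specific route containing dest wins."""
    addr = ipaddress.ip_address(dest)
    matches = [(net, via) for net, via in routes if addr in net]
    return max(matches, key=lambda m: m[0].prefixlen)[1]

print(lookup("10.0.99.37"))   # br-lan: LAN-bound traffic stays local
print(lookup("172.16.99.1"))  # wan: the echo reply falls to the default route
```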
And voilà - I added a unicast route, target `172.16.99.0/24`, forwarding to the pi…and things started working!
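On OpenWRT that route can be added through the LuCI web interface, or as a stanza in `/etc/config/network` along these lines - a sketch, since I did it through the GUI, and the interface name here is the usual `lan` default rather than something I pulled from my router:

```
config route
	option interface 'lan'
	option target '172.16.99.0/24'
	option gateway '10.0.99.2'
```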
Now I am left wondering: why the hell was this working in the first place? I never set a specific route for the `10.0.20.0/24` network. Nothing that I know of changed aside from the subnets (and all related configuration accordingly), and they were still RFC 1918 addresses, albeit different ones. The only GUI I used to configure this was my router’s web interface - so that must be what did it. Otherwise I have absolutely no clue what could have made a difference.
A final note: a red herring about midway through this process was the fact that the `/etc/resolv.conf` on every single one of my static-IP devices was still pointing at my old DNS server address of `10.0.0.2`. I went to install `tcpdump` and couldn’t resolve the package mirror hostnames…so I figured that must be the issue somehow. It wasn’t. I had just forgotten that I hardcoded the DNS server. I fixed them, then still had the same problem.
For my own future reference, the only known updates to my configuration were the following:
- Subnets changed in my VPN to `172.16.99.0/24` and in my router to `10.0.99.0/24`
- Subnet in `AllowedIPs` for clients changed to `10.0.99.32/27`
I changed a LOT of other things in the process, but circled back around to unchanged configs aside from the ones specified above.
If nothing else, it was a good learning experience! And I now have a fully functional remote and local VPN: always-on. Even my DNS works, after changing one setting in pi-hole!
EOF