A Traefik 503 that was really a stale firewall address list
I went to check a camera and nvr.example.net gave me a 503 Service Unavailable. Not a 404, and not a connection error in the browser: a clean 503 from Traefik, the reverse proxy that fronts every service in my homelab. The strange part: the Hikvision NVR behind that URL was fine. I could open its web UI directly on the LAN, and it was recording. The device had no idea anything was wrong.
A 503 is a backend answer, not a routing answer
The shape of the error tells you which layer to suspect. A 404 from Traefik means no router matched the hostname at all; the proxy has never heard of the service. A 503 is different: a router matched, but Traefik could not hand the request to a working backend. So this was not “Traefik doesn’t know about the NVR.” It was “Traefik knows about the NVR and can’t reach it.”
That moves the question off the proxy config and onto the path between Traefik and the NVR.
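As a rough triage aid, the status-to-layer reasoning can be written down as a small shell function. This is a minimal sketch, not part of my tooling: the name `triage_proxy_status` is hypothetical, and it assumes the status was generated by the proxy itself rather than passed through from the backend. I have grouped 502 and 504 with 503 because they are also gateway-side errors.

```shell
# Hypothetical helper: map a proxy-generated HTTP status to the layer to check first.
triage_proxy_status() {
  case "$1" in
    404)         echo "routing: no router matched, check the proxy config" ;;
    502|503|504) echo "backend: a route exists, check the path to the backend" ;;
    *)           echo "unclassified: status $1" ;;
  esac
}

triage_proxy_status 503
# prints: backend: a route exists, check the path to the backend
```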
The request reached Traefik and stopped there
From the Traefik host I hit the NVR directly, bypassing the proxy logic:
curl -sS -o /dev/null -w '%{http_code}\n' http://192.168.30.11
# (hangs, then times out)
A timeout, not a connection refused. Refused would mean the packets arrived and something said no. A silent timeout means the packets are being dropped on the way, and the usual culprit for silently dropped packets is a firewall.
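The refused-versus-timeout distinction does not have to be judged by how long the command hangs; curl reports it in its exit status. A minimal sketch, assuming a hypothetical helper named `classify_curl_exit`. It relies on curl's documented exit codes: 7 for a failed connection and 28 for a timeout.

```shell
# Hypothetical helper: map a curl exit status to the most likely cause.
classify_curl_exit() {
  case "$1" in
    0)  echo "ok: the backend answered" ;;
    7)  echo "connect failed: the packets arrived and the connection was refused or rejected" ;;
    28) echo "timeout: packets are probably being dropped, check the firewall" ;;
    *)  echo "other: curl exit $1, see the curl man page" ;;
  esac
}

# Usage (not run here). A short connect timeout keeps the probe from hanging:
#   curl -sS -o /dev/null --connect-timeout 5 http://192.168.30.11
#   classify_curl_exit "$?"
```

Setting `--connect-timeout` is a choice for the sketch, not something the original check did; it only makes the drop show up in seconds instead of minutes.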
Traefik and the NVR live on different VLANs. Traefik sits on the services VLAN (20), the NVR on the camera VLAN (30). Anything crossing between them goes through the MikroTik router, which is where my firewall rules live. So the drop was almost certainly there.
The rule was right, the list it trusted was wrong
The rule that lets Traefik reach the NVR existed and looked correct:
- comment: 'Allow Traefik nodes to access NVR'
  action: accept
  dst-address: '192.168.30.11'
  src-address-list: Traefik_Nodes
It accepts traffic to the NVR from any address in the Traefik_Nodes address list. Nothing wrong with the rule. The list was the problem:
/ip firewall address-list print where list=Traefik_Nodes
# 192.168.20.13
# 192.168.20.14
Those two addresses are pi3 and pi4, the Raspberry Pis that used to run Traefik as two independent instances. Since then I had moved Traefik to an active/passive Keepalived setup across the same two Pis, with a virtual IP that floats to whichever node is active. Traffic now leaves Traefik from that VIP, 192.168.20.20, not from the individual node IPs.
So the firewall was matching a list that described the old topology. The active Traefik was sending packets from 192.168.20.20, which was not in Traefik_Nodes, so every packet to the NVR fell through to the final drop. From Traefik’s side it looked like a timeout. From the NVR’s side the connection never arrived.
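The match the router was making can be mimicked in a few lines of shell. This is only an illustration of the logic, assuming exact-address entries (real address lists can also hold subnets and ranges, which this ignores); `in_address_list` is a hypothetical name, not a RouterOS command.

```shell
# Hypothetical helper: succeed if $1 is an exact entry in the
# single-space-separated list passed as $2.
in_address_list() {
  case " $2 " in
    *" $1 "*) return 0 ;;
    *)        return 1 ;;
  esac
}

# The stale list as the router had it: node IPs only, no VIP.
traefik_nodes="192.168.20.13 192.168.20.14"

for src in 192.168.20.13 192.168.20.20; do
  if in_address_list "$src" "$traefik_nodes"; then
    echo "$src accept"
  else
    echo "$src falls through to drop"
  fi
done
# prints:
# 192.168.20.13 accept
# 192.168.20.20 falls through to drop
```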
I manage these lists from Ansible against a state file, the same pattern I use for IoT internet access. The migration changed how Traefik was addressed on the network, but I never updated the one list that another VLAN uses to recognize it.
The fix
Add the VIP to the list and redeploy:
Traefik_Nodes:
- 192.168.20.20 # Traefik VIP (Keepalived)
- 192.168.20.13 # pi3
- 192.168.20.14 # pi4
Running the Ansible firewall playbook pushed the updated list to the router. I restarted Traefik afterward to clear any backend it had already marked as down, though the firewall change is what actually fixed it. Then the request through the proxy, the one that had been returning 503, came back clean:
curl -sS -o /dev/null -w '%{http_code}\n' https://nvr.example.net
# 200
Lessons learned
- A 503 from a reverse proxy is a backend-reachability problem, not a routing one. A 404 means no route; a 503 means the route is there and the backend won’t answer. The status code tells you which layer to open first.
- A firewall rule can be perfectly correct while the address list it points at is stale. The rule said “allow Traefik_Nodes,” which was exactly right. The list just no longer matched what Traefik had become.
- When you change how a service is addressed on the network (a VIP, a new host, a moved container), grep every firewall rule and address list that names the old address. The thing that breaks is rarely the config you touched. It is the one in another VLAN that quietly trusted the old IP.
- Cross-VLAN dependencies are easy to forget because they live on the router, not next to the service. The Traefik migration was a Traefik change. What it broke was a firewall list describing Traefik from the camera VLAN’s point of view, and nothing tied the two together.
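The "grep every rule and list" step is easy to script. A minimal sketch, assuming the firewall state lives as text files under one directory; the function name `find_address_refs` and the path in the usage comment are hypothetical, and `grep -r` assumes GNU or BSD grep.

```shell
# Hypothetical helper: print every line under a directory that names one of
# the given addresses. -F treats the dots literally; -w stops 192.168.20.1
# from matching inside 192.168.20.13.
find_address_refs() {
  dir="$1"; shift
  for addr in "$@"; do
    grep -rnwF -- "$addr" "$dir" || true
  done
}

# Usage (path is an assumption): list everything that still names the old node IPs.
#   find_address_refs ~/homelab/ansible 192.168.20.13 192.168.20.14
```

Every hit is a place that trusts the old addressing and needs a decision: add the new address, replace the old one, or leave it alone on purpose.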
The network runs a MikroTik RB4011 for routing and a MikroTik CRS326 for switching. The address-list technique works on any RouterOS setup.