A stopped iperf3 container and the Proxmox autostart flag I forgot to set
I run iperf3 in a dedicated Proxmox LXC container for automated LAN performance testing. The container runs an iperf3 server plus a node_exporter instance that Prometheus scrapes as part of my network monitoring stack.
One morning I got two alerts in quick succession:
- Service iperf3 down on the network testing host (down for more than 2 minutes)
- Prometheus target for that host unreachable (down for more than 5 minutes)
No maintenance window, nothing scheduled. Something was actually down.
The investigation
The first place I looked was the container itself. From the Proxmox host:
pct list
The container was stopped. Not crashed - just stopped. There were no logs suggesting a panic or OOM kill. It was simply not running.
I checked the network config and service files inside - both iperf3-server and node_exporter were configured correctly. Everything looked fine except the container wasn’t up.
Then I pulled the container config:
pct config 108
There it was: onboot: 0. The container wasn’t set to start automatically when the Proxmox host boots. Somewhere along the way - probably a host reboot after a kernel update or a brief power blip - the host had come back up and the container had stayed down. Since there was no autostart, nothing brought it back.
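For anyone who wants to script that check, here is a minimal sketch. It assumes it runs on the Proxmox host; the helper name onboot_value is made up for illustration, and it treats a missing onboot key as 0, which is my reading of the documented default rather than something verified on every Proxmox version. It also prints the host boot time so you can judge whether a reboot is the likely cause.

```shell
#!/bin/sh
# Hypothetical helper: reads `pct config <vmid>` output on stdin and prints
# the effective onboot value. A missing key is assumed to mean 0.
onboot_value() {
    awk -F': *' '$1 == "onboot" { v = $2 } END { print (v == "" ? 0 : v) }'
}

# Only touch pct when it exists, so the helper can be sourced anywhere.
if command -v pct >/dev/null 2>&1; then
    echo "host booted at: $(uptime -s)"
    echo "CT 108 onboot: $(pct config 108 | onboot_value)"
fi
```

If the boot time is more recent than the last time you remember the container working, and onboot prints 0, the reboot theory is a reasonable one.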
The fix
Starting the container:
pct start 108
Both services came back up immediately. iperf3-server on port 5201, node_exporter on port 9100. Prometheus targets moved to “up” on the next scrape cycle.
Then the permanent fix:
pct set 108 --onboot 1
That’s it. The container now starts automatically whenever the Proxmox host boots.
What I verified
- Container status: running
- iperf3-server: listening on port 5201
- node_exporter: listening on port 9100
- Prometheus targets: both “up”
- Container config: onboot: 1 confirmed
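The port checks in that list could be scripted roughly like this. It is a sketch, not the exact commands I ran: it assumes ss (from iproute2) is installed inside the container, and port_listening is a hypothetical helper name that parses ss -ltn output from stdin.

```shell
#!/bin/sh
# Hypothetical helper: reads `ss -ltn` output on stdin and succeeds if any
# socket is listening on the port given as $1.
port_listening() {
    awk -v port="$1" '
        { n = split($4, a, ":"); if (a[n] == port) found = 1 }
        END { exit !found }
    '
}

# Run the check inside container 108 from the Proxmox host.
if command -v pct >/dev/null 2>&1; then
    for port in 5201 9100; do
        if pct exec 108 -- ss -ltn | port_listening "$port"; then
            echo "port $port: listening"
        else
            echo "port $port: NOT listening"
        fi
    done
fi
```

The helper looks at the last colon-separated piece of the local address column, so it handles both IPv4 (0.0.0.0:5201) and IPv6 ([::]:9100) listeners.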
Reflection
Prometheus did exactly what it’s supposed to do - it caught this before I noticed it manually. The monitoring worked. What didn’t work was my own deployment checklist.
Proxmox defaults new LXC containers to onboot: 0. That’s probably the right default for lab containers you spin up temporarily, but it means any service container you want to survive reboots needs that flag explicitly set. Nothing in the provisioning workflow forces you to think about it.
A few things I take away from this:
- Set onboot: 1 when you provision a service container. Not after the first outage. The Proxmox UI has the option; the CLI is one flag. It takes ten seconds and doesn’t need to be a todo item you come back to.
- Prometheus caught a failure I might not have noticed for hours. The container had been stopped since the last host reboot, which could have been days earlier. Without monitoring I’d have no way to know.
- Quick to fix, slow to notice without monitoring. The actual fix was under two minutes. The lesson is that monitoring is what closed the gap between “container stopped” and “someone knows about it.”
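To turn the first takeaway into something checkable, a small audit loop can flag every container that would stay down after a reboot. This is a sketch under a few assumptions: it runs on the Proxmox host, the first column of pct list is the numeric VMID, and the helper names list_vmids and has_onboot are invented for this example.

```shell
#!/bin/sh
# Hypothetical helper: reads `pct list` output on stdin and prints the VMID
# column, skipping the header row.
list_vmids() {
    awk 'NR > 1 && $1 ~ /^[0-9]+$/ { print $1 }'
}

# Hypothetical helper: reads `pct config <vmid>` output on stdin and succeeds
# only if onboot is explicitly set to 1.
has_onboot() {
    grep -q '^onboot: 1$'
}

# Report every container that would not start on host boot.
if command -v pct >/dev/null 2>&1; then
    for vmid in $(pct list | list_vmids); do
        if ! pct config "$vmid" | has_onboot; then
            echo "CT $vmid: onboot not set"
        fi
    done
fi
```

Some of the flagged containers will be temporary lab ones where onboot: 0 is what you want; the point is only to make the decision a visible one.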