BirdNET-Go, a doorbell cam, and a dynamic mic from the drawer

By Victor Da Luz
homelab birdnet-go proxmox frigate home-assistant ansible

I wanted bird detections. A small ML service listening at the window, catching whatever warblers and tanagers passed by, publishing what it heard to Home Assistant for dashboards. BirdNET-Pi seemed like the obvious choice: lots of tutorials, Raspberry Pi-native, big community. It was the plan I had landed on in an earlier research post.

Then I looked at the repo. Archived, August 2025. The original author handed it off and marked installation deprecated.

A fork (Nachtzuster/BirdNET-Pi) is alive: updates for Bookworm, x86 install support, real commits. But there is also tphakala/birdnet-go, a Go rewrite that ships Docker-first, has native Home Assistant MQTT auto-discovery (added in a January 2026 release), and exposes a Prometheus telemetry endpoint. For a homelab that already runs everything behind Traefik, scrapes Prometheus from a monitoring LXC, and speaks MQTT through Home Assistant’s mosquitto add-on, BirdNET-Go fits the shape of the rest of the infra.

Decision made. Now the fun part: doing it on a $0 hardware budget.

The junk drawer mic

Every BirdNET guide I read recommended a condenser like the Primo/Clippy EM272: omni, low self-noise, the nature-recording community’s favorite at around $30 for the capsule. I didn’t have one. What I did have:

  • A PreSonus AudioBox Go USB audio interface (50 dB max gain, phantom power button).
  • A Shure PGA48, a dynamic cardioid vocal mic with low sensitivity and a frequency response that rolls off at 15 kHz.

If you’re wincing, you’re right. The PGA48 is a karaoke mic. Its sensitivity is about 20 dB below a decent condenser. Point it at a bird 20 feet away and the peak amplitude in the captured audio lands around -46 dBFS, well into “barely audible” territory. You have to crank playback gain to hear anything.
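For a sense of scale, the dB figures above convert to linear amplitude as follows. This is only a worked conversion, and the 16-bit sample depth is an assumption for illustration, not something measured on this setup.

```latex
\begin{align*}
A &= 10^{L/20} \\
L = -46\ \mathrm{dBFS} \;&\Rightarrow\; A = 10^{-2.3} \approx 0.005
  \quad \text{(about 0.5\% of full scale)} \\
0.005 \times 32768 &\approx 164
  \quad \text{(peak sample value, assuming 16-bit PCM)} \\
10^{20/20} &= 10
  \quad \text{(a 20 dB sensitivity gap is a tenfold amplitude difference)}
\end{align*}
```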

Here is the thing I didn’t know until I deployed it: BirdNET’s TFLite model pattern-matches against frequency contours, not loudness. It correctly IDed a White-winged Dove from a clip so quiet I thought the file was corrupted. Detection works well below the threshold of casual listening.

So a cardioid karaoke mic, plus phantom power it doesn’t even use (the PGA48 is dynamic and ignores phantom entirely), got me a working pipeline. Not good audio, but enough for the model to do its job.

Deploy shape

Standard stuff for this homelab:

  • An LXC container on Proxmox, privileged with nesting enabled for Docker-in-LXC.
  • Docker Compose running ghcr.io/tphakala/birdnet-go:nightly.
  • A Traefik route at birdnet.example.net with Cloudflare DNS-01 TLS.
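  • Passthrough of the USB interface's ALSA control and capture nodes from host to container via pct set <vmid> -dev0 /dev/snd/controlC1,gid=29 -dev1 /dev/snd/pcmC1D0c,gid=29 (gid 29 is the audio group on Debian-based systems).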
  • MQTT config pointing at the Home Assistant mosquitto add-on with a dedicated birdnet user.
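The Compose side of that list can be sketched roughly as below. This is a minimal illustration rather than the file actually deployed: the image tag and the /data path come from this post, while the service name, the host-side directories, the /config mount, and web UI port 8080 are assumptions to check against the BirdNET-Go README.

```yaml
services:
  birdnet-go:                              # hypothetical service name
    image: ghcr.io/tphakala/birdnet-go:nightly
    restart: unless-stopped
    ports:
      - "8080:8080"                        # assumed web UI port; Traefik would route here
    devices:
      - /dev/snd:/dev/snd                  # ALSA nodes passed into the LXC by pct
    volumes:
      - ./config:/config                   # assumed location of the persistent config.yaml
      - ./data:/data                       # clips end up under /data/clips
```

Mapping all of /dev/snd is the simple option here, since the LXC only sees the two nodes that were passed through from the host anyway.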

The only surprise in the happy path: Home Assistant’s mosquitto add-on can’t be configured through the Supervisor API with a long-lived token anymore (that proxy endpoint was removed), and the SSH add-on refuses docker exec hassio_supervisor unless you turn off Protection Mode in the UI. For a one-time “add a user” job, clicking through the add-on’s Configuration tab is faster than chasing an automation path.

The doorbell fan-out

Here is the piece I liked. BirdNET-Go accepts RTSP streams as audio sources, and it can analyze several in parallel. I have a Reolink video doorbell feeding Frigate through go2rtc. Could I have BirdNET listen to the doorbell mic too?

The direct approach: point BirdNET at the camera’s RTSP URL. That needs a cross-VLAN firewall rule, adds a second RTSP client to a camera with limited tolerance for concurrent connections, and pulls the full video and audio stream over the wire just to throw the video away.

The better approach: fan out from Frigate’s go2rtc. The doorbell is already being consumed once, so add a derived stream that strips the video.

The go2rtc #audio=copy filter does exactly that. The docs are sparse, but the behavior is consistent with the project’s other ffmpeg filters: specify only audio=copy on a source and the video gets dropped. I added this to Frigate’s templated go2rtc config:

doorbell:
  - "ffmpeg:rtsp://admin:***@doorbell.security.internal:554//h264Preview_01_sub#...#audio=copy"
doorbell_audio:
  - "ffmpeg:doorbell#audio=copy"

The first line adds audio to the existing doorbell stream (Frigate wasn’t using audio from this camera, so nothing changes for video). The second creates a derived audio-only stream that references the first. BirdNET subscribes to rtsp://frigate.internal:8554/doorbell_audio.

Net effect: one TCP connection from Frigate to the camera (unchanged), one intra-VLAN connection from BirdNET to Frigate (about 32 kbps of AAC instead of the 500 kbps sub-stream), and no firewall rule to write. The doorbell sees a single client.

The gotcha that cost me the most time

Everything deployed. Detections rolled in. I clicked one in the web UI and got “Failed to load audio.” The clip file didn’t exist on disk.

What followed was two hours of wrong hypotheses. I had checked a single clip, seen it was a valid WAV, declared the pipeline fine, and moved on. That was the mistake: some clips were valid (just silent), others were missing entirely. I had to look harder.

The logs gave it up eventually. In actions.log:

14:30:49 "Audio clip saved successfully" clip_path="2026/04/zenaida_asiatica_78p_..."

Then a gap. Detections kept firing, but no more “saved successfully” lines. Around the same time, api.log started printing a warning on every container start:

WARN "Audio export path is empty, using default"

Root cause: my Ansible role merged MQTT and RTSP overrides into the persistent config.yaml with a slurp | b64decode | from_yaml | combine | to_nice_yaml | write round-trip. Somewhere in there (I still don’t know whether it is Ansible’s combine filter, to_nice_yaml, or BirdNET-Go normalizing its config on startup) the realtime.audio.export.path and .type fields were getting blanked back to empty strings. With those empty, BirdNET-Go silently stops writing clips. Nothing fatal, no obvious error, just no clips.

The fix was to stop trusting implicit preservation. The role now asserts audio.export.{enabled, type, path, length} on every run, so even if the round-trip zeroes them, the next merge puts them back:

birdnet_go_realtime_overrides:
  audio:
    export:
      enabled: true
      type: wav
      path: /data/clips
      length: 15

Lesson: when you do surgical YAML merges in Ansible, don’t assume the fields you didn’t mention survive. Re-assert the ones you need.
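One way to wire that into a role is sketched below. It is not the exact role from this build: birdnet_go_config_path and the registered variable names are hypothetical, and nesting the overrides under a top-level realtime key is an assumption based on the realtime.audio.export.* field names above. The modules and filters themselves (slurp, set_fact, assert, copy, b64decode, from_yaml, combine, to_nice_yaml) are standard Ansible.

```yaml
# Hypothetical tasks file; birdnet_go_config_path is an assumed variable.
- name: Read the persistent BirdNET-Go config
  ansible.builtin.slurp:
    src: "{{ birdnet_go_config_path }}"
  register: birdnet_go_config_raw

- name: Merge overrides, re-asserting the export fields on every run
  ansible.builtin.set_fact:
    birdnet_go_config_merged: >-
      {{ birdnet_go_config_raw.content | b64decode | from_yaml
      | combine({'realtime': birdnet_go_realtime_overrides}, recursive=True) }}

- name: Fail early if the export settings came out blank
  ansible.builtin.assert:
    that:
      - birdnet_go_config_merged.realtime.audio.export.path | length > 0
      - birdnet_go_config_merged.realtime.audio.export.type | length > 0

- name: Write the merged config back
  ansible.builtin.copy:
    content: "{{ birdnet_go_config_merged | to_nice_yaml(indent=2) }}"
    dest: "{{ birdnet_go_config_path }}"
    mode: "0644"
```

The assert only guards what Ansible writes. If the application rewrites its own config on startup, a blank value can still appear afterwards, which is why the overrides are merged on every run instead of once.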

Results

Two audio sources, one service:

  • The USB mic capturing 48 kHz mono from the window.
  • The doorbell mic capturing 16 kHz mono AAC through Frigate’s go2rtc.

Detections firing on both. MQTT publishing to birdnet/detections on the Home Assistant broker. Clips saving to /data/clips on the LXC. A Prometheus telemetry endpoint ready on port 8090 (not scraped yet, that’s a follow-up). Traefik serving the UI at birdnet.example.net.
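The scrape that is still on the to-do list could look something like this. It is a sketch only: the job name and hostname are hypothetical, and it assumes the telemetry endpoint serves metrics at Prometheus's default /metrics path, which is worth confirming against the running container.

```yaml
scrape_configs:
  - job_name: birdnet-go                 # hypothetical job name
    metrics_path: /metrics               # Prometheus default; assumed for BirdNET-Go
    scrape_interval: 30s
    static_configs:
      - targets:
          - birdnet.internal:8090        # hypothetical hostname; port 8090 is from this post
```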

Total hardware spend: $0.

What I’d do differently

First, skip the dynamic mic. The Shure PGA48 works as a proof of concept, but the signal level is low enough that the clips are uncomfortable to listen to even with the gain cranked in the player. A $50 omni condenser like the EM272 would give me 15 to 20 dB more signal and make the recordings actually pleasant. The model detects birds from the audio it has; the human listening to the clip afterward deserves more signal than it’s getting.

Second, if you merge config files programmatically and the application treats empty strings as “use the default,” treat every round-trip as potentially lossy. Don’t find out a week later that clips stopped saving because something in the round-trip quietly replaced "wav" with "".

If you want the plan this was supposed to follow, it’s in Researching BirdNET-Pi for backyard bird detection. Almost none of it carried over. That post has me buying a dedicated Raspberry Pi, a proper omni condenser mic, and running BirdNET-Pi. The real build kept the goal and tossed the rest: BirdNET-Go instead of BirdNET-Pi, existing Proxmox hardware instead of a new Pi, and a karaoke mic from a drawer instead of the condenser I promised myself.

Disclosure: As an Amazon Associate, I earn from qualifying purchases.
