Diagnosing slow RomM scans on a large ROM library

After deploying RomM in my homelab, the next step was a full library scan. I kicked it off and watched the logs: 60 to 80 seconds per ROM, 2,625 ROMs discovered. At that rate a single scan would take somewhere between 44 and 58 hours. Something had to be wrong.

Here is what I found, what I changed, and what I learned about scanning a library this size.

The initial problem

My setup: RomM running in Docker inside a Proxmox LXC, library on a NAS mounted over SMB at /media/emulation. ScreenScraper enabled as the metadata source.

The pattern in the logs was consistent:

File discovery: fast, 2,625 files found in about 13 seconds
Hash calculation: fast for small files
ROM processing: 60-80 seconds per ROM, sometimes longer

SCAN_WORKERS=4 was active. CPU sat around 77%, but I/O wait was low (1-2%). This was not a disk bottleneck.
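One way to reproduce the CPU and I/O-wait figures is to sample the aggregate cpu line of /proc/stat twice and compare the counters. This is a minimal sketch, assuming a Linux host; the function names are illustrative and not part of RomM.

```python
import time

# Field order of the aggregate "cpu" line in /proc/stat on Linux.
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_cpu_line(line):
    """Turn a line like 'cpu  1 2 3 ...' into a dict of tick counters."""
    parts = line.split()
    if not parts or parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    values = [int(p) for p in parts[1:1 + len(FIELDS)]]
    if len(values) < len(FIELDS):
        raise ValueError("cpu line has fewer fields than expected")
    return dict(zip(FIELDS, values))


def cpu_breakdown(before, after):
    """Percentage of elapsed CPU time spent in each state between two samples."""
    delta = {k: after[k] - before[k] for k in FIELDS}
    total = sum(delta.values())
    if total == 0:
        return {k: 0.0 for k in FIELDS}
    return {k: 100.0 * v / total for k, v in delta.items()}


def sample(interval=5.0, path="/proc/stat"):
    """Read /proc/stat twice, `interval` seconds apart, and return percentages."""
    with open(path) as f:
        first = parse_cpu_line(f.readline())
    time.sleep(interval)
    with open(path) as f:
        second = parse_cpu_line(f.readline())
    return cpu_breakdown(first, second)
```

Calling sample() during a scan and reading the "iowait" and "idle" entries shows whether workers are waiting on storage or doing work. Inside an LXC the values may be filtered by the container runtime, so sampling on the Proxmox host as well is a useful cross-check.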

Diagnosing the bottleneck

The obvious first suspect was ScreenScraper. Each ROM triggers multiple API calls: hash lookup, metadata fetch, cover art. With four workers hitting the API simultaneously, rate limits kick in. A missed match returns a 404 and waits on a timeout before moving on. Four ScreenScraper 404 errors showed up in the first few ROMs.

I disabled ScreenScraper in the UI and started a fresh scan. The rate dropped from 60-80 seconds to 24-30 seconds per ROM. That is roughly a 2.5x improvement, but still slower than I expected for pure file I/O. The remaining time is the pipeline: read the file over SMB, calculate its hash, check the hash against the database, write the result. CPU sat at 50% and I/O wait at 1-2%. Nothing obvious was saturated - the time was just the pipeline itself.
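To see how much of that remaining time is the read-and-hash step, the same work can be timed outside RomM. This is a sketch, assuming the scanner computes CRC32, MD5 and SHA-1 for each file (an assumption worth checking against your RomM version); the function name and chunk size are illustrative.

```python
import hashlib
import time
import zlib


def hash_file(path, chunk_size=1024 * 1024):
    """Read a file once in chunks and compute CRC32, MD5 and SHA-1 together.

    Returns the digests plus the byte count and elapsed wall-clock seconds,
    so the read-and-hash cost of one ROM can be compared with the scan rate.
    """
    crc = 0
    md5 = hashlib.md5()
    sha1 = hashlib.sha1()
    size = 0
    start = time.perf_counter()
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            crc = zlib.crc32(chunk, crc)  # running CRC over all chunks
            md5.update(chunk)
            sha1.update(chunk)
            size += len(chunk)
    elapsed = time.perf_counter() - start
    return {
        "crc32": f"{crc:08x}",
        "md5": md5.hexdigest(),
        "sha1": sha1.hexdigest(),
        "bytes": size,
        "seconds": elapsed,
    }
```

Running it on a file under /media/emulation and then on a local copy of the same file separates SMB read time from hashing time. A second run on the same path may be served from the page cache, so the first timing is the one that reflects the network.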

Reducing SCAN_WORKERS from 4 to 2 helped with ScreenScraper specifically. Four parallel workers hitting the API could all get rate-limited simultaneously. Two workers is a better balance.

The library size discovery

About an hour in, I noticed the progress numbers did not add up. The scan reported “2,625 roms found” early on, but hours later, processed ROM counts were still far below that. I counted the actual files on the NAS.

The library had 38,178 files across 178 platform folders.

RomM scans platforms one at a time. The “2,625 roms found” message was for a single platform - likely Amiga - not the total library. I had been looking at one platform’s progress message and assuming it was the total. The real library was 14.5x larger than I thought.
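A per-platform file count makes this kind of mismatch visible before the scan starts. This is a sketch, assuming each platform is a top-level folder under the library root; the function name is illustrative.

```python
import os
from collections import Counter


def count_files_per_platform(library_root):
    """Count files under each top-level folder of library_root.

    Files sitting directly in library_root are ignored; only folders are
    treated as platforms.
    """
    with os.scandir(library_root) as entries:
        platforms = [(e.name, e.path) for e in entries if e.is_dir()]

    counts = Counter()
    for name, path in platforms:
        total = 0
        # os.walk yields every sub-folder, so nested layouts are included.
        for _dirpath, _dirnames, filenames in os.walk(path):
            total += len(filenames)
        counts[name] = total
    return counts


def format_report(counts, top=10):
    """Return a short text report: the largest platforms and the grand total."""
    lines = [f"{n:>8}  {name}" for name, n in counts.most_common(top)]
    lines.append(f"{sum(counts.values()):>8}  TOTAL ({len(counts)} platforms)")
    return "\n".join(lines)
```

For example, print(format_report(count_files_per_platform("/media/emulation/roms"))) lists the ten largest platform folders and the total. File counts can exceed ROM counts where one game spans several files, so the total is best read as an upper bound.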

At 160 ROMs per hour, the full scan would take about 10 days.

That changed everything. The scan was not broken. The slow rate was not a bug. At roughly 22 seconds per ROM without metadata APIs, scanning 38,000 ROMs just takes a long time.
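The arithmetic behind those figures, as a small sketch. The inputs are the numbers observed in this scan; the function name is illustrative.

```python
def scan_estimate(total_roms, roms_per_hour):
    """Return (seconds per ROM, total hours, total days) for a steady scan rate."""
    seconds_per_rom = 3600 / roms_per_hour
    hours = total_roms / roms_per_hour
    return seconds_per_rom, hours, hours / 24


sec, hours, days = scan_estimate(total_roms=38_178, roms_per_hour=160)
print(f"{sec:.1f} s/ROM, {hours:.0f} h, {days:.1f} days")
# prints: 22.5 s/ROM, 239 h, 9.9 days

ratio = 38_178 / 2_625  # whole library vs. the one platform I had assumed
print(f"library is {ratio:.1f}x the single-platform count")
# prints: library is 14.5x the single-platform count
```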

What I changed

Scan workers and timeout

SCAN_WORKERS: '2'
SCAN_TIMEOUT: '28800'

SCAN_WORKERS=2 instead of 4. With ScreenScraper, fewer parallel workers avoid the rate limit problem. Without ScreenScraper, the pipeline is CPU and database-bound and two workers is still efficient.

SCAN_TIMEOUT=28800 sets an 8-hour window per scan job. The default timeout is short enough that a scan of a large platform can die before finishing. Eight hours gives any platform enough runway to complete.

SMB mount options

The original mount had minimal options. For a read-mostly workload like ROM scanning, these help:

vers=3.0,rsize=1048576,wsize=1048576,cache=loose,actimeo=60

vers=3.0 uses SMB 3 instead of older protocol versions. rsize and wsize request read and write sizes of up to 1 MB per request. cache=loose relaxes cache coherency so the client can cache file contents more aggressively, which is reasonable when nothing else is writing to the share during a scan. actimeo=60 extends the attribute cache timeout to 60 seconds, cutting round trips when RomM is doing directory enumeration.
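Requested options are not always the ones that end up in effect (the server can negotiate a smaller rsize, for example), so it is worth checking what the kernel reports. This is a sketch that parses /proc/mounts-style text, assuming a Linux client; the helper names are illustrative, and the exact option names shown can vary by kernel version.

```python
def parse_mount_options(mounts_text, mount_point):
    """Return the option dict for mount_point from /proc/mounts-style text.

    Flag options (e.g. 'rw') map to True; key=value options map to strings.
    Returns None if the mount point is not listed.
    """
    found = None
    for line in mounts_text.splitlines():
        fields = line.split()
        # Columns: device, mount point, fs type, options, dump, pass
        if len(fields) < 4 or fields[1] != mount_point:
            continue
        options = {}
        for opt in fields[3].split(","):
            key, sep, value = opt.partition("=")
            options[key] = value if sep else True
        found = options  # a later entry on the same path shadows earlier ones
    return found


def differing_options(options, expected):
    """Return {name: (wanted, actual)} for every expected option that differs."""
    return {
        name: (wanted, options.get(name))
        for name, wanted in expected.items()
        if options.get(name) != wanted
    }


# The options from the mount line above, as strings for comparison.
EXPECTED = {
    "vers": "3.0",
    "rsize": "1048576",
    "wsize": "1048576",
    "cache": "loose",
    "actimeo": "60",
}
```

Reading /proc/mounts into a string, passing it to parse_mount_options with "/media/emulation", and then calling differing_options(result, EXPECTED) returns an empty dict when everything was applied as requested.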

Database password idempotency

I hit a MariaDB password mismatch on one re-deploy - the same class of problem I described in the deployment post. The fix in Ansible is to set the database container’s recreate policy:

recreate: never

Setting recreate: never on the database container prevents Docker from recreating it when the volume already exists. Without this, a re-deploy can wipe the initialized data directory and trigger a fresh init with whatever password is in the environment at that moment. The password drifts, MariaDB rejects the connection, and RomM cannot start.

Verification

After deploying the changes:

Scan rate: ~22.5 seconds per ROM without ScreenScraper, consistent across monitoring windows.
Two-minute monitoring snapshots showed about 6 ROMs processed every 2 minutes, close to the expected rate.
CPU around 25% (23% user, 2% system), load around 1.8 with 2 workers active.
No ScreenScraper rate limit errors.
Progress climbed steadily: 176 to 329 to 445 ROMs over the course of the day.

The scan ran in the background and I let it continue.

Reflection

ScreenScraper is the dominant bottleneck for large libraries. If you can run the initial scan without metadata APIs and add enrichment later, the scan completes much faster: roughly 2.5x faster per ROM in my case.
Count your actual files before estimating scan time. RomM’s per-platform scan messages are not the total library count. find /media/emulation/roms -type f | wc -l gives you the real number.
A 10-day background scan for 38,000 ROMs is not broken. It is the expected result. The question is whether you need the whole library indexed at once or can scan platforms in priority order.
Set SCAN_TIMEOUT explicitly. The default is too short for large platform collections.
Two workers beats four for ScreenScraper-enabled scans. Four parallel API callers will all hit the rate limit at once.

Victor Da Luz

Related reading

Deploying RomM, a self-hosted ROM manager, in the homelab

Replacing Firefly III with Actual Budget

Deploying Homebox, a self-hosted home inventory, in the homelab
