Deluan Quintão 945d0ba1e2
fix(transcoding): cap concurrent transcodes to prevent ffmpeg DoS (#5522)
* feat(transcoding): add MaxConcurrent and MaxConcurrentPerUser config

Introduce Transcoding.MaxConcurrent (default NumCPU()*2) and
Transcoding.MaxConcurrentPerUser (default 3) to support upcoming
concurrency limits on the streaming pipeline. No behavior change yet.

Refs #5246

* feat(transcoding): add TranscodeLimiter with global and per-user caps

Introduce a non-blocking limiter that gates concurrent transcodes. Returns
ErrTooManyTranscodes immediately when the cap is reached so callers can
translate it into a 429 response, rather than queuing requests.

The per-user reservation is taken first to avoid burning a global slot
that would only be rolled back when the per-user cap rejects the caller.
Release is idempotent so wrapping the transcoder reader's Close is safe.

Refs #5246

* feat(transcoding): cap concurrent transcodes in media streamer

Acquire a TranscodeLimiter slot before spawning ffmpeg in the transcoding
cache's read function, and release it when the resulting reader is closed.
Raw streams and cache hits bypass the limiter so a single saturating client
cannot block ordinary playback.

When the cap is reached, ErrTooManyTranscodes bubbles up through cache.Get,
ready for the HTTP layer to translate into a 429 response.

Refs #5246

* feat(transcoding): return HTTP 429 with Retry-After when transcode cap is hit

Map stream.ErrTooManyTranscodes to HTTP 429 in both the Subsonic API
(/stream, /download) and the public share endpoint, including a 5s
Retry-After hint. The Subsonic response still carries a failed-status
envelope so clients that ignore HTTP codes also see the failure.

Refs #5246

* feat(transcoding): default MaxConcurrent to 0 (disabled)

Ship the limiter opt-in so existing installations are not affected by a
behavior change on upgrade. Users hitting the DoS reported in #5246 can
enable it by setting Transcoding.MaxConcurrent to a positive value
(NumCPU()*2 is a reasonable starting point).

Refs #5246

* fix(transcoding): make global and per-user caps independent

Previously the limiter short-circuited to a no-op whenever MaxConcurrent
was zero, silently ignoring a configured MaxConcurrentPerUser. Treat each
cap independently so an operator can throttle per-user without enforcing
a global ceiling (or vice versa), and only fall back to the no-op limiter
when both caps are disabled.

* fix(archiver): abort archive download when the transcode limiter rejects

The album/artist/playlist zip writers were silently producing zip entries
with headers but no data when ms.NewStream returned ErrTooManyTranscodes,
because the per-file error was discarded by `_ = a.addFileToZip(...)`.
The client received HTTP 200 with a corrupt zip and no indication that
the server was rate-limited.

Now the zip loop bails out as soon as it sees ErrTooManyTranscodes, and
the Download handler swallows the error (the response status and
Content-Disposition are already flushed by the time the limit is hit, so
no 429 can be sent). The truncated zip surfaces the problem to the
client; operators see a clear "transcode cap reached" warning in the
server logs.

Refs #5246

* fix(transcoding): release limiter slot on client close, not ffmpeg EOF

Previously the slot was wrapped around the ffmpeg source reader, so it
was only released by the cache's background copyAndClose goroutine when
ffmpeg finished producing the file — meaning a client that disconnected
after a single byte still held the slot for the full transcode duration.
Under MaxConcurrent=N this serialized fresh requests behind abandoned
encodes for minutes.

Hand the release function back from the cache producer via the streamJob
struct and wire it into the consumer-side Stream.Close. The HTTP handler
already runs `defer stream.Close()`, so disconnect now frees the slot
immediately. Cache hits never enter the producer and still pay no slot,
and singleflight waiters on the same key correctly inherit no release
(only the original producer's job holds the slot).

Refs #5246

* fix(transcoding): skip per-user cap for anonymous requests

Public share viewers have no user in context, so userName(ctx) returned
the literal string "UNKNOWN" and the limiter mapped every anonymous
viewer to the same bucket. With MaxConcurrentPerUser=N, only N
unrelated anonymous clients could stream a viral share at any time —
the opposite of the fairness the per-user cap is meant to provide.

Introduce a limiterKey(ctx) helper that returns "" for anonymous
callers (userName(ctx) is unchanged for logs), and teach Acquire to
skip the per-user reservation when the key is empty. The global cap is
still enforced for anonymous traffic and remains the protection against
runaway anonymous load.

Refs #5246

* refactor(transcoding): tidy limiter struct and centralize Retry-After

Per review feedback:

- Drop the redundant maxConcurrent field on transcodeLimiter; the channel
  capacity already enforces the global cap and the field was only used
  inside the constructor.
- Only allocate the perUser map when MaxConcurrentPerUser > 0.
- Move the Retry-After value into core/stream as RetryAfterSeconds so the
  Subsonic API and public-share handlers cannot drift if the window is
  later tuned.

* fix(transcoding): do not log limiter rejections as cache failures

NewStream was emitting an error-level "Error accessing transcoding cache"
log whenever cache.Get returned anything non-nil, including the limiter's
ErrTooManyTranscodes — even though the producer had already logged the
rejection at warn level. The result was double logging and a misleading
"cache failure" classification that buries real cache problems.

Skip the error log when the cause is ErrTooManyTranscodes; the warn line
from the producer is the canonical signal.

* fix(archiver): open stream before writing zip entry header

Per review: addFileToZip previously called z.CreateHeader before
NewStream, so when the limiter rejected a transcode the zip already
contained a 0-byte entry for that track. Open the source first and only
write the header once the read side is ready; rejections now skip the
entry entirely.

The truncation comment in handleArchiveErr was also misleading — z.Close
finalises the central directory, so the client receives a well-formed
zip containing only the tracks written before the rejection, not a
"truncated" archive. Reword to match reality.

* fix(transcoding): hold slot for ffmpeg lifetime, force cancellable ctx

The previous release-on-consumer-close design let a client open many
unique transcodes, disconnect immediately, and still spawn the
configured cap's worth of ffmpeg processes — the cache writer goroutine
continued draining ffmpeg to disk after the client disappeared, defeating
the DoS protection the limiter is meant to provide.

Move the release back onto the source reader so the slot is freed only
when ffmpeg actually exits (either EOF or context cancellation). To keep
disconnects from leaking slots for the full transcode duration, force
the request context into ffmpeg whenever the limiter is enabled — so
client disconnect cancels the process and frees the slot promptly.

When the limiter is disabled, the legacy EnableTranscodingCancellation
behavior is preserved unchanged.

Reported by codex and Copilot reviewers on #5522.
2026-05-24 00:24:30 -03:00
2026-05-20 17:43:12 -03:00

Navidrome Music Server

Navidrome is an open source web-based music collection server and streamer. It gives you freedom to listen to your music collection from any browser or mobile device. It's like your personal Spotify!

Note: The master branch may be in an unstable or even broken state during development. Please use releases instead of the master branch in order to get a stable set of binaries.

Check out our Live Demo!

Any feedback is welcome! If you need or want a new feature, find a bug, or think of any way to improve Navidrome, please file a GitHub issue or join the discussion in our Subreddit. If you want to contribute to the project in any other way (UI/backend development, translations, themes), please join the chat on our Discord server.

Installation

See the instructions on the project's website.

Cloud Hosting

PikaPods has partnered with us to offer you an officially supported, cloud-hosted solution. A share of the revenue helps fund the development of Navidrome at no additional cost for you.

Features

  • Handles very large music collections
  • Streams virtually any audio format available
  • Reads and uses all your beautifully curated metadata
  • Great support for compilations (Various Artists albums) and box sets (multi-disc albums)
  • Multi-user: each user has their own play counts, playlists, favourites, etc.
  • Very low resource usage
  • Multi-platform, runs on macOS, Linux and Windows. Docker images are also provided
  • Ready to use binaries for all major platforms, including Raspberry Pi
  • Automatically monitors your library for changes, importing new files and reloading new metadata
  • Themeable, modern and responsive Web interface based on Material UI
  • Compatible with all Subsonic/Madsonic/Airsonic clients
  • Transcoding on the fly. Can be set per user/player. Opus encoding is supported
  • Translated to various languages

Translations

Navidrome uses POEditor for translations, and we are always looking for more contributors.

Documentation

All documentation can be found on the project's website: https://www.navidrome.org/docs. Here are some useful direct links:

Screenshots
