mirror of
https://github.com/navidrome/navidrome.git
synced 2026-06-02 07:01:36 +00:00
13 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
833c50adc7 |
test(stream): fix data race in MediaStreamer transcoding cap tests
The three It blocks that build a tight-cap streamer each spawned a fresh transcoding cache without waiting for its background initialization. The init goroutine reads conf.Server.CacheFolder, which races against SnapshotConfig's pointer-swap restore (Server = &restored) fired by DeferCleanup at the end of the spec. CI tripped the race under -shuffle=on -race; locally it reproduced about 10% of the time. Wait for tightCache.Available() before constructing the streamer, mirroring the outer BeforeEach. For the slot-saturation spec, swap in a blocking io.Pipe-backed mock ffmpeg so the cache's background copyAndClose can't drain the source and release the slot — the previous behavior happened to work only because the cache wasn't yet available and the no-cache path was exercised. |
||
|
|
823d851b75
|
refactor(transcoding): rename EnableTranscodingCancellation to Transcoding.EnableCancellation (#5523)
Move the option into the nested Transcoding config group alongside the limit knobs it interacts with, so all transcoding-related settings live together. The old top-level name is still honored via the existing mapDeprecatedOption / logDeprecatedOptions plumbing, which forwards the value to the new key and logs a deprecation warning at startup. The old struct field is removed (the new field is the single source of truth); the deprecated default is removed so viper.IsSet correctly distinguishes "user set the legacy option" from "no one set it." |
||
|
|
945d0ba1e2
|
fix(transcoding): cap concurrent transcodes to prevent ffmpeg DoS (#5522)
* feat(transcoding): add MaxConcurrent and MaxConcurrentPerUser config Introduce Transcoding.MaxConcurrent (default NumCPU()*2) and Transcoding.MaxConcurrentPerUser (default 3) to support upcoming concurrency limits on the streaming pipeline. No behavior change yet. Refs #5246 * feat(transcoding): add TranscodeLimiter with global and per-user caps Introduce a non-blocking limiter that gates concurrent transcodes. Returns ErrTooManyTranscodes immediately when the cap is reached so callers can translate it into a 429 response, rather than queuing requests. The per-user reservation is taken first to avoid burning a global slot that would only be rolled back when the per-user cap rejects the caller. Release is idempotent so wrapping the transcoder reader's Close is safe. Refs #5246 * feat(transcoding): cap concurrent transcodes in media streamer Acquire a TranscodeLimiter slot before spawning ffmpeg in the transcoding cache's read function, and release it when the resulting reader is closed. Raw streams and cache hits bypass the limiter so a single saturating client cannot block ordinary playback. When the cap is reached, ErrTooManyTranscodes bubbles up through cache.Get, ready for the HTTP layer to translate into a 429 response. Refs #5246 * feat(transcoding): return HTTP 429 with Retry-After when transcode cap is hit Map stream.ErrTooManyTranscodes to HTTP 429 in both the Subsonic API (/stream, /download) and the public share endpoint, including a 5s Retry-After hint. The Subsonic response still carries a failed-status envelope so clients that ignore HTTP codes also see the failure. Refs #5246 * feat(transcoding): default MaxConcurrent to 0 (disabled) Ship the limiter opt-in so existing installations are not affected by a behavior change on upgrade. Users hitting the DoS reported in #5246 can enable it by setting Transcoding.MaxConcurrent to a positive value (NumCPU()*2 is a reasonable starting point). Refs #5246 * fix(transcoding): make global and per-user caps independent Previously the limiter short-circuited to a no-op whenever MaxConcurrent was zero, silently ignoring a configured MaxConcurrentPerUser. Treat each cap independently so an operator can throttle per-user without enforcing a global ceiling (or vice versa), and only fall back to the no-op limiter when both caps are disabled. * fix(archiver): abort archive download when the transcode limiter rejects The album/artist/playlist zip writers were silently producing zip entries with headers but no data when ms.NewStream returned ErrTooManyTranscodes, because the per-file error was discarded by `_ = a.addFileToZip(...)`. The client received HTTP 200 with a corrupt zip and no indication that the server was rate-limited. Now the zip loop bails out as soon as it sees ErrTooManyTranscodes, and the Download handler swallows the error (the response status and Content-Disposition are already flushed by the time the limit is hit, so no 429 can be sent). The truncated zip surfaces the problem to the client; operators see a clear "transcode cap reached" warning in the server logs. Refs #5246 * fix(transcoding): release limiter slot on client close, not ffmpeg EOF Previously the slot was wrapped around the ffmpeg source reader, so it was only released by the cache's background copyAndClose goroutine when ffmpeg finished producing the file — meaning a client that disconnected after a single byte still held the slot for the full transcode duration. Under MaxConcurrent=N this serialized fresh requests behind abandoned encodes for minutes. Hand the release function back from the cache producer via the streamJob struct and wire it into the consumer-side Stream.Close. The HTTP handler already runs `defer stream.Close()`, so disconnect now frees the slot immediately. Cache hits never enter the producer and still pay no slot, and singleflight waiters on the same key correctly inherit no release (only the original producer's job holds the slot). Refs #5246 * fix(transcoding): skip per-user cap for anonymous requests Public share viewers have no user in context, so userName(ctx) returned the literal string "UNKNOWN" and the limiter mapped every anonymous viewer to the same bucket. With MaxConcurrentPerUser=N, only N unrelated anonymous clients could stream a viral share at any time — the opposite of the fairness the per-user cap is meant to provide. Introduce a limiterKey(ctx) helper that returns "" for anonymous callers (userName(ctx) is unchanged for logs), and teach Acquire to skip the per-user reservation when the key is empty. The global cap is still enforced for anonymous traffic and remains the protection against runaway anonymous load. Refs #5246 * refactor(transcoding): tidy limiter struct and centralize Retry-After Per review feedback: - Drop the redundant maxConcurrent field on transcodeLimiter; the channel capacity already enforces the global cap and the field was only used inside the constructor. - Only allocate the perUser map when MaxConcurrentPerUser > 0. - Move the Retry-After value into core/stream as RetryAfterSeconds so the Subsonic API and public-share handlers cannot drift if the window is later tuned. * fix(transcoding): do not log limiter rejections as cache failures NewStream was emitting an error-level "Error accessing transcoding cache" log whenever cache.Get returned anything non-nil, including the limiter's ErrTooManyTranscodes — even though the producer had already logged the rejection at warn level. The result was double logging and a misleading "cache failure" classification that buries real cache problems. Skip the error log when the cause is ErrTooManyTranscodes; the warn line from the producer is the canonical signal. * fix(archiver): open stream before writing zip entry header Per review: addFileToZip previously called z.CreateHeader before NewStream, so when the limiter rejected a transcode the zip already contained a 0-byte entry for that track. Open the source first and only write the header once the read side is ready; rejections now skip the entry entirely. The truncation comment in handleArchiveErr was also misleading — z.Close finalises the central directory, so the client receives a well-formed zip containing only the tracks written before the rejection, not a "truncated" archive. Reword to match reality. * fix(transcoding): hold slot for ffmpeg lifetime, force cancellable ctx The previous release-on-consumer-close design let a client open many unique transcodes, disconnect immediately, and still spawn the configured cap's worth of ffmpeg processes — the cache writer goroutine continued draining ffmpeg to disk after the client disappeared, defeating the DoS protection the limiter is meant to provide. Move the release back onto the source reader so the slot is freed only when ffmpeg actually exits (either EOF or context cancellation). To keep disconnects from leaking slots for the full transcode duration, force the request context into ffmpeg whenever the limiter is enabled — so client disconnect cancels the process and frees the slot promptly. When the limiter is disabled, the legacy EnableTranscodingCancellation behavior is preserved unchanged. Reported by codex and Copilot reviewers on #5522. |
||
|
|
8f0b4930ff
|
refactor(conf): replace eager dir creation with lazy Dir type (#5495)
* feat(conf): add Dir type with lazy directory creation Introduces the Dir type that wraps a directory path string and defers os.MkdirAll until the first call to Path() or MustPath(), using sync.Once to ensure the creation happens exactly once. Implements fmt.Stringer, encoding.TextMarshaler, and encoding.TextUnmarshaler for config integration. Includes Ginkgo/Gomega tests covering all methods and error paths. * refactor(conf): replace eager dir creation with lazy Dir type Change DataFolder, CacheFolder, Plugins.Folder, and Backup.Path from string to Dir. Remove all os.MkdirAll calls from Load() so directories are created lazily on first Path()/MustPath() call. Artwork folder creation was already handled at point-of-use in image_upload.go. Add SnapshotConfig() to conf package for safe test config save/restore that avoids copying sync.Once inside Dir fields. Fix copy-lock vet warning in nativeapi/config.go by marshalling pointer instead of value. * refactor(conf): migrate tests and db init to lazy Dir type Update all test files to use conf.NewDir() for Dir field assignments. Ensure DataFolder is created lazily when the database is first opened in db.Db(). Remove eager directory creation from conf.Load() tests. * fix(conf): address review findings for Dir type - Use os.ModePerm for DataFolder/CacheFolder (was 0700, should match original behavior). Add NewDirWithPerm for PluginsFolder (0700). - Use Path() instead of MustPath() in db.Prune() to avoid logFatal from background cron job. - Panic on marshal/unmarshal errors in SnapshotConfig (test helper). - Clean up redundant String()/MustPath() calls in plugin manager. - Remove dead code in dir_test.go. Signed-off-by: Deluan <deluan@navidrome.org> * fix(conf): add GoString to Dir for clean config dump output Implement fmt.GoStringer on Dir so pretty.Sprintf shows the path string instead of internal struct fields (sync.Once, perm, err). Also add TODO comment to configtest about removing the indirection. * fix(dir): improve error logging in MustPath method Signed-off-by: Deluan <deluan@navidrome.org> * refactor(tests): remove redundant tests for unwritable DataFolder and CacheFolder Signed-off-by: Deluan <deluan@navidrome.org> * fix(conf): address PR review feedback - Ensure Plugins.Folder always uses 0700, even when user-configured (previously only the derived default got restrictive permissions). - Create LogFile parent directory before opening, so LogFile paths inside a not-yet-created DataFolder work correctly. --------- Signed-off-by: Deluan <deluan@navidrome.org> |
||
|
|
b18dfb474a
|
fix(transcoding): don't apply server-side override on getTranscodeDecision (#5473)
* fix(transcoding): don't apply server-side transcoding override on getTranscodeDecision The getTranscodeDecision endpoint was incorrectly applying server-side player transcoding overrides (forced format and MaxBitRate cap), which replaced the client's declared capabilities with synthetic profiles. This caused the endpoint to ignore what the client can actually play and return decisions for formats the client never requested (e.g. AAC when the client only supports FLAC/opus/mp3). The override is now gated behind an ApplyServerOverride flag in TranscodeOptions, which is only set by the legacy stream endpoint where this behavior is expected. Signed-off-by: Deluan <deluan@navidrome.org> * refactor: move server-side transcoding override to ResolveRequest Moved the server-side player transcoding override logic (forced format and MaxBitRate cap) from MakeDecision into ResolveRequest, where the legacy stream context is handled. This makes MakeDecision a pure function that only operates on the ClientInfo it receives, removing the ApplyServerOverride flag and all context-sniffing from the decision engine. Tests moved accordingly to legacy_client_test.go. * test(e2e): update transcode decision tests for server override removal Updated e2e tests to reflect that getTranscodeDecision no longer applies server-side player overrides (MaxBitRate cap and forced transcoding profile). The player MaxBitRate tests now verify the endpoint ignores the player cap and relies solely on client-declared capabilities. * test(e2e): assert opus default bitrate when player cap is ignored Added bitrate assertion to verify the player MaxBitRate cap is truly ignored: the target bitrate should be the opus format default (128kbps), not the player cap (320kbps). --------- Signed-off-by: Deluan <deluan@navidrome.org> |
||
|
|
27209ed26a
|
fix(transcoding): clamp target channels to codec limit (#5336) (#5345)
* fix(transcoding): clamp target channels to codec limit (#5336) When transcoding a multi-channel source (e.g. 6-channel FLAC) to MP3, the decider passed the source channel count through to ffmpeg unchanged. The default MP3 command path then emitted `-ac 6`, and the template path injected `-ac 6` after the template's own `-ac 2`, causing ffmpeg to honor the last occurrence and fail with exit code 234 since libmp3lame only supports up to 2 channels. Introduce `codecMaxChannels()` in core/stream/codec.go (mp3→2, opus→8), mirroring the existing `codecMaxSampleRate` pattern, and apply the clamp in `computeTranscodedStream` right after the sample-rate clamps. Also fix a pre-existing ordering bug where the profile's MaxAudioChannels check compared against src.Channels rather than ts.Channels, which would have let a looser profile setting raise the codec-clamped value back up. Comparing against the already-clamped ts.Channels makes profile limits strictly narrowing, which matches how the sample-rate block already behaves. The ffmpeg buildTemplateArgs comment is refreshed to point at the new upstream clamp, since the flags it injects are now always codec-safe. Adds unit tests for codecMaxChannels and four decider scenarios covering the literal issue repro (6-ch FLAC→MP3 clamps to 2), a stricter profile limit winning over the codec clamp, a looser profile limit leaving the codec clamp intact, and a codec with no hard limit (AAC) passing 6 channels through. * test(e2e): pin codec channel clamp at the Subsonic API surface (#5336) Add a 6-channel FLAC fixture to the e2e test suite and use it to assert the codec channel clamp end-to-end on both Subsonic streaming endpoints: - getTranscodeDecision (mp3OnlyClient, no MaxAudioChannels in profile): expects TranscodeStream.AudioChannels == 2 for the 6-channel source. This exercises the new codecMaxChannels() helper through the OpenSubsonic decision endpoint, with no profile-level channel limit masking the bug. - /rest/stream (legacy): requests format=mp3 against the multichannel fixture and asserts streamerSpy.LastRequest.Channels == 2, confirming the clamp propagates through ResolveRequest into the stream.Request that the streamer receives. The fixture is metadata-only (channels: 6 plumbed via the existing storagetest.File helper) — no real audio bytes required, since the e2e suite uses a spy streamer rather than invoking ffmpeg. Bumps the empty-query search3 song count expectation from 13 to 14 to account for the new fixture. * test(decider): clarify codec-clamp comment terminology Distinguish "transcoding profile MaxAudioChannels" (Profile.MaxAudioChannels field) from "LimitationAudioChannels" (CodecProfile rule constant). The regression test bypasses the former, not the latter. |
||
|
|
36a7be9eaf
|
fix(transcoding): include ffprobe in MSI and fall back gracefully when absent (#5326)
* fix(msi): include ffprobe executable in MSI build Signed-off-by: Deluan <deluan@navidrome.org> * feat(ffmpeg): add IsProbeAvailable() to FFmpeg interface Add runtime check for ffprobe binary availability with cached result and startup logging. When ffprobe is missing, logs a warning at startup. * feat(stream): guard MakeDecision behind ffprobe availability When ffprobe is not available, MakeDecision returns a decision with ErrorReason set and both CanDirectPlay and CanTranscode false, instead of failing with an opaque exec error. * feat(subsonic): only advertise transcoding extension when ffprobe is available The OpenSubsonic transcoding extension is now conditionally included based on ffprobe availability, so clients know not to call getTranscodeDecision when ffprobe is missing. * refactor(ffmpeg): move ffprobe startup warning to initial_setup Move the ffprobe availability warning from the lazy IsProbeAvailable() check to checkFFmpegInstallation() in server/initial_setup.go, alongside the existing ffmpeg warning. This ensures the warning appears at startup rather than on first endpoint call. * fix(e2e): set noopFFmpeg.IsProbeAvailable to true The e2e tests use pre-populated probe data and don't need a real ffprobe binary. Setting IsProbeAvailable to true allows the transcode decision logic to proceed normally in e2e tests. * fix(stream): only guard on ffprobe when probing is needed Move the IsProbeAvailable() guard inside the SkipProbe check so that legacy stream requests (which pass SkipProbe: true) are not blocked when ffprobe is missing. The guard only applies when probing is actually required (i.e., getTranscodeDecision endpoint). * refactor(stream): fall back to tag metadata when ffprobe is unavailable Instead of blocking getTranscodeDecision when ffprobe is missing, fall back to tag-based metadata (same behavior as /rest/stream). The transcoding extension is always advertised. A startup warning still alerts admins when ffprobe is not found. * fix(stream): downgrade ffprobe-unavailable log to Debug Avoids log spam when clients call getTranscodeDecision repeatedly without ffprobe installed. The startup warning in initial_setup.go already alerts admins at Warn level. --------- Signed-off-by: Deluan <deluan@navidrome.org> |
||
|
|
664217f3f7 |
fix(transcoding): play WAV files directly in browsers instead of transcoding (#5309)
* fix: allow WAV direct play by aliasing pcm and wav codecs
WAV files were being transcoded to FLAC even when the browser declared
native WAV support. The backend normalizes ffprobe's pcm_s16le (and
similar PCM variants) to the internal codec name "pcm", while browsers
advertise WAV support as audioCodecs:["wav"] in their client profile.
The direct-play codec check compared these literally and rejected the
match with "audio codec not supported", forcing a needless FLAC
transcode.
Added {"pcm", "wav"} to codecAliasGroups so the matcher treats them
as equivalent. The container check runs first, so AIFF files (which also
normalize to codec "pcm" but use container "aiff") cannot
accidentally match a WAV direct-play profile.
* feat: include profile details in direct-play rejection reasons
The transcodeReason array returned by getTranscodeDecision previously
contained one generic string per failed DirectPlayProfile (e.g., five
copies of "container not supported"), making it hard to correlate a
reason with the profile that rejected the stream.
Each rejection reason now embeds the offending source value (in single
quotes) along with a compact representation of the full profile that
rejected it, rendered as [container/codec]. For example, clients with
two distinct ogg-container profiles (opus and vorbis) produced two
identical rejection strings; they now read "container 'wav' not
supported by profile [ogg/opus]" and "container 'wav' not supported
by profile [ogg/vorbis]", making each entry in the transcodeReason
array unique and self-describing.
A small describeProfile helper renders profiles as [container/codec]
(or [container] when no codec is constrained).
* refactor(stream): address code review — narrow pcm/wav match, tighten tests
Responds to reviewer feedback on the initial PR:
- Replace the symmetric pcm↔wav codec alias with a contextual
isPCMInWAVMatch check in checkDirectPlayProfile. The alias
unconditionally equated the two names in matchesCodec, which would
let AIFF sources (also normalized to codec "pcm") falsely satisfy
a codec-only ["wav"] direct-play profile that omitted containers.
The new check additionally requires src.Container == "wav" before
bridging the names, closing the false-positive path.
- Tighten the rejection-reason test assertions to verify the new
formatted output (source value + profile descriptor) instead of
just matching loose substrings like "container", preventing
unrelated rejections from satisfying the expectations.
- Add coverage for the WAV→wav-codec acceptance path and for the
AIFF-in-wav-codec-profile rejection path to pin down the contract
of isPCMInWAVMatch.
* refactor(codec): rename isPCMInWAVMatch to matchesPCMWAVBridge for clarity
Signed-off-by: Deluan <deluan@navidrome.org>
---------
Signed-off-by: Deluan <deluan@navidrome.org>
|
||
|
|
3f7226d253
|
fix(server): improve transcoding failure diagnostics and error responses (#5227)
* fix(server): capture ffmpeg stderr and warn on empty transcoded output When ffmpeg fails during transcoding (e.g., missing codec like libopus), the error was silently discarded because stderr was sent to io.Discard and the HTTP response returned 200 OK with a 0-byte body. - Capture ffmpeg stderr in a bounded buffer (4KB) and include it in the error message when the process exits with a non-zero status code - Log a warning when transcoded output is 0 bytes, guiding users to check codec support and enable Trace logging for details - Remove log level guard so transcoding errors are always logged, not just at Debug level Signed-off-by: Deluan <deluan@navidrome.org> * fix(server): return proper error responses for empty transcoded output Instead of returning HTTP 200 with 0-byte body when transcoding fails, return a Subsonic error response (for stream/download/getTranscodeStream) or HTTP 500 (for public shared streams). This gives clients a clear signal that the request failed rather than a misleading empty success. Signed-off-by: Deluan <deluan@navidrome.org> * test(e2e): add tests for empty transcoded stream error responses Add E2E tests verifying that stream and download endpoints return Subsonic error responses when transcoding produces empty output. Extend spyStreamer with SimulateEmptyStream and SimulateError fields to support failure injection in tests. Signed-off-by: Deluan <deluan@navidrome.org> * refactor(server): extract stream serving logic into Stream.Serve method Extract the duplicated non-seekable stream serving logic (header setup, estimateContentLength, HEAD draining, io.Copy with error/empty detection) from server/subsonic/stream.go and server/public/handle_streams.go into a single Stream.Serve method on core/stream. Both callers now delegate to it, eliminating ~30 lines of near-identical code. * fix(server): return 200 with empty body for stream/download on empty transcoded output Don't return a Subsonic error response when transcoding produces empty output on stream/download endpoints — just log the error and return 200 with an empty body. The getTranscodeStream and public share endpoints still return HTTP 500 for empty output. Stream.Serve now returns (int64, error) so callers can check the byte count. --------- Signed-off-by: Deluan <deluan@navidrome.org> |
||
|
|
5ecbe31a06 |
fix: implement fallback to DefaultDownsamplingFormat for unknown formats
Signed-off-by: Deluan <deluan@navidrome.org> |
||
|
|
d8bc41fbb1
|
fix: use ADTS for AAC transcoding, temporarily exclude AAC from transcode decisions (#5167)
* fix: use ADTS format for AAC transcoding to avoid silent output on ffmpeg 8.0+ The fragmented MP4 muxer (`-f ipod -movflags frag_keyframe+empty_moov`) produces corrupt/silent audio when ffmpeg pipes to stdout, confirmed on ffmpeg 8.0+. The moof atom offset values are zeroed out in pipe mode, causing AAC decoder errors. Switch to `-f adts` (raw AAC framing) which works reliably via pipe and is widely supported by clients including UPnP/Sonos devices. * fix: exclude AAC from transcode decision, as it is not working for Sonos. Signed-off-by: Deluan <deluan@navidrome.org> --------- Signed-off-by: Deluan <deluan@navidrome.org> |
||
|
|
053a0fd6c0 |
fix: prevent raw file being returned when explicit transcode format is requested
When a client requests transcoding with an explicit format (e.g., format=opus) but no maxBitRate, buildLegacyClientInfo was adding a direct play profile matching the source format. Since there was no bitrate constraint to block it, MakeDecision would match the source against the direct play profile and return the raw file instead of transcoding. This fix only adds the direct play profile when no explicit format was requested (bitrate-only downsampling) or when the requested format matches the source format (allowing direct play when no actual transcoding is needed). |
||
|
|
767744a301
|
refactor: rename core/transcode to core/stream, simplify MediaStreamer (#5166)
* refactor: rename core/transcode directory to core/stream * refactor: update all imports from core/transcode to core/stream * refactor: rename exported symbols to fit core/stream package name * refactor: simplify MediaStreamer interface to single NewStream method Remove the two-method interface (NewStream + DoStream) in favor of a single NewStream(ctx, mf, req) method. Callers are now responsible for fetching the MediaFile before calling NewStream. This removes the implicit DB lookup from the streamer, making it a pure streaming concern. * refactor: update all callers from DoStream to NewStream * chore: update wire_gen.go and stale comment for core/stream rename * refactor: update wire command to handle GO_BUILD_TAGS correctly Signed-off-by: Deluan <deluan@navidrome.org> * fix: distinguish not-found from internal errors in public stream handler * refactor: remove unused ID field from stream.Request * refactor: simplify ResolveRequestFromToken to receive *model.MediaFile Move MediaFile fetching responsibility to callers, making the method focused on token validation and request resolution. Remove ErrMediaNotFound (no longer produced). Update GetTranscodeStream handler to fetch the media file before calling ResolveRequestFromToken. * refactor: extend tokenTTL from 12 to 48 hours Signed-off-by: Deluan <deluan@navidrome.org> --------- Signed-off-by: Deluan <deluan@navidrome.org> |