navidrome/scanner/ignore_checker.go
Deluan Quintão 28d5299ffc
feat(scanner): implement selective folder scanning and file system watcher improvements (#4674)
* feat: Add selective folder scanning capability

Implement targeted scanning of specific library/folder pairs without
full recursion. This enables efficient rescanning of individual folders
when changes are detected, significantly reducing scan time for large
libraries.

Key changes:
- Add ScanTarget struct and ScanFolders API to Scanner interface
- Implement CLI flag --targets for specifying libraryID:folderPath pairs
- Add FolderRepository.GetByPaths() for batch folder info retrieval
- Create loadSpecificFolders() for non-recursive directory loading
- Scope GC operations to affected libraries only (with TODO for full impl)
- Add comprehensive tests for selective scanning behavior

The selective scan:
- Only processes specified folders (no subdirectory recursion)
- Maintains library isolation
- Runs full maintenance pipeline scoped to affected libraries
- Supports both full and quick scan modes

Examples:
  navidrome scan --targets "1:Music/Rock,1:Music/Jazz"
  navidrome scan --full --targets "2:Classical"

* feat(folder): replace GetByPaths with GetFolderUpdateInfo for improved folder updates retrieval

Signed-off-by: Deluan <deluan@navidrome.org>

* test: update parseTargets test to handle folder names with spaces

Signed-off-by: Deluan <deluan@navidrome.org>

* refactor(folder): remove unused LibraryPath struct and update GC logging message

Signed-off-by: Deluan <deluan@navidrome.org>

* refactor(folder): enhance external scanner to support target-specific scanning

Signed-off-by: Deluan <deluan@navidrome.org>

* refactor(scanner): simplify scanner methods

Signed-off-by: Deluan <deluan@navidrome.org>

* feat(watcher): implement folder scanning notifications with deduplication

Signed-off-by: Deluan <deluan@navidrome.org>

* refactor(watcher): add resolveFolderPath function for testability

Signed-off-by: Deluan <deluan@navidrome.org>

* feat(watcher): implement path ignoring based on .ndignore patterns

Signed-off-by: Deluan <deluan@navidrome.org>

* refactor(scanner): implement IgnoreChecker for managing .ndignore patterns

Signed-off-by: Deluan <deluan@navidrome.org>

* refactor(ignore_checker): rename scanner to lineScanner for clarity

Signed-off-by: Deluan <deluan@navidrome.org>

* refactor(scanner): enhance ScanTarget struct with String method for better target representation

Signed-off-by: Deluan <deluan@navidrome.org>

* fix(scanner): validate library ID to prevent negative values

Signed-off-by: Deluan <deluan@navidrome.org>

* refactor(scanner): simplify GC method by removing library ID parameter

Signed-off-by: Deluan <deluan@navidrome.org>

* feat(scanner): update folder scanning to include all descendants of specified folders

Signed-off-by: Deluan <deluan@navidrome.org>

* feat(subsonic): allow selective scan in the /startScan endpoint

Signed-off-by: Deluan <deluan@navidrome.org>

* refactor(scanner): update CallScan to handle specific library/folder pairs

Signed-off-by: Deluan <deluan@navidrome.org>

* refactor(scanner): streamline scanning logic by removing scanAll method

Signed-off-by: Deluan <deluan@navidrome.org>

* test: enhance mockScanner for thread safety and improve test reliability

Signed-off-by: Deluan <deluan@navidrome.org>

* refactor(scanner): move scanner.ScanTarget to model.ScanTarget

Signed-off-by: Deluan <deluan@navidrome.org>

* refactor: move scanner types to model,implement MockScanner

Signed-off-by: Deluan <deluan@navidrome.org>

* refactor(scanner): update scanner interface and implementations to use model.Scanner

Signed-off-by: Deluan <deluan@navidrome.org>

* refactor(folder_repository): normalize target path handling by using filepath.Clean

Signed-off-by: Deluan <deluan@navidrome.org>

* test(folder_repository): add comprehensive tests for folder retrieval and child exclusion

Signed-off-by: Deluan <deluan@navidrome.org>

* refactor(scanner): simplify selective scan logic using slice.Filter

Signed-off-by: Deluan <deluan@navidrome.org>

* refactor(scanner): streamline phase folder and album creation by removing unnecessary library parameter

Signed-off-by: Deluan <deluan@navidrome.org>

* refactor(scanner): move initialization logic from phase_1 to the scanner itself

Signed-off-by: Deluan <deluan@navidrome.org>

* refactor(tests): rename selective scan test file to scanner_selective_test.go

Signed-off-by: Deluan <deluan@navidrome.org>

* feat(configuration): add DevSelectiveWatcher configuration option

Signed-off-by: Deluan <deluan@navidrome.org>

* feat(watcher): enhance .ndignore handling for folder deletions and file changes

Signed-off-by: Deluan <deluan@navidrome.org>

* docs(scanner): comments

Signed-off-by: Deluan <deluan@navidrome.org>

* refactor(scanner): enhance walkDirTree to support target folder scanning

Signed-off-by: Deluan <deluan@navidrome.org>

* fix(scanner, watcher): handle errors when pushing ignore patterns for folders

Signed-off-by: Deluan <deluan@navidrome.org>

* Update scanner/phase_1_folders.go

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* refactor(scanner): replace parseTargets function with direct call to scanner.ParseTargets

Signed-off-by: Deluan <deluan@navidrome.org>

* test(scanner): add tests for ScanBegin and ScanEnd functionality

Signed-off-by: Deluan <deluan@navidrome.org>

* fix(library): update PRAGMA optimize to check table sizes without ANALYZE

Signed-off-by: Deluan <deluan@navidrome.org>

* test(scanner): refactor tests

Signed-off-by: Deluan <deluan@navidrome.org>

* feat(ui): add selective scan options and update translations

Signed-off-by: Deluan <deluan@navidrome.org>

* feat(ui): add quick and full scan options for individual libraries

Signed-off-by: Deluan <deluan@navidrome.org>

* feat(ui): add Scan buttonsto the LibraryList

Signed-off-by: Deluan <deluan@navidrome.org>

* feat(scan): update scanning parameters from 'path' to 'target' for selective scans.

* refactor(scan): move ParseTargets function to model package

* test(scan): suppress unused return value from SetUserLibraries in tests

* feat(gc): enhance garbage collection to support selective library purging

Signed-off-by: Deluan <deluan@navidrome.org>

* fix(scanner): prevent race condition when scanning deleted folders

When the watcher detects changes in a folder that gets deleted before
the scanner runs (due to the 10-second delay), the scanner was
prematurely removing these folders from the tracking map, preventing
them from being marked as missing.

The issue occurred because `newFolderEntry` was calling `popLastUpdate`
before verifying the folder actually exists on the filesystem.

Changes:
- Move fs.Stat check before newFolderEntry creation in loadDir to
  ensure deleted folders remain in lastUpdates for finalize() to handle
- Add early existence check in walkDirTree to skip non-existent target
  folders with a warning log
- Add unit test verifying non-existent folders aren't removed from
  lastUpdates prematurely
- Add integration test for deleted folder scenario with ScanFolders

Fixes the issue where deleting entire folders (e.g., /music/AC_DC)
wouldn't mark tracks as missing when using selective folder scanning.

* refactor(scan): streamline folder entry creation and update handling

Signed-off-by: Deluan <deluan@navidrome.org>

* feat(scan): add '@Recycle' (QNAP) to ignored directories list

Signed-off-by: Deluan <deluan@navidrome.org>

* fix(log): improve thread safety in logging level management

* test(scan): move unit tests for ParseTargets function

Signed-off-by: Deluan <deluan@navidrome.org>

* review

Signed-off-by: Deluan <deluan@navidrome.org>

---------

Signed-off-by: Deluan <deluan@navidrome.org>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: deluan <deluan.quintao@mechanical-orchard.com>
2025-11-14 22:15:43 -05:00

164 lines
5.2 KiB
Go

package scanner
import (
"bufio"
"context"
"io/fs"
"path"
"strings"
"github.com/navidrome/navidrome/consts"
"github.com/navidrome/navidrome/log"
ignore "github.com/sabhiram/go-gitignore"
)
// IgnoreChecker manages .ndignore patterns using a stack-based approach.
// Use Push() to add patterns when entering a folder, Pop() when leaving,
// and ShouldIgnore() to check if a path should be ignored.
type IgnoreChecker struct {
fsys fs.FS
patternStack [][]string // Stack of patterns for each folder level
currentPatterns []string // Flattened current patterns
matcher *ignore.GitIgnore // Compiled matcher for current patterns
}
// newIgnoreChecker creates a new IgnoreChecker for the given filesystem.
func newIgnoreChecker(fsys fs.FS) *IgnoreChecker {
return &IgnoreChecker{
fsys: fsys,
patternStack: make([][]string, 0),
}
}
// Push loads .ndignore patterns from the specified folder and adds them to the pattern stack.
// Use this when entering a folder during directory tree traversal.
func (ic *IgnoreChecker) Push(ctx context.Context, folder string) error {
patterns := ic.loadPatternsFromFolder(ctx, folder)
ic.patternStack = append(ic.patternStack, patterns)
ic.rebuildCurrentPatterns()
return nil
}
// Pop removes the most recent patterns from the stack.
// Use this when leaving a folder during directory tree traversal.
func (ic *IgnoreChecker) Pop() {
if len(ic.patternStack) > 0 {
ic.patternStack = ic.patternStack[:len(ic.patternStack)-1]
ic.rebuildCurrentPatterns()
}
}
// PushAllParents pushes patterns from root down to the target path.
// This is a convenience method for when you need to check a specific path
// without recursively walking the tree. It handles the common pattern of
// pushing all parent directories from root to the target.
// This method is optimized to compile patterns only once at the end.
func (ic *IgnoreChecker) PushAllParents(ctx context.Context, targetPath string) error {
if targetPath == "." || targetPath == "" {
// Simple case: just push root
return ic.Push(ctx, ".")
}
// Load patterns for root
patterns := ic.loadPatternsFromFolder(ctx, ".")
ic.patternStack = append(ic.patternStack, patterns)
// Load patterns for each parent directory
currentPath := "."
parts := strings.Split(path.Clean(targetPath), "/")
for _, part := range parts {
if part == "." || part == "" {
continue
}
currentPath = path.Join(currentPath, part)
patterns = ic.loadPatternsFromFolder(ctx, currentPath)
ic.patternStack = append(ic.patternStack, patterns)
}
// Rebuild and compile patterns only once at the end
ic.rebuildCurrentPatterns()
return nil
}
// ShouldIgnore checks if the given path should be ignored based on the current patterns.
// Returns true if the path matches any ignore pattern, false otherwise.
func (ic *IgnoreChecker) ShouldIgnore(ctx context.Context, relPath string) bool {
// Handle root/empty path - never ignore
if relPath == "" || relPath == "." {
return false
}
// If no patterns loaded, nothing to ignore
if ic.matcher == nil {
return false
}
matches := ic.matcher.MatchesPath(relPath)
if matches {
log.Trace(ctx, "Scanner: Ignoring entry matching .ndignore", "path", relPath)
}
return matches
}
// loadPatternsFromFolder reads the .ndignore file in the specified folder and returns the patterns.
// If the file doesn't exist, returns an empty slice.
// If the file exists but is empty, returns a pattern to ignore everything ("**/*").
func (ic *IgnoreChecker) loadPatternsFromFolder(ctx context.Context, folder string) []string {
ignoreFilePath := path.Join(folder, consts.ScanIgnoreFile)
var patterns []string
// Check if .ndignore file exists
if _, err := fs.Stat(ic.fsys, ignoreFilePath); err != nil {
// No .ndignore file in this folder
return patterns
}
// Read and parse the .ndignore file
ignoreFile, err := ic.fsys.Open(ignoreFilePath)
if err != nil {
log.Warn(ctx, "Scanner: Error opening .ndignore file", "path", ignoreFilePath, err)
return patterns
}
defer ignoreFile.Close()
lineScanner := bufio.NewScanner(ignoreFile)
for lineScanner.Scan() {
line := strings.TrimSpace(lineScanner.Text())
if line == "" || strings.HasPrefix(line, "#") {
continue // Skip empty lines, whitespace-only lines, and comments
}
patterns = append(patterns, line)
}
if err := lineScanner.Err(); err != nil {
log.Warn(ctx, "Scanner: Error reading .ndignore file", "path", ignoreFilePath, err)
return patterns
}
// If the .ndignore file is empty, ignore everything
if len(patterns) == 0 {
log.Trace(ctx, "Scanner: .ndignore file is empty, ignoring everything", "path", folder)
patterns = []string{"**/*"}
}
return patterns
}
// rebuildCurrentPatterns flattens the pattern stack into currentPatterns and recompiles the matcher.
func (ic *IgnoreChecker) rebuildCurrentPatterns() {
ic.currentPatterns = make([]string, 0)
for _, patterns := range ic.patternStack {
ic.currentPatterns = append(ic.currentPatterns, patterns...)
}
ic.compilePatterns()
}
// compilePatterns compiles the current patterns into a GitIgnore matcher.
func (ic *IgnoreChecker) compilePatterns() {
if len(ic.currentPatterns) == 0 {
ic.matcher = nil
return
}
ic.matcher = ignore.CompileIgnoreLines(ic.currentPatterns...)
}