this post was submitted on 29 Dec 2025

ShareGPT


You can solve this problem on Linux, but the best approach isn’t a “placeholder file” trick. It is to keep a consistent record of what you’ve already downloaded and to compare each new download against that record before accepting it.

Below are practical solutions with increasing levels of automation.


1) Track Files by Hash (Best General Solution)

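Instead of tracking filenames and paths (which change when you rename or move files), track file content using a cryptographic hash. SHA-256 is a sensible default; MD5 and SHA-1 are also adequate for spotting accidental duplicates. You build a database of hashes for everything you’ve downloaded and, before accepting a new download, compare its hash to the database. Note that a hash only matches byte-identical files: a re-tagged or re-encoded copy of the same song produces a different hash.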

How it works

  1. Every time you add a downloaded file to your archive, compute its hash and append it (hash only, without the filename) to your index:

    sha256sum /path/to/song.mp3 | cut -d' ' -f1 >> ~/download_hashes.txt
    
  2. When a new file arrives, compute its hash:

    sha256sum newfile.mp3
    
  3. Check if the hash exists in your index:

    grep -F "$(sha256sum newfile.mp3 | cut -d' ' -f1)" ~/download_hashes.txt
    
  4. If grep prints a match, the file is already in your archive, so skip it. No output means the file is new.
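
If you already have a library on disk, the index can be seeded in one pass instead of file by file. The following is a minimal sketch, assuming GNU coreutils and findutils; the function name seed_hash_index and its two arguments are illustrative and not part of any existing tool.

```shell
#!/bin/bash
# Sketch: seed (or extend) a hash index from an existing library directory.
# seed_hash_index is a hypothetical helper name.
#   $1 = library directory to scan
#   $2 = index file holding one bare SHA-256 hash per line
seed_hash_index() {
    local library=$1 hashfile=$2 tmp
    tmp=$(mktemp) || return 1
    # Create the index if it does not exist yet so sort can read it.
    touch -- "$hashfile"
    # Hash each file via stdin so the output is always "hash  -",
    # keep only the hash column, and merge with the existing index.
    find "$library" -type f -exec sh -c 'for f; do sha256sum < "$f"; done' sh {} + \
        | cut -d ' ' -f 1 \
        | sort -u - "$hashfile" > "$tmp" \
        && mv -- "$tmp" "$hashfile"
}
```

Usage would look like `seed_hash_index ~/Music/Library ~/download_hashes.txt`. Existing entries are kept, so hashes of files you downloaded earlier and later deleted stay in the index.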

Automating this

You can write a simple Bash script to do the check before you move a file:

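#!/bin/bash
# Move new downloads into the library unless their content is already known.

HASHFILE=~/download_hashes.txt
LIBRARY=~/Music/Library

# Make sure the index exists so grep does not fail on the first run.
touch "$HASHFILE"

for f in "$@"; do
    # Hash the file content; the filename plays no part in the comparison.
    h=$(sha256sum < "$f" | cut -d ' ' -f 1)
    # -x matches the whole line, -F treats the hash as literal text.
    if grep -qxF "$h" "$HASHFILE"; then
        echo "Already downloaded: $f"
    # Record the hash only after the move has succeeded.
    elif mv -- "$f" "$LIBRARY/"; then
        echo "$h" >> "$HASHFILE"
        echo "Added: $f"
    else
        echo "Could not move: $f" >&2
    fi
done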

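Usage (assuming the script is saved as download-check, made executable with chmod +x, and placed somewhere on your PATH):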

download-check file1.mp3 file2.mp3 …

2) Use a File Indexing Tool

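If you prefer a tool-based approach, use a desktop search indexer such as recoll or Tracker. (Beagle, an older tool of the same kind, is no longer maintained.)

  • These tools index metadata and file contents.
  • You can query the index for song titles/artist names.
  • You can script queries against the index to see if a new download already exists.

Searching an index by title or artist is less reliable than comparing hashes, because tags and filenames vary between copies, but it can catch re-encoded copies of the same song that a hash comparison misses.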
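
Each indexer has its own query interface, which is not reproduced here. As a stand-in that shows the same idea (look up a title before accepting a download), the sketch below builds a plain-text list of filenames with GNU find and searches it case-insensitively. The names build_name_index and name_in_index are hypothetical, and this is not how recoll or Tracker store or query their data.

```shell
#!/bin/bash
# build_name_index DIR INDEXFILE
# Write one filename (without its directory) per line. Uses GNU find's -printf.
build_name_index() {
    local library=$1 indexfile=$2
    find "$library" -type f -printf '%f\n' | sort > "$indexfile"
}

# name_in_index QUERY INDEXFILE
# Succeed if any indexed filename contains QUERY, ignoring case.
# -F treats QUERY as literal text rather than a regular expression.
name_in_index() {
    local query=$1 indexfile=$2
    grep -qiF -- "$query" "$indexfile"
}
```

A check such as `name_in_index "blue monday" ~/music_names.txt && echo "possibly already in library"` is only a hint, since different songs can share a title.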


3) Soulseek-Specific Strategies

Soulseek clients don’t typically track moved files automatically because they only see the download directory.

Workarounds:

A) Keep your archive and download directory separate

  • Configure Soulseek to download into a dedicated “incoming” folder.
  • Run your hash script on the files in “incoming”; it moves the new ones into your archive and records their hashes.
  • Soulseek won’t know where the file went, but your system does.

B) Use symlinks/placeholder files

In theory you could leave small “marker” files where the original download was:

song.mp3     -> real file in ~/Music/Library/song.mp3

But this is fragile:

  • You must keep the filename the same.
  • Soulseek may overwrite or delete the placeholder.
  • You’d need Soulseek settings to never replace existing files.

In practice, the hash-based index approach is safer.
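
For completeness, the placeholder idea can be sketched as a move followed by a symlink left at the old path. The helper name move_and_link is hypothetical, realpath is assumed to be the GNU coreutils version, and the sketch cannot guarantee that a given Soulseek client will respect the link.

```shell
#!/bin/bash
# move_and_link FILE DEST_DIR
# Move FILE into DEST_DIR, then leave a symlink with the original name
# pointing at the new location.
move_and_link() {
    local src=$1 destdir=$2 target
    # Use an absolute target so the link works from any working directory.
    target=$(realpath -- "$destdir")/$(basename -- "$src")
    # Refuse to overwrite a library file that already has this name.
    if [ -e "$target" ]; then
        return 1
    fi
    mv -- "$src" "$target" && ln -s -- "$target" "$src"
}
```

The link breaks as soon as the library file is renamed or moved again, which is the fragility described above.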


4) Automatic Deduplication Tools

If you want periodic cleanup rather than manual checking:

fdupes / rdfind / duff

These tools scan directories and identify duplicates by content.

Example (list duplicates inside the library):

fdupes -r ~/Music/Library

To find duplicates across the download folder and the library, pass both directories:

fdupes -r ~/Downloads ~/Music/Library
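
If none of these tools is installed, a rough equivalent can be sketched with GNU coreutils alone: hash every file, sort by hash, and print only the groups whose hash repeats. This is a stand-in and not how fdupes, rdfind, or duff are implemented; the function name list_duplicates is hypothetical.

```shell
#!/bin/bash
# list_duplicates DIR...
# Print "hash  path" lines for files whose content appears more than once,
# grouped by hash with a blank line between groups.
# Assumption: filenames contain no backslashes or newlines (sha256sum
# escapes those, which would shift the 64-character hash column).
list_duplicates() {
    find "$@" -type f -exec sha256sum {} + \
        | sort \
        | uniq -w 64 --all-repeated=separate
}
```

For example, `list_duplicates ~/Downloads ~/Music/Library` lists every file that exists in more than one place. Nothing is deleted, so the output can be reviewed before any cleanup.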


5) A Full Workflow Example

  1. Soulseek downloads to /home/user/Downloads/soulseek-incoming

  2. Run a script:

    • Compute hashes for new files
    • Check against ~/download_hashes.txt
    • Move unique files to /home/user/Music/Library
    • Remove duplicates automatically
  3. Optionally index your library with Tracker/recoll to support search.
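
The workflow above can be sketched as one function that walks the incoming folder. This is a minimal sketch under stated assumptions: the function name process_incoming and its three arguments are hypothetical, the index holds one bare SHA-256 hash per line, only files directly inside the incoming folder are handled (no subfolders), and duplicates are deleted outright, which you may prefer to replace with a move to a review folder.

```shell
#!/bin/bash
# process_incoming INCOMING LIBRARY HASHFILE
# Move files with unseen content into LIBRARY and delete known duplicates.
process_incoming() {
    local incoming=$1 library=$2 hashfile=$3 f h
    mkdir -p -- "$library"
    touch -- "$hashfile"
    for f in "$incoming"/*; do
        # Skip subdirectories and the unexpanded pattern of an empty folder.
        [ -f "$f" ] || continue
        h=$(sha256sum < "$f" | cut -d ' ' -f 1)
        if grep -qxF -- "$h" "$hashfile"; then
            echo "Duplicate, removing: $f"
            rm -- "$f"
        elif [ -e "$library/$(basename -- "$f")" ]; then
            # Same name but different content: leave it for manual review.
            echo "Name clash, skipped: $f"
        elif mv -- "$f" "$library/"; then
            # Record the hash only once the file is safely in the library.
            echo "$h" >> "$hashfile"
            echo "Added: $f"
        fi
    done
}
```

With the paths from the steps above, a call would be `process_incoming /home/user/Downloads/soulseek-incoming /home/user/Music/Library ~/download_hashes.txt`, which could be run by hand or from a scheduled job.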


Summary of Options

Method              Tracks moves?         Automatable  Soulseek-aware
Placeholder files   No (fragile)          Low          Limited
Hash database       Yes                   High         Independent
Indexing tool       Yes                   Moderate     Independent
Deduplication scan  Yes (after the fact)  High         Independent

Recommendation

Use a hash index. It reliably identifies duplicates regardless of filename changes or directory moves, and it can be fully automated with scripts.
