
Nightly YouTube & Odysee Download Script

Since my internet provider is so terrible, I struggle to watch videos directly on the web. Occasionally it works, but my patience wears thin every time I try. So, of course, I wrote a script that downloads new videos from my favorite channels on both YouTube and Odysee in the middle of the night. Hopefully someone out there will find it useful. It uses youtube-dl, so naturally that has to be installed on your system for it to work.

It took me a while to get it just right and work around youtube-dl's quirks. It's tweaked to fit my particular needs: I wanted it to check each channel for a new video and download only the latest one, unless it's already been downloaded. I wanted to be able to set a maximum number of downloads per night, because my ISP places a separate cap on the data used at nighttime. I also wanted the ability to download certain channels from beginning to end, something I call archive mode. There are also times when I want to grab only videos containing a certain phrase in the title, such as "Full Podcast" or the like, so I added the ability to use a different title filter for each channel, specified in the urls.txt config file. And of course I didn't want it eating up all my hard drive space, so I added a cleanup routine that gets rid of anything older than two weeks. Depending on your usage, you may want to tweak this value (the $FIRSTDATE setting at the top of the script).
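
To make the two-week rule concrete, here is a minimal sketch of the date comparison it boils down to. The `is_expired` helper is a made-up name for illustration and is not part of the script, and the `date --date` syntax assumes GNU date, which the script itself also relies on. The script applies this kind of comparison to the dates at the start of playlist filenames; video files are checked by modification time instead.

```shell
#!/bin/bash
# Hypothetical helper: succeeds when an 8-digit YYYYMMDD date is older than the cutoff
is_expired() {
    local file_date=$1 cutoff=$2
    [[ $file_date -lt $cutoff ]]
}

# Same cutoff the script stores in $FIRSTDATE (requires GNU date)
cutoff=$(date --date='-2 weeks' +%Y%m%d)
today=$(date +%Y%m%d)

if is_expired "$today" "$cutoff"; then
    echo "Files dated $today would be removed"
else
    echo "Files dated $today are kept (cutoff: $cutoff)"
fi
```

Because YYYYMMDD dates sort the same way as the numbers they spell, a plain integer comparison is enough and no date parsing is needed.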

Lastly, I wanted a .m3u playlist to be generated every day, just for the videos that were downloaded that day. And because I share the Videos folder over the network using minidlna, allowing me to watch using the Roku media player app, I also had to make it symlink each video file into the folders where the playlists are stored. (You'll see what I mean if you try the script.)
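
As a rough sketch of the symlink idea, the snippet below builds a one-entry playlist in a throwaway directory and links the video next to it. The `link_name` helper and every path under `$demo` are made up for illustration; only the naming pattern (channel folder, a dash, then the file name) is taken from the script.

```shell
#!/bin/bash
# Hypothetical helper: build the flat symlink name "<channel folder>-<file name>"
link_name() {
    local path=$1
    printf '%s-%s\n' "$(basename "$(dirname "$path")")" "$(basename "$path")"
}

# Demo in a throwaway directory (all names below are invented)
demo=$(mktemp -d)
mkdir -p "$demo/Internet-Shows/Some_Channel" "$demo/Daily-Playlists/20240101"
video="$demo/Internet-Shows/Some_Channel/20231230-Some_Episode.mp4"
touch "$video"

# An .m3u playlist here is just a list of absolute paths, one per line
playlist="$demo/Daily-Playlists/20240101/20240101-All-Downloads.m3u"
printf '%s\n' "$video" > "$playlist"

# Link each playlist entry into the folder that holds the playlist
while IFS= read -r entry; do
    ln -s -f "$entry" "$(dirname "$playlist")/$(link_name "$entry")"
done < "$playlist"

ls "$demo/Daily-Playlists/20240101"
# Remove the demo directory with: rm -rf "$demo"
```

Prefixing the channel folder keeps the links unique when two channels happen to publish files with the same name on the same day.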

Honestly, I've found that 'browsing' internet videos this way is very satisfying. No ads, no recommended videos or other distractions, just the content I want to see, all immediately accessible -- and something new every day. Even if we ever get Starlink in our area, I am pretty sure I'll keep using this script, especially since I don't have a YouTube account.

Let's do it

To use the script, first make a folder at $HOME/.config/download-yt-subs or else tweak the $CONFIGDIR variable to point wherever you want your config to be stored.
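
In shell terms, that first step might look like the sketch below. It assumes the default location; the `CONFIGDIR` override shown here only affects this snippet, so if you pick another folder you still need to edit the `CONFIGDIR` line in the script itself.

```shell
#!/bin/bash
# Default location from the post; set CONFIGDIR beforehand to try another one
CONFIGDIR="${CONFIGDIR:-$HOME/.config/download-yt-subs}"

mkdir -p "$CONFIGDIR" || exit 1

# Create an empty urls.txt only if there isn't one yet
[ -f "$CONFIGDIR/urls.txt" ] || : > "$CONFIGDIR/urls.txt"

echo "Config folder ready: $CONFIGDIR"
```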

Next, create a file in the $CONFIGDIR called urls.txt. The syntax should be something like this:

urls.txt


#URL[;Title Filter][;Number of playlist items to check]

#8-bit Guy - regular "just fetch the latest episode, and delete when the video is older than the date specified by $FIRSTDATE"
https://www.youtube.com/channel/UC8uT9cgJorJPWu7ITLGo9Ww

#Audiotree - title must contain "Full Session", check back 10 playlist items each script run rather than the default 5
https://www.youtube.com/channel/UCWjmAUHmajb1-eo5WKk_22A;Full Session;10

#Gaming Historian - 9999 = archive mode, i.e. start from the beginning and download every video, and don't clean up
https://www.youtube.com/channel/UCnbvPS_rXp4PC21PG2k1UVg;;9999

#3Blue1Brown - Odysee channel example
https://odysee.com/@3Blue1Brown:b

#HexDSL - Bitchute playlists work too
https://www.bitchute.com/channel/hexdsl/
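
To show how one of these lines breaks down into its three fields, here is a small sketch that mirrors the parsing done in the script. The `parse_entry` helper and the example.com URLs are made up for illustration; the semicolon splitting and the defaults (".*" for the title filter, 5 for the number of playlist items) follow the script's settings.

```shell
#!/bin/bash
# Hypothetical helper mirroring the script's parsing: prints "url|filter|count"
parse_entry() {
    local fields
    # Split on semicolons; an empty middle field (as in "url;;9999") is preserved
    IFS=';' read -ra fields <<< "$1"
    local url="${fields[0]// /}"       # drop stray spaces, as the script does with tr
    local filter="${fields[1]:-.*}"    # no filter given: match every title
    local count="${fields[2]:-5}"      # 5 is the script's default PLAYLISTEND
    printf '%s|%s|%s\n' "$url" "$filter" "$count"
}

parse_entry 'https://example.com/channel/A'
parse_entry 'https://example.com/channel/B;Full Session;10'
parse_entry 'https://example.com/channel/C;;9999'
```

The last line is the archive-mode form: the title filter is left empty between the two semicolons so that 9999 still lands in the third field.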

Then, download the script below into ~/bin or wherever you keep your scripts, and of course chmod +x it. After doing some test runs, you can add it to your personal crontab file.
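
For the crontab step, a sketch like the following prints a line you could paste in with `crontab -e`. The install path, the 2:30 AM start time, and the log file are all assumptions for illustration; the script does not create a log file on its own.

```shell
#!/bin/bash
# Assumed install location and log file; adjust both to taste
SCRIPT="$HOME/bin/download-yt-subs.sh"
LOGFILE="$HOME/download-yt-subs.log"

# Standard five-field crontab entry: minute hour day-of-month month day-of-week command
CRONLINE="30 2 * * * $SCRIPT >> $LOGFILE 2>&1"

echo "Add this line with 'crontab -e':"
echo "$CRONLINE"
```

Cron usually runs jobs with a minimal PATH, so if youtube-dl lives somewhere unusual, the check at the top of the script may exit before anything is downloaded.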

As always, please examine the script to understand what it does before you run it. Enjoy!

download-yt-subs.sh


#!/bin/bash

#This script will download one video for each channel in a list. See below for settings.
#After downloading up to $MAXDOWNLOADS videos, m3u playlists as well as folders full of symlinks are generated
#for all of the videos downloaded on that day. This script is meant to be run once per day
#(i.e. in the middle of the night). Playlists older than $FIRSTDATE are cleaned up.
#You can pass "--skip-downloads" as $1 if you just want to regenerate the playlists without downloading anything.

#Bail out early if youtube-dl isn't installed
command -v youtube-dl > /dev/null || { echo "youtube-dl not found in PATH" >&2; exit 1; }

#Place your config file here. The archive file will also be placed here.
CONFIGDIR="$HOME/.config/download-yt-subs"
#File with URLS to check
URLSFILE="$CONFIGDIR/urls.txt"
#Archive file which keeps track of already downloaded videos
ARCHIVEFILE="$CONFIGDIR/archive-$HOSTNAME.txt"
#Temp file, used to keep track of total downloads
TEMPFILE="/tmp/`basename "$0"`.$$"
#Where to store regular downloads
SHOWSFOLDER="$HOME/Videos/Internet-Shows"
#Where to store archive-mode downloads (this folder doesn't get cleaned up)
ARCHIVEFOLDER="$HOME/Videos/Internet-Archives"
#where to put generated playlists and symlinks
PLAYLISTFOLDER="$HOME/Videos/Daily-Playlists"
#Don't download or keep videos older than this (unless in archive mode)
FIRSTDATE="`date --date='-2 weeks' +%Y%m%d`"
#Don't download anything published after this date
LASTDATE="`date +%Y%m%d`"
#Filename template (see youtube-dl docs)
FILETEMPLATE='%(playlist)s/%(upload_date)s-%(title)s.%(ext)s'
#Downloads per channel per script execution.
URLDOWNLOADS=1
#Total downloads per script execution.
MAXDOWNLOADS=15
#Check back this many videos in the playlist (can be overridden in the urls.txt file)
PLAYLISTEND=5
#see youtube-dl docs for valid format strings
FORMAT="[height<=480]/worst"
#don't download currently live videos
FILTER="!is_live"
#700M = roughly three hours twenty minutes
MAXFILESIZE=700M

#Start all the downloadin'
mkdir -p "$SHOWSFOLDER" || exit 1
mkdir -p "$ARCHIVEFOLDER" || exit 1
echo "" > "$TEMPFILE" || exit 1
if [ "$1" != "--skip-downloads" ]; then
grep -vE '^(\s*$|#)' "$URLSFILE" | while IFS=';' read -ra LINE; do
    URL=${LINE[0]}
    TITLEFILTER=${LINE[1]}
    if [ "$TITLEFILTER" = "" ]; then
        TITLEFILTER=".*"
    fi
    URLPLAYLISTEND=${LINE[2]}
    if [ "$URLPLAYLISTEND" = "" ]; then
        URLPLAYLISTEND=$PLAYLISTEND
    fi
    if [ "$URLPLAYLISTEND" = "9999" ]; then
        URLREVERSE="--playlist-reverse"
        URLFIRSTDATE="19800101"
        URLTARGETFOLDER="$ARCHIVEFOLDER"
    else
        URLREVERSE=""
        URLFIRSTDATE="$FIRSTDATE"
        URLTARGETFOLDER="$SHOWSFOLDER"
    fi

    youtube-dl \
        --socket-timeout 15 \
        --download-archive "$ARCHIVEFILE" \
        --dateafter "$URLFIRSTDATE" \
        --max-downloads $URLDOWNLOADS \
        --max-filesize $MAXFILESIZE \
        --playlist-end $URLPLAYLISTEND \
        --match-filter "$FILTER" \
        --match-title "$TITLEFILTER" \
        --output "$URLTARGETFOLDER/$FILETEMPLATE" \
        --restrict-filenames \
        --format "$FORMAT" \
        --no-progress \
        --no-mtime \
        --ignore-errors \
        --force-ipv4 \
        $URLREVERSE \
        "$(echo "$URL" | tr -d ' ')" | tee -a "$TEMPFILE"
    #To test without downloading, add "--simulate --verbose \" to the options above

    #Count the "Download completed" lines youtube-dl has logged so far
    TOTALDOWNLOADS=$(grep -c 'Download completed' "$TEMPFILE")
    echo "Total downloads so far: $TOTALDOWNLOADS"
    if [ $TOTALDOWNLOADS -ge $MAXDOWNLOADS ]; then
        echo "Max downloads ($MAXDOWNLOADS) reached."
        break
    fi
done
fi

#Clean up playlists - by filename
#(create the folder first so find doesn't complain on the very first run)
mkdir -p "$PLAYLISTFOLDER" || exit 1
find "$PLAYLISTFOLDER" \
    -type f \
    -regextype posix-egrep -regex ".*\/[0-9]{8}[^/]*(\.m3u)" \
    -exec bash -c 'fn=${0##*/}; d=${fn:0:8}; [[ $d -lt $1 ]] && echo Removing "$0" && rm "$0"' {} "$FIRSTDATE" \;

#Clean up - by modified date (if no date in filename)
find "$SHOWSFOLDER" \
    -type f \
    -regextype posix-egrep -regex ".*(\.mp4|\.webm)" \
    ! -newermt "$FIRSTDATE" \
    -exec bash -c 'echo Removing "$0" && rm "$0"' {} \;

#Remove Empty Directories and broken Symlinks
find "$SHOWSFOLDER" -empty -type d -delete
find -L "$PLAYLISTFOLDER" -type l -delete
find "$PLAYLISTFOLDER" -empty -type d -delete


#Build playlists
echo "Building Playlists..."
mkdir -p "$PLAYLISTFOLDER" || exit 1
d=$FIRSTDATE
while [ $d -le $LASTDATE ]; do
    mkdir -p "$PLAYLISTFOLDER/$d" || exit 1

    #Build .m3u playlist from upload date (built into filename)
    #(I didn't find this useful, but uncomment if you'd like this feature)
#   find "$SHOWSFOLDER" \
#       -type f \
#       -iname "*.mp4" \
#       -name "$d*" > "$PLAYLISTFOLDER/$d/$d-Uploads.m3u"

    #Build .m3u playlist from download date (file modified time)
    find "$SHOWSFOLDER" \
        -type f \
        -iname "*.mp4" \
        -newermt "$d" ! -newermt "`date --date="$d +1 day" +%Y%m%d`" > "$PLAYLISTFOLDER/$d/$d-All-Downloads.m3u"
    find "$ARCHIVEFOLDER" \
        -type f \
        -iname "*.mp4" \
        -newermt "$d" ! -newermt "`date --date="$d +1 day" +%Y%m%d`" >> "$PLAYLISTFOLDER/$d/$d-All-Downloads.m3u"

    #Make a symlink for each playlist entry
    #(useful if you share this folder via DLNA or SMB to your Roku)
    while IFS= read -r LINE; do
        ln -s -f "$LINE" "$PLAYLISTFOLDER/$d/$(basename "$(dirname "$LINE")")-$(basename "$LINE")"
    done < "$PLAYLISTFOLDER/$d/$d-All-Downloads.m3u"

    d=$(date --date="$d +1 day" +%Y%m%d)
done
echo "Done."

#Remove the temp file (rm -f is silent if it's already gone)
rm -f "$TEMPFILE"