Nightly YouTube & Odysee Download Script

My internet provider is terrible, so I struggle to watch videos directly on the web. Occasionally it works, but my patience grows thin every time I try. So, of course, I wrote a script that downloads new videos from my favorite channels on both YouTube and Odysee in the middle of the night. Hopefully someone out there will find it useful. It uses youtube-dl (the version posted below actually calls the yt-dlp fork and expects the binary to sit in the same folder as the script), so naturally that'll have to be installed on your system for it to work.

It took me a while to get it just right and work around youtube-dl's quirks. It's tweaked to fit my particular needs: I wanted it to check each channel for a new video and download only the latest one, unless it's already been downloaded. I wanted to be able to set a max number of downloads per night, because my ISP places a separate cap on the data used at nighttime. I also wanted the ability to download certain channels from beginning to end, something I called archive mode. There are also times when I want to grab only videos containing a certain phrase in the title, such as "Full Podcast" or the like, so I added the ability to use a different title filter for each channel, specified in the urls.txt config file. And of course I didn't want it eating up all my hard drive space, so I added a cleanup routine that gets rid of anything older than two weeks. Depending on your usage, you may want to tweak this value in the settings (top of the script).
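
The two-week cleanup leans on GNU find's -newermt test, so it can be worth previewing what would be removed before trusting it with real files. Here is a minimal dry-run sketch, not part of the script itself: the function name list_stale_videos and the throwaway demo folder are made up for illustration, and it assumes GNU find, date, and touch.

```bash
#!/bin/bash
# Hypothetical helper: list (but do not delete) video files that were last
# modified before the cutoff date. Assumes GNU find.
list_stale_videos() {
    local folder="$1" cutoff="$2"   # cutoff in YYYYMMDD form, like $CLEANUPDATE
    find "$folder" -type f \
        -regextype posix-egrep -regex '.*(\.mp4|\.webm|\.part)' \
        ! -newermt "$cutoff" \
        -print
}

# Demo in a throwaway folder: one file dated three weeks back, one fresh.
demo=$(mktemp -d)
touch -d '3 weeks ago' "$demo/old.mp4"
touch "$demo/new.mp4"
# Only old.mp4 is older than the two-week cutoff, so only it is listed.
list_stale_videos "$demo" "$(date --date='-2 weeks' +%Y%m%d)"
rm -rf "$demo"
```

Once the listing looks right, swapping -print for a delete action gives the same behavior as the real cleanup step.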

Lastly, I wanted a .m3u playlist to be generated every day, just for the videos that were downloaded that day. And because I share the Videos folder over the network using minidlna, allowing me to watch using the Roku media player app, I also had to make it symlink each video file into the folders where the playlists are stored. (You'll see what I mean if you try the script.)
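
To make the playlist-plus-symlinks idea concrete, here is a stripped-down sketch for a single day. It is only an illustration under some assumptions: the function name build_day_playlist is invented, the folder layout (one subfolder per channel) mirrors what the script produces by default, and it needs GNU find and GNU date.

```bash
#!/bin/bash
# Hypothetical sketch: build one day's playlist and a symlink per entry.
# $1 = folder holding one subfolder per channel, $2 = day as YYYYMMDD.
build_day_playlist() {
    local shows="$1" day="$2" next out video
    next=$(date --date="$day +1 day" +%Y%m%d)
    out="$shows/$day"
    mkdir -p "$out" || return 1
    echo '#EXTM3U' > "$out/$day.m3u8"

    # Playlist entries: videos whose modification time falls on that day.
    find "$shows" -type f -name '*.mp4' \
        -newermt "$day" ! -newermt "$next" >> "$out/$day.m3u8"

    # One symlink per entry, named <channel folder>-<file name>, so a DLNA
    # client browsing the day's folder sees the videos themselves.
    grep -v '^#' "$out/$day.m3u8" | while IFS= read -r video; do
        ln -s -f "$video" "$out/$(basename "$(dirname "$video")")-$(basename "$video")"
    done
}
```

Calling build_day_playlist "$HOME/Videos/Internet-Shows" 20250219 would leave a 20250219 folder containing the playlist and one link per video downloaded that day.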

Honestly, I've found that 'browsing' internet videos this way is very satisfying. No ads, no recommended videos or other distractions, just the content I want to see, all immediately accessible -- and something new every day. Even if we ever get Starlink in our area, I am pretty sure I'll keep using this script, especially since I don't have a YouTube account.

Let's do it

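To use the script, first make a folder at $HOME/.config/download-video-subs, or else tweak the $CONFIGDIR variable to point wherever you want your config to be stored. (The version of the script posted below has no $CONFIGDIR; it keeps its files next to the script via $SCRIPTDIR, so point URLSFILE and ARCHIVEFILE at your config folder instead.)

Next, create a file in the $CONFIGDIR called urls.txt (the script below looks for urls-sean.txt, so either rename the file or edit URLSFILE). The syntax should be something like this: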


#URL[;Title Filter][;Number of playlist items to check][;Format Override]

#8-bit Guy - regular "just fetch the latest episode, and delete when the video is older than the date specified by $FIRSTDATE"
https://www.youtube.com/channel/UC8uT9cgJorJPWu7ITLGo9Ww/videos

#Audiotree - title must contain "Full Session", check back 10 playlist items each script run rather than the default 5
https://www.youtube.com/channel/UCWjmAUHmajb1-eo5WKk_22A;Full Session;10

#Dave Smith - Part of the Problem
https://www.youtube.com/channel/UCEfe80CP2cs1eLRNQazffZw/videos

###Epic Family Road Trip - with format override (need better video quality for this channel)
https://www.youtube.com/channel/UC1Az_80tfW-1uEQBlUGXnww/videos;;;[height>=720]/best

#3Blue1Brown - Odysee channel example
https://odysee.com/@3Blue1Brown:b
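
Each line is split on semicolons, and any field left empty falls back to a default. A rough sketch of that splitting, in case the syntax above is unclear: the function name parse_url_line is made up, and the defaults are copied from the settings at the top of the script.

```bash
#!/bin/bash
# Hypothetical sketch: split one urls.txt line into its four fields and
# fill in defaults for the ones left empty.
DEFAULT_COUNT=5
DEFAULT_FORMAT='best[height<=720]/best[height<=1080]'

parse_url_line() {
    local url filter count format
    # IFS=';' makes read split on semicolons; empty fields stay empty.
    IFS=';' read -r url filter count format <<EOF
$1
EOF
    echo "url=$url"
    echo "filter=${filter:-.*}"
    echo "count=${count:-$DEFAULT_COUNT}"
    echo "format=${format:-$DEFAULT_FORMAT}"
}

# Title filter and item count given, format left at its default.
parse_url_line 'https://www.youtube.com/channel/UCWjmAUHmajb1-eo5WKk_22A;Full Session;10'
# Only the format overridden; the two empty fields fall back to defaults.
parse_url_line 'https://www.youtube.com/channel/UC1Az_80tfW-1uEQBlUGXnww/videos;;;[height>=720]/best'
```

This is why the format-override example needs three semicolons in a row: the title filter and item count are still there, just empty.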

Then, download the script below into ~/bin or wherever you keep your scripts, and of course chmod +x it. After doing some test runs, you can add it to your personal crontab file.
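
For the crontab step, something along these lines prints a suitable entry. Treat it as a sketch: the function name, the script file name download-video-subs.sh, the log path, and the 2:30 a.m. start time are all assumptions to adapt.

```bash
#!/bin/bash
# Hypothetical sketch: print a crontab line that runs the script nightly.
nightly_cron_line() {
    local script="$1" log="$2"
    # Fields: minute hour day-of-month month day-of-week command
    printf '30 2 * * * %s >> %s 2>&1\n' "$script" "$log"
}

# Script name and log location are placeholders.
nightly_cron_line "$HOME/bin/download-video-subs.sh" "$HOME/download-video-subs.log"
```

Paste the printed line into the editor opened by crontab -e. Redirecting the output to a log file makes it easier to see the next morning which channels were fetched.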

As always, please examine the script to understand what it does before you run it. Enjoy!


#!/bin/bash

#This script will download one video for each channel in a list. See below for settings.
#After downloading up to $MAXDOWNLOADS videos, m3u8 playlists as well as folders full of symlinks are generated
#for all of the videos downloaded on that day. This script is meant to be run once per day
#(e.g. in the middle of the night). Downloads and playlists older than $CLEANUPDATE are cleaned up.
#You can pass "--skip-downloads" if you just want to regenerate the playlists without downloading anything,
#and/or "--skip-playlists" to skip building the per-day playlists.

#Your urls.txt file should take the following format... one URL for each line (begin comment lines with a #):
#
#    URL[;Title Filter][;Number of playlist items to check][;Format Override]

SCRIPTDIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"

#YTDL=/usr/local/bin/yt-dlp
YTDL="$SCRIPTDIR/yt-dlp"
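#Bail out with a message (rather than silently) if the downloader is missing
[[ -x "$YTDL" ]] || { echo "Downloader not found or not executable: $YTDL" >&2; exit 1; }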

SKIPDOWNLOADS=false
SKIPPLAYLISTS=false
DOWNLOADSBYDATE=false

while test $# -gt 0
do
    case "$1" in
        --skip-downloads) SKIPDOWNLOADS=true
            ;;
        --do-downloads) SKIPDOWNLOADS=false
            ;;
        --skip-playlists) SKIPPLAYLISTS=true
            ;;
        --do-playlists) SKIPPLAYLISTS=false
            ;;
        --downloads-by-date) DOWNLOADSBYDATE=true
            ;;
        --downloads-by-playlist) DOWNLOADSBYDATE=false
            ;;
        --*) echo "bad option $1"
            ;;
        *) echo "argument $1"
            ;;
    esac
    shift
done

#File with URLS to check
URLSFILE="$SCRIPTDIR/urls-sean.txt"
#Archive file which keeps track of already downloaded videos
ARCHIVEFILE="$SCRIPTDIR/archive-$HOSTNAME.txt"
#Temp file, used to keep track of total downloads
TEMPFILE="/tmp/`basename "$0"`.$$"
#Where to store regular downloads
SHOWSFOLDER="$HOME/Videos/Internet-Shows"
#where to put generated playlists and symlinks
PLAYLISTFOLDER="$SHOWSFOLDER"
#Don't download videos older than this
FIRSTDATE="`date --date='-1 month' +%Y%m%d`"
#Don't download anything published after this date
LASTDATE="`date +%Y%m%d`"
#Don't keep videos longer than this
CLEANUPDATE="`date --date='-2 weeks' +%Y%m%d`"
#Filename template (see youtube-dl docs)
if [[ $DOWNLOADSBYDATE = false ]]; then
    FILETEMPLATE="%(playlist)s/%(upload_date)s-%(title)s.%(ext)s"
else
    FILETEMPLATE="$LASTDATE/%(playlist)s-%(title)s.%(ext)s"
fi
#Downloads per channel per script execution.
URLDOWNLOADS=1
#Total downloads per script execution.
MAXDOWNLOADS=15
#Check back this many videos in the playlist (can be overridden in the urls.txt file)
PLAYLISTEND=5
#see youtube-dl docs for valid format strings
FORMAT="best[height<=720]/best[height<=1080]"
#don't download currently live videos
FILTER="!is_live"
#Skip videos larger than this (850M = roughly four hours)
MAXFILESIZE=1024M
#Root URL For serving via HTTP
SHOWSROOTURL="http://$(hostname)/internet-shows"
#File types to clean up
DOWNLOADCLEANUPFILETYPES="(\.mp4|\.m4v|\.webm|\.part|\.md|\.jpeg|\.jpg|\.ytdl|\.vtt)"
PLAYLISTCLEANUPFILETYPES="(\.m3u|\.m3u8)"


#Start all the downloadin'
if [[ $SKIPDOWNLOADS = false ]]; then
    echo "Starting Downloads..."
    echo "" > "$TEMPFILE" || exit 1
    mkdir -p "$SHOWSFOLDER" || exit 1   

    grep -vE '^(\s*$|#)' "$URLSFILE" | while IFS=';' read -ra LINE; do
        URL=${LINE[0]}
        TITLEFILTER=${LINE[1]}
        if [[ "$TITLEFILTER" = "" ]]; then
            TITLEFILTER=".*"
        fi
        URLPLAYLISTEND=${LINE[2]}
        if [[ "$URLPLAYLISTEND" = "" ]]; then
            URLPLAYLISTEND=$PLAYLISTEND
        fi
        URLFORMAT=${LINE[3]}
        if [[ "$URLFORMAT" = "" ]]; then
            URLFORMAT="$FORMAT"
        fi

        "$YTDL" \
            --socket-timeout 30 \
            --download-archive "$ARCHIVEFILE" \
            --dateafter "$FIRSTDATE" \
            --max-downloads $URLDOWNLOADS \
            --max-filesize $MAXFILESIZE \
            --playlist-end $URLPLAYLISTEND \
            --match-filter "$FILTER" \
            --match-title "$TITLEFILTER" \
            --output "$SHOWSFOLDER/$FILETEMPLATE" \
            --restrict-filenames \
            --format "$URLFORMAT" \
            --no-progress \
            --no-mtime \
            --ignore-errors \
            --no-overwrites \
            --continue \
            --force-ipv4 \
            "${URL// /}" | tee -a "$TEMPFILE"

    #Optional extras (move above the URL line to enable):
    #       --write-sub --write-auto-sub --sub-lang "en.*" \
    #       --simulate --verbose \

        #Count finished downloads so far across all channels
        TOTALDOWNLOADS=$(grep -o 'Download completed' "$TEMPFILE" | wc -l)
        echo "Total downloads so far: $TOTALDOWNLOADS"
        if [ "$TOTALDOWNLOADS" -ge "$MAXDOWNLOADS" ]; then
            echo "Max downloads ($MAXDOWNLOADS) reached."
            break
        fi
    done

    #Clean up downloads folder
    echo "Cleaning Up Old Downloads..."
    find "$SHOWSFOLDER" \
        -type f \
        -regextype posix-egrep -regex ".*$DOWNLOADCLEANUPFILETYPES" \
        ! -newermt "$CLEANUPDATE" \
        -exec bash -c 'echo Removing "$0" && rm "$0"' {} \;

    #Remove empty directories
    find "$SHOWSFOLDER" -empty -type d -delete
else
    echo "Skipped Downloads"
fi

#Build playlists
if [[ $SKIPPLAYLISTS = false ]]; then
    echo "Building Playlists..."
    mkdir -p "$PLAYLISTFOLDER" || exit 1
    if [[ $DOWNLOADSBYDATE = false ]]; then
        d=$CLEANUPDATE
    else
        d=$LASTDATE
    fi
    while [ $d -le $LASTDATE ]; do
        mkdir -p "$PLAYLISTFOLDER/$d" || exit 1

        echo '#EXTM3U' > "$PLAYLISTFOLDER/$d/$d.m3u8"

        #Build .m3u8 playlist from download date (file modified time)
        find "$SHOWSFOLDER" \
            -type f \
            -iname "*.mp4" \
            -newermt "$d" ! -newermt "`date --date="$d +1 day" +%Y%m%d`" >> "$PLAYLISTFOLDER/$d/$d.m3u8"

        #Make a symlink for each playlist entry
        #(useful if you share this folder via DLNA or SMB to your Roku,
        #and unnecessary if the downloads are grouped by date)
        if [[ $DOWNLOADSBYDATE = false ]]; then
            #IFS= and -r keep each path intact; the quoted $(...) calls survive spaces in paths
            while IFS= read -r LINE; do
                [[ "$LINE" != "#EXTM3U" ]] && ln -s -f "$LINE" "$PLAYLISTFOLDER/$d/$(basename "$(dirname "$LINE")")-$(basename "$LINE")"
            done < "$PLAYLISTFOLDER/$d/$d.m3u8"
        fi

        #Create an HTTP-served version of the playlist.
        sed "s#${SHOWSFOLDER}#${SHOWSROOTURL}#g" "$PLAYLISTFOLDER/$d/$d.m3u8" > "$PLAYLISTFOLDER/$d/$d-http.m3u8"

        #If the downloads and playlists are in the same folder, use relative paths.
        if [[ "$SHOWSFOLDER" = "$PLAYLISTFOLDER" ]]; then
            sed -i "s#${SHOWSFOLDER}#\.\.#g" "$PLAYLISTFOLDER/$d/$d.m3u8"
        fi

        d=$(date --date="$d +1 day" +%Y%m%d)
    done

    echo "Cleaning Up Old Playlists..."
    #Clean up playlists - by filename
    find "$PLAYLISTFOLDER" \
        -type f \
        -regextype posix-egrep -regex ".*\/[0-9]{8}[^/]*$PLAYLISTCLEANUPFILETYPES" \
        -exec bash -c 'fn=${0##*/}; d=${fn:0:8}; [[ $d -lt $1 ]] && echo Removing "$0" && rm "$0"' {} $CLEANUPDATE \;

    #Clean up - by modified date (if no date in filename)
    find "$PLAYLISTFOLDER" \
        -type f \
        -regextype posix-egrep -regex ".*$PLAYLISTCLEANUPFILETYPES" \
        ! -newermt "$CLEANUPDATE" \
        -exec bash -c 'echo Removing "$0" && rm "$0"' {} \;

    #Remove Empty Directories and broken Symlinks
    find -L "$PLAYLISTFOLDER" -type l -delete
    find "$PLAYLISTFOLDER" -empty -type d -delete
else
    echo "Skipped Building Playlists"
fi

echo "Done."

[[ -f "$TEMPFILE" ]] && rm -f "$TEMPFILE"
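
One part worth spelling out is the nightly cap: the downloader's output is copied into a temp file, and after each channel the script counts how many downloads have finished. A minimal sketch of that check follows, with a made-up function name (cap_reached) and a made-up log. Like the script, it assumes the downloader prints one "Download completed" line per finished video when run with --no-progress.

```bash
#!/bin/bash
# Hypothetical sketch of the nightly cap. Succeeds (returns 0) once the log
# holds at least $2 "Download completed" lines.
cap_reached() {
    local log="$1" max="$2" total
    # grep -c exits non-zero when it counts 0 matches, hence the "|| true".
    total=$(grep -c 'Download completed' "$log" || true)
    echo "Total downloads so far: $total"
    [ "$total" -ge "$max" ]
}

# Demo with a made-up log file containing two finished downloads.
log=$(mktemp)
printf '%s\n' '[download] Download completed' 'some other line' \
    '[download] Download completed' > "$log"
if cap_reached "$log" 2; then
    echo "Max downloads reached."
fi
rm -f "$log"
```

Because the count is only checked between channels, a run can finish the channel it is on before stopping, which is fine when each channel fetches a single video.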



Modified Wednesday, February 19, 2025