I love Dave Ramsey. His shows are streamed live every day on YouTube and then made available to watch there. The “Debt Free Screams” are my favorite part of the show. Because I usually don’t have the time to watch the entire episode, I would like to find those short segments that contain the “Debt Free Screams”.

The Challenge: Search a YouTube video for a specific phrase and return a list of timestamps for when that phrase was spoken.
The Solution: Behold… Daily Debt Free Screams

#! /bin/bash
#
# This script is configured to search for the phrase "in the lobby of", needs more work
# to make the search phrase configurable
#
# Usage: ./getYoutubeTimestampOfPhrase.sh <youtube-url>
# Output: prints a timestamped link for each occurrence of the phrase and appends
# one JSON object per occurrence to out.json, e.g.
# {"id":"9gyLR0OR1jM","timestamp":"5712","title":"The Dave Ramsey Show (07-13-17)"}, 
# {"id":"9gyLR0OR1jM","timestamp":"9314","title":"The Dave Ramsey Show (07-13-17)"}
# 
# Note: youtube-dl must be installed first (https://github.com/rg3/youtube-dl)
#

# First we'll download the captions as a .vtt file
/usr/local/bin/youtube-dl --write-auto-sub --skip-download -o '%(title)s_%(id)s.%(ext)s' "$1" #&> /dev/null

# setup some config variables
youtube="https://www.youtube.com/watch?v="
BASEDIR=$(dirname "$0")
queue_files="${BASEDIR}/*.en.vtt"
json=""

for queue_file in $queue_files; do
  if [[ ! -f "$queue_file" ]]; then
    continue
  fi
  # -o prints only the matched text, so characters 4-11 are always the HH:MM:SS that follows "in<"
  timestamp=$(grep -oE 'in<[[:digit:]:.]*><c> the</c><[[:digit:]:.]*><c> lobby</c><[[:digit:]:.]*><c> of' "$queue_file" | cut -c4-11)
  if [[ ! -z "$timestamp" ]]; then
    newtimes=$(printf %s "$timestamp" | awk -F: '{ print ($1 * 3600) + ($2 * 60) + $3 }')
    for newtime in $newtimes; do
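      # File names look like "<title>_<id>.en.vtt"; titles may contain "." and IDs may contain "_"
      base=$(basename "$queue_file" .en.vtt)
      videoId="${base: -11}"          # YouTube video IDs are 11 characters long
      title="${base%_"$videoId"}"     # everything before the trailing "_<id>"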
      videolink="${youtube}${videoId}&t=${newtime}"
      videoData=$(printf '{"id":"%s","timestamp":"%s","title":"%s"}\n' "$videoId" "$newtime" "$title")
      json="${json}${videoData},"
      echo "$videolink"
    done
  fi
  rm "$queue_file"
done

echo "${json%?}" >> out.json
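
The script's header notes that the search phrase still needs to be made configurable. One possible way to do that is sketched below: a helper that turns a plain phrase into the same kind of pattern the script hard-codes for "in the lobby of". The function name build_caption_regex and the sample caption line are my own inventions for illustration, and the sketch assumes the phrase contains only plain words (no regex metacharacters).

```shell
#!/bin/bash
# Hypothetical helper, not part of the original script: build an extended
# regex for a phrase as it appears in YouTube's word-timed auto-captions.
build_caption_regex() {
  local ts='<[0-9:.]*>' regex="" word i=0
  local -a words
  read -r -a words <<< "$1"       # split the phrase on whitespace
  for word in "${words[@]}"; do
    case $i in
      0) regex="$word" ;;                           # first word: bare text
      1) regex="${regex}${ts}<c> ${word}" ;;        # second word: timestamp tag, then <c>
      *) regex="${regex}</c>${ts}<c> ${word}" ;;    # later words: close the previous <c> first
    esac
    i=$((i + 1))
  done
  printf '%s\n' "$regex"
}

# Made-up caption line, for illustration only
sample='in<01:35:12.345><c> the</c><01:35:12.600><c> lobby</c><01:35:12.900><c> of</c>'
printf '%s\n' "$sample" | grep -oE "$(build_caption_regex "in the lobby of")" | cut -c4-11
# prints 01:35:12
```

With something like this in place, the phrase could be passed as a second argument to the script and the result handed to grep -oE instead of the fixed pattern.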

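The awk step in the script converts each HH:MM:SS caption timestamp into whole seconds, which is what the &t= parameter in the video link expects. As a worked example, 01:35:12 is 1 * 3600 + 35 * 60 + 12 = 5712, the first timestamp in the sample output in the script's header. Here is a rough pure-bash equivalent; the function name to_seconds is hypothetical, and this is an alternative to the awk call rather than what the script itself does.

```shell
#!/bin/bash
# Hypothetical helper: convert an HH:MM:SS timestamp into whole seconds.
to_seconds() {
  local h m s
  IFS=: read -r h m s <<< "$1"    # split the fields on ":"
  # 10# forces base 10 so zero-padded fields such as "08" are not read as octal
  echo $(( 10#$h * 3600 + 10#$m * 60 + 10#$s ))
}

to_seconds "01:35:12"   # prints 5712
```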