{"id":10548,"date":"2023-12-13T21:16:00","date_gmt":"2023-12-14T02:16:00","guid":{"rendered":"https:\/\/www.rushworth.us\/lisa\/?p=10548"},"modified":"2023-12-26T19:55:34","modified_gmt":"2023-12-27T00:55:34","slug":"bulk-download-of-youtube-videos-from-channel","status":"publish","type":"post","link":"https:\/\/www.rushworth.us\/lisa\/?p=10548","title":{"rendered":"Bulk Download of YouTube Videos from Channel"},"content":{"rendered":"\n<p>Several years ago, I started recording our Township meetings and posting them to YouTube. This was very helpful &#8212; even our government officials used the recordings to refresh their memory about what happened in a meeting. But it also led people to ask &#8220;why, exactly, are we relying on some random citizen to provide this service? What if they are busy? Or move?!&#8221; &#8230; and the Township created their own channel and posted their meeting recordings. This was a great way to promote transparency; <em>however<\/em>, they&#8217;ve got retention policies. Since we have absolutely been at meetings where it would have been very helpful to know what happened five, ten, forty!! years ago, I expect these videos will be useful far beyond the allotted document retention period. <\/p>\n\n\n\n<p>We decided to keep our channel around as the historical archive of government meeting recordings. Timing is no longer critical &#8212; anyone who wants to see a current meeting can just use the township&#8217;s channel. We have a script that lists all of the videos from the township&#8217;s channel and downloads them &#8212; once I complete back-filling our archive, I will modify the script to <em>stop<\/em> once it reaches a video series we already have. For now, this quick script lists all videos published to a channel and downloads the highest-resolution progressive MP4 stream for each one. 
<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: python; title: ; notranslate\" title=\"\">\n# API key for my Google Developer project\nstrAPIKey = &#039;&lt;CHANGEIT&gt;&#039;\n\n# YouTube channel ID\nstrChannelID = &#039;&lt;CHANGEIT&gt;&#039;\n\nimport os\nfrom time import sleep\nimport urllib\nfrom urllib.request import urlopen\nimport json\nfrom pytube import YouTube\nimport datetime\n\nfrom config import dateLastDownloaded\n\nos.chdir(os.path.dirname(os.path.abspath(__file__)))\nprint(os.getcwd())\n\nstrBaseVideoURL = &#039;https:\/\/www.youtube.com\/watch?v=&#039;\nstrSearchAPIv3URL = &#039;https:\/\/www.googleapis.com\/youtube\/v3\/search?&#039;\n\niStart = 0\t\t# Normally 0 -- set to N to skip the first N files when a batch fails midway\niProcessed = 0\t\t# Just a counter\n\nstrStartURL = f&quot;{strSearchAPIv3URL}key={strAPIKey}&amp;channelId={strChannelID}&amp;part=snippet,id&amp;order=date&amp;maxResults=50&quot;\nstrYoutubeURL = strStartURL\n\nwhile True:\n    inp = urllib.request.urlopen(strYoutubeURL)\n    resp = json.load(inp)\n\n    for i in resp&#x5B;&#039;items&#039;]:\n        if i&#x5B;&#039;id&#039;]&#x5B;&#039;kind&#039;] == &quot;youtube#video&quot;:\n            iDaysSinceLastDownload = datetime.datetime.strptime(i&#x5B;&#039;snippet&#039;]&#x5B;&#039;publishTime&#039;], &quot;%Y-%m-%dT%H:%M:%SZ&quot;) - dateLastDownloaded\n            # If video was posted since last run time, download the video\n            if iDaysSinceLastDownload.days &gt;= 0:\n                strFileName = (i&#x5B;&#039;snippet&#039;]&#x5B;&#039;title&#039;]).replace(&#039;\/&#039;,&#039;-&#039;).replace(&#039; &#039;,&#039;_&#039;)\n                print(f&quot;{iProcessed}\\tDownloading file {strFileName} from {strBaseVideoURL}{i&#x5B;&#039;id&#039;]&#x5B;&#039;videoId&#039;]}&quot;)\n                # Need to retrieve a YouTube object and filter for the *highest* resolution, otherwise we get blurry videos\n                if 
iProcessed &gt;= iStart:\n                    yt = YouTube(f&quot;{strBaseVideoURL}{i&#x5B;&#039;id&#039;]&#x5B;&#039;videoId&#039;]}&quot;)\n                    yt.streams.filter(progressive=True, file_extension=&#039;mp4&#039;).order_by(&#039;resolution&#039;).desc().first().download(filename=f&quot;{strFileName}.mp4&quot;)\n                    sleep(90)\n                iProcessed = iProcessed + 1\n    try:\n        next_page_token = resp&#x5B;&#039;nextPageToken&#039;]\n        strYoutubeURL = strStartURL + &#039;&amp;pageToken={}&#039;.format(next_page_token)\n        print(f&quot;Now getting next page from {strYoutubeURL}&quot;)\n    except KeyError:\n        # No nextPageToken in the response means this was the last page\n        break\n\n# Update config.py with last run date\nf = open(&quot;config.py&quot;,&quot;w&quot;)\nf.write(&quot;import datetime\\n&quot;)\nf.write(f&quot;dateLastDownloaded = datetime.datetime({datetime.datetime.now().year},{datetime.datetime.now().month},{datetime.datetime.now().day},0,0,0)&quot;)\nf.close()\n\n<\/pre><\/div>","protected":false},"excerpt":{"rendered":"<p>Several years ago, I started recording our Township meetings and posting them to YouTube. This was very helpful &#8212; even our government officials used the recordings to refresh their memory about what happened in a meeting. 
But it also led people to ask &#8220;why, exactly, are we relying on some random citizen to provide this &hellip;<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1945],"tags":[664,1938,1939],"class_list":["post-10548","post","type-post","status-publish","format-standard","hentry","category-python","tag-python","tag-youtube","tag-youtube-api"],"_links":{"self":[{"href":"https:\/\/www.rushworth.us\/lisa\/index.php?rest_route=\/wp\/v2\/posts\/10548","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.rushworth.us\/lisa\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.rushworth.us\/lisa\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.rushworth.us\/lisa\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.rushworth.us\/lisa\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=10548"}],"version-history":[{"count":3,"href":"https:\/\/www.rushworth.us\/lisa\/index.php?rest_route=\/wp\/v2\/posts\/10548\/revisions"}],"predecessor-version":[{"id":10573,"href":"https:\/\/www.rushworth.us\/lisa\/index.php?rest_route=\/wp\/v2\/posts\/10548\/revisions\/10573"}],"wp:attachment":[{"href":"https:\/\/www.rushworth.us\/lisa\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=10548"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.rushworth.us\/lisa\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=10548"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.rushworth.us\/lisa\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=10548"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}