
I think I've gotten a slightly better understanding of what Youtube is up to with this enciphered signature nonsense, and I'm trying a new method of dealing with it.
If you send me the errors printed for any videos that it can't download, that will be very helpful.
I think that what's going on is not that the ciphers are keyed off of the length of the signature, but rather that they are just periodically changing the cipher algorithm, so the only way to know what algorithm to use is to have hardcoded knowledge of what is implemented in whatever version of "html5player.js" is getting loaded today (currently "html5player-vfl_ymO4Z.js").
This means that every time they change the algorithm, I'll have to update the code in youtubedown. I don't know how frequently they're doing that, but that's some bullshit.
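To make the "hardcoded knowledge" idea concrete, here is a minimal sketch in Python (not the Perl that youtubedown is written in) of a table keyed on the player version. The two version IDs are file names mentioned in this thread, but the steps listed for them, the operation names, and the function names are all made up for illustration. They are not the real ciphers.

```python
# Hypothetical table: player version -> list of (operation, argument) steps.
# The steps are invented for this example; only the version IDs are real.
CIPHERS = {
    "vfl_ymO4Z": [("reverse", None), ("swap", 3), ("slice", 2)],
    "vfl39KBj1": [("slice", 1), ("reverse", None)],
}


def apply_cipher(steps, sig):
    """Run each step over the signature's characters, in order."""
    chars = list(sig)
    for op, arg in steps:
        if op == "reverse":
            chars.reverse()
        elif op == "slice":
            # Drop the first `arg` characters.
            chars = chars[arg:]
        elif op == "swap":
            # Exchange the first character with the one at position `arg`.
            i = arg % len(chars)
            chars[0], chars[i] = chars[i], chars[0]
        else:
            raise ValueError(f"unknown operation: {op}")
    return "".join(chars)


def decipher(player_version, sig):
    """Look up the steps for this player version and apply them."""
    steps = CIPHERS.get(player_version)
    if steps is None:
        # This is the failure mode described above: a new player file
        # means a new table entry has to be added by hand.
        raise KeyError(f"no cipher known for html5player-{player_version}.js")
    return apply_cipher(steps, sig)
```

The point of the sketch is the lookup failure: any player version missing from the table cannot be handled until someone adds an entry.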
Maybe there's a way to parse this out from the Javascript, but since they've obfuscated and minified it, the name of the decipherment routine changes.
I still have no idea how the signatures in get_video_info are to be deciphered. If there's a clue in there as to what algorithm is in use, I haven't spotted it.
Thanks for the update -- it's working great here. (Sorry if I sounded ungrateful in the previous posts; I was just trying to be terse and helpful in a "report the bug" kind of way.)
Your efforts are definitely appreciated. :)
Thanks again!
I reported a failure a couple of days ago, and it's now working. Thanks!
Now works with https://www.youtube.com/watch?v=sXQVicNodMw
Great work, thanks a lot !
Are you still using get_video_info to get download links? Could you please explain more
Only some videos use ciphers. First I try get_video_info, but if the signature is enciphered, I scrape the HTML instead. This video has no cipher, so get_video_info works:
This video is enciphered, so I have to scrape the HTML:
And this one is a real problem, because it is both enciphered and marked as age-restricted, so currently my script can't download it at all. In that case, the download URLs do not exist in the HTML but only in get_video_info -- but the ones in get_video_info don't work because of the encipherment:
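The fallback order described above can be sketched roughly like this, in Python rather than the script's Perl. It assumes each stream has already been parsed into a dict, and that a plain signature arrives under a "sig" key while an enciphered one arrives under "s". That key naming and the function names are assumptions made for the example, not something confirmed by the script.

```python
def is_enciphered(stream):
    # Assumption: "sig" holds a plain signature, "s" an enciphered one.
    return "s" in stream and "sig" not in stream


def pick_streams(info_streams, html_streams):
    """Choose where to take download URLs from.

    info_streams: streams parsed from get_video_info
    html_streams: streams scraped from the watch page HTML
                  (empty when the page has none, e.g. age-restricted)
    Returns a (source_name, streams) pair.
    """
    # First choice: get_video_info, but only if nothing is enciphered.
    if info_streams and not any(is_enciphered(s) for s in info_streams):
        return ("get_video_info", info_streams)
    # Second choice: whatever was scraped from the HTML.
    if html_streams:
        return ("html", html_streams)
    # The third case above: enciphered, and the HTML has no URLs.
    raise LookupError("enciphered in get_video_info and no URLs in the HTML")
```

The third branch is the enciphered plus age-restricted case: neither source yields a usable URL, so the sketch can only give up.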
Have you looked at ClickToPlugin's youtube killer before? It's a Safari extension which does a lot of the same things as youtubedown, except (a) it's multi-site and (b) it's centered around presenting videos in-browser in an HTML5 player rather than the site's own Flash-based player. Here's the Youtube-specific part of the source:
https://github.com/hoyois/clicktoplugin/blob/master/ClickToPlugin.safariextension/killers/YouTube.js
I haven't taken the time to understand his implementation of ciphered signatures, but at first glance it appears to be algorithmic rather than a table as in youtubedown, and it works on every enciphered video URL from the youtubedown source that I've tried. Might be worth checking out.
(Sadly, he doesn't appear to believe in comments, whether in code or revision control logs. At least the code looks fairly clean.)
Wow, he is actually parsing out the algorithm by doing a regexp match on the Javascript source. That's some mad science.
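For a rough idea of what regexp-parsing the algorithm could look like, here is a sketch in Python. This is not the ClickToPlugin code. The sample Javascript is a made-up, simplified stand-in for a minified routine, and reading the two-argument helper call as a "swap" step is an assumption made for the example.

```python
import re

# Hypothetical, simplified stand-in for a minified decipher routine.
SAMPLE_JS = (
    'function xK(a){a=a.split("");a=a.reverse();'
    'a=a.slice(2);a=yL(a,39);return a.join("")}'
)

# Match any function that splits its argument into characters first and
# joins them again last. The function name is captured, not assumed.
FUNC_RE = re.compile(
    r'function\s+(\w+)\((\w+)\)\{\2=\2\.split\(""\);(.*?)return \2\.join\(""\)\}'
)

# One statement in the body: a reverse, a slice(N), or a helper(a, N) call.
STEP_RE = re.compile(
    r'\w+=(?:\w+\.(reverse)\(\)|\w+\.(slice)\((\d+)\)|(\w+)\(\w+,(\d+)\))'
)


def parse_cipher(js):
    """Return (function_name, steps) for the first decipher-shaped function."""
    m = FUNC_RE.search(js)
    if m is None:
        raise ValueError("no decipher-shaped function found")
    steps = []
    for stmt in m.group(3).split(";"):
        if not stmt:
            continue
        s = STEP_RE.fullmatch(stmt)
        if s is None:
            raise ValueError(f"unrecognized statement: {stmt}")
        if s.group(1):
            steps.append(("reverse", None))
        elif s.group(2):
            steps.append(("slice", int(s.group(3))))
        else:
            # Assumption: the helper call swaps two characters.
            steps.append(("swap", int(s.group(5))))
    return m.group(1), steps
```

Matching on the shape of the function body rather than its name is what lets this survive the routine being renamed, at least until the body itself changes shape.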
This specific video downloads just fine with the latest youtube-dl, in case that's of any use.
If you have a video that works with youtube-dl but not youtubedown, send me the link and error message.
Well, the second video you mentioned (https://www.youtube.com/watch?v=7wL9NUZRZ4I), which you say youtubedown can't get, is successfully retrieved by youtube-dl:
[youtube] Setting language
[youtube] 7wL9NUZRZ4I: Downloading video webpage
[youtube] 7wL9NUZRZ4I: Downloading video info webpage
[youtube] 7wL9NUZRZ4I: Extracting video information
[youtube] 7wL9NUZRZ4I: Encrypted signatures detected.
As has become my habit, I just want to say thanks for this. You do a lot of great, hair-pulling work to make this thing go, and it's very much appreciated.
I used your mixtapes for another party last weekend! Last time I just featured the audio. This time, I hooked up my biggest screen and aired the videos too! I went through fifteen or sixteen of the mixtapes and compiled my favorite five hours of video.
People loved it! I got a lot of compliments about how cool the videos were, and several requests for my playlist.
I also declared a few new favorite bands (M83, Niki & the Dove).
Thanks!
Awesome!
I eagerly await the version of your script that implements a JavaScript interpreter in Perl in order to evaluate Google's cipher.
http://search.cpan.org/~jesse/WWW-Mechanize-1.72/lib/WWW/Mechanize/FAQ.pod#Which_modules_work_like_Mechanize_and_have_JavaScript_support?
Gr8. Thank U.
I was intrigued by the idea of finding the cipher so I hacked this together.
It's not as fancy as the Safari plugin and requires mucking up the source, but it does find the cipher function by executing all functions found in html5player.js. I think as long as they use something like call or bind and don't wrap the result in some obscure way, it should be able to find the cipher routine no matter what the obfuscator does.
Relevant javascript can be found at the bottom of the source, for those so inclined.
Well that's maniacal.
Something else that would be handy would be to come up with a list of all recent X in "http://s.ytimg.com/yts/jsbin/html5player-X.js". From googling, I've only found: vflNzKG7n, vfllMCQWM, vflJv8FA8, vflR_cX32, vflveGye9, vflj7Fxxt, vfltM3odl, vflmOfVEX, vflJwJuHJ, vflDG7-a-, vfl39KBj1 and vfl_ymO4Z.
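Collecting the X values out of a pile of saved pages or search results is easy to automate. A small Python sketch, assuming the IDs only ever contain letters, digits, "_" and "-" (true of every ID listed above, but still an assumption); the function name is made up:

```python
import re

# Assumption: version IDs use only letters, digits, "_" and "-".
PLAYER_RE = re.compile(r"html5player-([A-Za-z0-9_-]+)\.js")


def player_versions(text):
    """Return the distinct X values in order of first appearance."""
    seen = []
    for version in PLAYER_RE.findall(text):
        if version not in seen:
            seen.append(version)
    return seen
```

Feeding it the text of several watch pages fetched on different days would give the list of player versions seen so far.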
Thank you!
Not entirely historical, but gets a list of html5player*.js's with the most recent being in slot 0. http://cdn.snoj.us/miscfiles/yt-html5player.php?sauce
See also http://cdn.snoj.us/miscfiles/yt-html5player.php?get-cipher which makes use of nodejs to do it headless.
I'm sure something similar could be ported to perl.
Is the --title option working ? I'm too noob in perl to figure out what's wrong, but when I try this command line with latest version, it doesn't... which is bad as I don't like Korean characters :D
It does get suffix though
perl youtubedown.pl 'https://www.youtube.com/watch?v=RzSAO8op0ow' --title "55" --suffix
youtubedown.pl: downloading "[MV] MYNAME(마이네임) _ Baby im sorry"
youtubedown.pl: wrote "/temp/[MV] MYNAME(마이네임) _ Baby im sorry [RzSAO8op0ow].mp4", 326M, 1920 x 1080
--title has to go before the URL, because otherwise things like --title T1 URL1 --title T2 URL2 don't make sense.
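The ordering rule is easier to see in a sketch. This is Python, not the script's actual Perl argument parser, and the function name is made up. It shows one way a --title can bind to the next URL only, which is why a --title placed after the last URL has nothing to attach to.

```python
def pair_titles(argv):
    """Return a list of (url, title) pairs; title is None when not given."""
    pairs = []
    pending = None  # title waiting for the next URL
    i = 0
    while i < len(argv):
        arg = argv[i]
        if arg == "--title":
            if i + 1 >= len(argv):
                raise ValueError("--title needs a value")
            pending = argv[i + 1]
            i += 2
        elif arg.startswith("--"):
            # Other flags are ignored in this sketch.
            i += 1
        else:
            # A URL takes the pending title, then the title is used up.
            pairs.append((arg, pending))
            pending = None
            i += 1
    return pairs
```

With the command line from the question, the URL comes first and so gets no title, while `--title T1 URL1 --title T2 URL2` pairs up as expected.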
> wget http://www.jwz.org/hacks/youtubedown
HTTP request sent, awaiting response... 403 Forbidden
Christ. The Internet is doomed when even jwz fucks up basic shit like this.
Go fuck yourself, Anonymous Coward "gah@gah.com" from 173.174.47.125. It's too much 'puter for you.
Also, Herndon VA is a wasteland.