YTCrack v0.24b
==============

YTCRACK.TXT - Copyright (c) 2013, Fred C. Macall


History and Overview

YTCrack is a DOS program for extracting "videoplayback" URL(s) from freshly saved copies of YouTube Web pages. That is, Web pages with URLs like:

  http://www.youtube.com/watch?v=VideoIDCode

Beginning with version 0.11b, YTCrack also works with freshly saved copies of video.google.com Web pages. That is, Web pages with URLs like:

  http://video.google.com/videoplay?docid=-1234567890123456789

YTCrack produces its result(s) in a relatively small .HTM document or file. This document exposes anchor(s) or link(s), to the YouTube or video.google videoplayback URL(s) that YTCrack has extracted from a given YouTube or video.google Web page copy. This YTCrack result document may be used by a Web browser, without scripts or "Flash" support, for downloading the YouTube or video.google video(s) its anchor(s) identify.

Beginning with version 0.12b, YTCrack supports the itag=38 video properties and \uhhhh sequences we've found in YouTube's Web pages, since February or March 2011. (More about these things below.)

Beginning with version 0.14b, YTCrack supports the itag= 4x, 8x, and 10x video properties we've found in YouTube's Web pages, since around the middle of 2011.

Beginning with version 0.15b, YTCrack supports the itag= 36, 83, and 101 video properties we've found in YouTube's Web pages, since May or June 2012.

On or about 13 September 2012, YouTube made some major changes in the videoplayback URLs offered in their /watch?v=VideoIDCode Web pages. In particular, the URLs' established &signature= . . . field seemed to be replaced by a new &sig= . . . field. Also, several other new fields were added. And, the URLs' termination point appeared to move from the end of the &id= . . . field to the end of the apparently new &sig= . . . field. At the same time, YouTube started rejecting the URLs developed from these new Web pages by YTCrack v0.15b and all previous YTCrack versions!

After a lot of effort, Glenn McCorkle discovered that &sig= , apparently, is only a shortening or obfuscation of the old &signature= . . . field's name. YTCrack version 0.16b became able to develop acceptable YouTube URLs, again, by changing &sig= to &signature= , in the URLs it extracts from YouTube /watch?v=VideoIDCode Web pages! In the interests of avoiding apparently unnecessary complication(s), YTCrack v0.16b also removes the new &type= . . . and &quality= . . . fields from the URLs it develops. Further, it provides support for the itag=17 video properties we've found in YouTube's Web pages recently. (More about these things below.)

On or about 18 December 2012, YouTube made some more big changes in some of their /watch?v=VideoIDCode Web pages. These changes seem to be presented in a semi-random fashion. So that YTCrack v0.16b is still able to extract usable URLs from some of the pages. But, not from others. If our September 2012 experience is any guide, YTCrack v0.16b may be expected to become completely useless within a few months.

None of the December 2012 changes can be understood as being anything but obfuscatory! First, videoplayback URL fields are being relocated semi-randomly. For what purpose other than obfuscation? So that no field may be assumed to start with ampersand, consistently. Almost every field is subject to appearing at the beginning of the URLs. Where it will start with question mark.

Next, we are seeing /watch?v=VideoIDCode Web pages bearing URLs with apparently invalid sig= . . . fields! Upon deeper inspection, we find that the sig= . . . fields in these pages have been shuffled among the URLs. For what purpose other than obfuscation? From the first URL to the last, each sig= . . . field has been shuffled to the URL just ahead. So that the last URL has no sig= . . . field. And, what was the first URL's sig= . . . field now has to be found in a url_encoded_fmt_stream_map . . . field located just ahead of the first URL!
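
To make the shuffle concrete, here is a minimal C sketch of undoing it. It is not taken from YTCrack's source. The function name reconstitute_sigs and the array layout are assumptions made for illustration: slot 0 stands for the url_encoded_fmt_stream_map . . . field, and slots 1 to n stand for the URLs, first to last. Each slot holds only the signature text found in that place.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model, not YTCrack's code.
   On entry:  sig[0] is the signature found in the
              url_encoded_fmt_stream_map field, sig[i] is the signature
              found in URL i, and sig[n_urls] is NULL, since the last
              URL carries no signature.
   On return: sig[i] is the signature that belongs to URL i,
              and sig[0] is NULL. */
void reconstitute_sigs(const char *sig[], size_t n_urls)
{
    size_t i;

    assert(sig != NULL);

    /* Work from the last URL back to the first, so that each slot is
       read before it gets overwritten. */
    for (i = n_urls; i >= 1; i--)
        sig[i] = sig[i - 1];

    sig[0] = NULL;  /* the leading field's signature has been handed on */
}
```

With three URLs and the found signatures A, B, C, (none) in slots 0 to 3, the call leaves (none), A, B, C. That is, every URL ends up with the signature that was found just ahead of it.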

Also, we are finding duplicated itag= . . . fields in the URLs on some of the Web pages. For what purpose other than obfuscation? URLs containing these won't be accepted by YouTube until one of the duplicate itag= . . . fields has been removed.

YTCrack version 0.21b contains accommodations and/or new provisions for dealing with each of these new obfuscations. So that it is working with all of the YouTube Web pages encountered, again. Also, it provides support for the itag=85 video properties we've found in a few of YouTube's Web pages recently. (More about these things below.)

On or about 1 April 2013, YouTube brought us at least nine "new" itag= . . . values and some more videoplayback URL structure changes. Quite the joker. All of the new itag= . . . values seem to identify .MP4 videos carrying only a single stream, which may be either audio or video. More about these below.

Apparently, YouTube further "fuzzed" its videoplayback URL structure by allowing more or all of the 23 sometimes required URL fields to appear at the very end of URL(s). (The 23 sometimes required URL fields we've identified don't include &fallback_host= . . . , &quality= . . . , nor &type= . . . .)

When present, the initial URL signature fragment, which always used to precede all videoplayback URLs, now may be found intermingled with the videoplayback URLs. Apparently, reconstituting signatures mode doesn't apply for any videoplayback URL(s) preceding a URL signature fragment.

YTCrack version 0.24b supports the nine new itag= . . . values and includes new provisions for dealing with those URL structure changes. (More about these things below.)

The YouTube and video.google videos we've seen, so far, are .3GP, .FLV, .MP4, and .WEB types. (We are using .3GP for .3gpp, a video container used in cell 'phones. And, .WEB for .WebM, a relatively new video container type.) Most of these videos contain both audio and video streams.
However, since about 1 April 2013, we have seen some audio/mp4 files containing an audio stream, only (AO). And, some video/mp4 files containing a video stream, only (VO). Also, for some time now, some of the .MP4 and .WEB offerings have appeared to contain Side-by-Side Stereoscopic 3D (3D) videos.

Given a suitable video adapter and fast enough processor, all of these may be viewed/heard with the latest DOS MPLAYER version. So, your whole video receiving and viewing/listening process can be accomplished in DOS without scripts enabled and without Flash support. And, it yields downloaded video file(s) that don't have to be discovered in and retrieved from a cache directory.

The video.google Web pages we've seen all carry a single videoplayback URL. However, the YouTube Web pages we've seen, so far, all carry one, two, six, or eight copies of each of three to twenty three unique videoplayback URLs. That is, a total of 3 to 56 videoplayback URLs! (Before March 2011, we always saw eight copies of each of three to six unique videoplayback URLs. Lately, we have been seeing only one copy of each of three to twenty three videoplayback URLs.) So, after it has identified these, YTCrack weeds out any duplicates, to save you this effort.

The unique videoplayback URLs remaining are distinguished and identified by their itag=nnn fields. These itag= values appear to identify each video's resolution and, perhaps, other characteristics.

The itag= values we've seen so far are:

  itag=  video  resolution/bitrate
  value  type    ( w x h )   flags
  =====  =====  ==================
      5   FLV    320 x  240
     17   3GP    176 x  144
     18   MP4    480 x  360
     22   MP4   1280 x  720
     34   FLV    480 x  360
     35   FLV    640 x  480
     36   3GP    320 x  240
     37   MP4   1920 x 1080
     38   MP4   2048 x 1080
     43   WEB    480 x  360
     44   WEB    640 x  480
     45   WEB   1280 x  720
     46   WEB   1920 x 1080
     82   MP4    480 x  360   3D
     83   MP4    640 x  480   3D
     84   MP4   1280 x  720   3D
     85   MP4   1920 x 1080   3D
    100   WEB    480 x  360   3D
    101   WEB    640 x  480   3D
    102   WEB   1280 x  720   3D
    133   MP4    320 x  240   VO
    134   MP4    480 x  360   VO
    135   MP4    640 x  480   VO
    136   MP4   1280 x  720   VO
    137   MP4   1920 x 1080   VO
    139   MP4   Low bitrate   AO
    140   MP4   Med bitrate   AO
    141   MP4   Hi  bitrate   AO
    160   MP4    256 x  144   VO

Notes for table: Each itag= value always identifies the video type indicated above. The itag= values usually, but not quite always, identify the exact video pixel heights indicated. More often, they don't exactly identify the pixel widths indicated. For example: itag=35 always identifies an .FLV type video. That video is quite likely to be 480 pixels high. Its width may be in the range of 640 to 960 pixels. See the discussion several paragraphs above for an introduction to the 3D, AO, and VO flags.

YTCrack uses no configuration or helper file or program and is intended to run on 8086/8088 based, and all later, IBM PC compatible PCs. It may need 300 KB to 400 KB of DOS memory, depending on the size of the given YouTube page. But, it requires only DOS v3.x, or later.

In much of the rest of this document, we'll describe YTCrack's handling of YouTube and video.google Web pages and videoplayback URLs in common. In the few places where that isn't appropriate, we'll give details for both types.


How It Works

YTCrack starts off by checking for a pair of command line parameters that do not match each other. Finding anything else, it displays its Usage message and terminates.
For the record, this message shows the command form Usage: YTCRACK , followed by two file names, and then these Notes: One file name specifies a fresh source copy of an http://www.youtube.com/watch?v= Web page. And, the other specifies an HTML page to be produced. The two file names must be different.

YTCrack then opens both of its given files. And, reads through its input file looking for URLs made up of these parts, in order:

  http
  any text that doesn't include space character(s), http, nor videoplayback
  videoplayback
  any text that doesn't include http
  a terminating http or the end-of-file indication

Beginning with YTCrack version 0.21b, this section also looks for fields made up of these parts, in order:

  url_encoded_fmt_stream_map
  any text that doesn't include sig, http, nor videoplayback
  sig
  any text that doesn't include http nor videoplayback
  http
  any text that doesn't include videoplayback
  videoplayback

We use the term "URL signature fragment(s)" to refer to these fields. Their value immediately follows an equal character or a four character ": " sequence immediately following the url_encoded_fmt_stream_map string introduced above. The http string in the above definition locates their initial termination point. Note that the text ahead of sig in these fields may be found to contain URL field(s), in addition to the sig= . . . field, which follows.

When URL signature fragment(s) are found, YTCrack enters a new "reconstituting signature(s)" mode. This new mode changes or extends some YTCrack behaviors, as described below. Since YTCrack version 0.24b, reconstituting signature(s) mode does not apply to any videoplayback URL(s) preceding the URL signature fragment found.

YTCrack then looks through each URL signature fragment and videoplayback URL it has found, for %hh sequences and back slash characters. All three character %hh sequences get replaced with the single character that the hex digits hh represent. And, this process gets repeated when %25 gets replaced by % . However, any % not followed by hh gets left as-is. Next, back slash characters are sought.
All six character \uhhhh sequences get replaced with the single character that the least significant eight bits of hhhh represent. All remaining isolated back slash characters and one back slash from each pair of them get removed.

Finally, each URL's "tail" gets trimmed at the end of the last of the 23 presently known sometimes required &fieldname=value videoplayback URL fields present. Unless this is a URL signature fragment. For these, only &sig= . . . and &signature= . . . field(s) are sought for tail trimming purposes.

When &id= . . . is found last, the tail gets trimmed at the first non-hex digit character following. This measure used to handle YouTube videoplayback URLs.

The video.google videoplayback URLs we've seen end with &key=ttn , instead of &id=hhhhhhhh . So, if &key= . . . is the last of the 23 fields found, the URL's tail gets trimmed at the first non-decimal digit character following &key=tt .

From about 13 September 2012 until about 1 April 2013, all the YouTube videoplayback URLs seemed to end with &sig=hhh . . . hhh.hhh . . . hhh fields. That still may be the case for videoplayback URLs for which reconstituting signature(s) mode applies. Beginning with version 0.16b, YTCrack changes the &sig= part of that to &signature= . And, then, handles the case that &signature= . . . is the last of the 23 fields found. In this case, YTCrack looks for the first non-hex digit character following &signature= . If that is a period, YTCrack again looks for a non-hex digit character following that. The non-hex digit character thus found marks the spot for trimming the URL. The new &quality= . . . field we've been seeing may follow the &sig= . . . field. So, the trimming just described eliminates it.

When any of the 23 sometimes required fields other than the four discussed above is found last, the first ampersand or null following that field gets taken as the termination point.
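
The unescaping passes and the &signature= tail trim described in this section can be sketched in C roughly as follows. This is an illustration under stated assumptions, not YTCrack's actual source: the function names percent_decode, backslash_decode, and trim_at_signature are made up here, each function edits a null terminated string in place, and the re-scan after %25 is only one reading of "this process gets repeated". The trim handles only the ampersand form of the field, and the &sig= to &signature= renaming is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Value of one hex digit, or -1 for anything else (including null). */
static int hexval(char ch)
{
    unsigned char c = (unsigned char)ch;
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Replace each %hh with the character it names. A % produced from %25
   is examined again. Any % not followed by hh is left as-is. */
void percent_decode(char *s)
{
    size_t r = 0, w = 0;
    int hi, lo;

    assert(s != NULL);
    while (s[r] != '\0') {
        if (s[r] == '%' && (hi = hexval(s[r + 1])) >= 0
                        && (lo = hexval(s[r + 2])) >= 0) {
            char c = (char)(hi * 16 + lo);
            if (c == '%') {          /* %25: park the % and re-scan it */
                s[r + 2] = '%';
                r += 2;
                continue;
            }
            s[w++] = c;
            r += 3;
        } else {
            s[w++] = s[r++];
        }
    }
    s[w] = '\0';
}

/* \uhhhh becomes the character in its low eight bits, a pair of back
   slashes becomes one, and any other back slash is dropped. */
void backslash_decode(char *s)
{
    size_t r = 0, w = 0;

    assert(s != NULL);
    while (s[r] != '\0') {
        if (s[r] != '\\') {
            s[w++] = s[r++];
        } else if (s[r + 1] == 'u'
                   && hexval(s[r + 2]) >= 0 && hexval(s[r + 3]) >= 0
                   && hexval(s[r + 4]) >= 0 && hexval(s[r + 5]) >= 0) {
            s[w++] = (char)(hexval(s[r + 4]) * 16 + hexval(s[r + 5]));
            r += 6;
        } else if (s[r + 1] == '\\') {
            s[w++] = '\\';
            r += 2;
        } else {
            r += 1;                  /* isolated back slash */
        }
    }
    s[w] = '\0';
}

/* Cut the URL after &signature=hhh...hhh or &signature=hhh...hhh.hhh...hhh */
void trim_at_signature(char *url)
{
    static const char key[] = "&signature=";
    char *p;

    assert(url != NULL);
    p = strstr(url, key);
    if (p == NULL)
        return;
    p += sizeof key - 1;
    while (hexval(*p) >= 0) p++;     /* first run of hex digits */
    if (*p == '.') {
        p++;
        while (hexval(*p) >= 0) p++; /* second run, after the period */
    }
    *p = '\0';
}
```

For instance, percent_decode turns %253A into a colon in one call, and trim_at_signature cuts a trailing &quality= . . . field that follows the signature. The host name used to try such functions out is arbitrary.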

Then, when reconstituting signature(s) mode is applicable, any videoplayback URL found lacking a &signature= . . . field is given a temporary appended ampersand. This provides the signature fragment tail starting location for these URL(s), as discussed below.

With its batch of URL signature fragment(s) and videoplayback URL(s) unencoded and clarified as explained above, YTCrack looks through them all and eliminates any duplicate(s) present. As stated earlier, there used to be at least two copies of each different or distinct URL present, in YouTube (but not video.google) Web pages. If no duplicate(s) are present, this check has no effect.

Next, the sequences quality= . . . & and type= . . . & , each preceded by a question mark or ampersand, are sought in each URL signature fragment and full URL processed. For each of these sequences, the first such sequence found, if any, gets reduced to its leading question mark or ampersand character.

When reconstituting signatures, YTCrack then considers its remaining full URL(s), from last to the first following the URL signature fragment. And, with each of these, the full URL or URL signature fragment just ahead of it. Signature fragment tail starting locations in each full URL are identified by seeking the last field with a name matching the first field named at the beginning of the (first) URL signature fragment. For full URL(s) lacking a signature, the temporary ampersand, added as described above, serves to locate the point where a URL signature fragment tail is needed. Then the earlier signature fragment tail gets copied, in its entirety, into the later full URL.

Last of all, while parsing, YTCrack v0.24b looks for fully duplicated itag= . . . fields, within each full URL. That is, duplicate itag=nnn sequences with matching nnn values, each preceded by a question mark or ampersand and followed by an ampersand or null. Here, nnn may be any number of decimal digits. When found, the later such duplicate, in each URL, gets removed.
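
The duplicate itag= check just described might look something like the following C sketch. It is not YTCrack's own code. The names next_itag and drop_duplicate_itag are assumptions for illustration, the URL is taken to be a null terminated string edited in place, and this version removes every later field whose value matches the first itag= field found.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Find the next well formed itag field at or after p: a question mark
   or ampersand, then itag=, then one or more decimal digits, then an
   ampersand or null. Returns a pointer to the leading delimiter and
   stores the length of delimiter + "itag=" + digits in *len.
   Returns NULL when no such field remains. */
static char *next_itag(char *p, size_t *len)
{
    for (; *p != '\0'; p++) {
        if ((*p == '?' || *p == '&') && strncmp(p + 1, "itag=", 5) == 0) {
            size_t n = 6;
            while (p[n] >= '0' && p[n] <= '9')
                n++;
            if (n > 6 && (p[n] == '&' || p[n] == '\0')) {
                *len = n;
                return p;
            }
        }
    }
    return NULL;
}

/* Remove later itag=nnn fields that fully duplicate the first one. */
void drop_duplicate_itag(char *url)
{
    size_t len1, len2;
    char *first, *later;

    assert(url != NULL);
    first = next_itag(url, &len1);
    if (first == NULL)
        return;

    later = next_itag(first + len1, &len2);
    while (later != NULL) {
        /* Compare "itag=nnn" only; the leading delimiters may differ. */
        if (len2 == len1 && memcmp(first + 1, later + 1, len1 - 1) == 0) {
            /* Close the gap, keeping the terminator that follows. */
            memmove(later, later + len2, strlen(later + len2) + 1);
            later = next_itag(later, &len2);
        } else {
            later = next_itag(later + len2, &len2);
        }
    }
}
```

Fields with different values, such as itag=22 and itag=18 in the same URL, are left alone, since only full duplicates are described as a problem.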

YTCrack then prepares a small HTML document exposing anchors or links containing the distinct videoplayback URL(s) that it has found. For all URL(s) that contain an itag= field, preceded, as usual, by a question mark or ampersand and followed by an ampersand or null, itag= and its value get used for the anchor's visible text. Also, that value gets looked for in a list of the twenty nine itag= values given in the table in the History and Overview section. If this lookup succeeds, the implied video file type, resolution, and any appropriate 3D, AO, or VO flag get placed in the visible text, as well.

In case an itag= field isn't found in a URL, the visible text:

  videoplayback URL number nn

will be used. Where nn indicates the number of this videoplayback URL's first appearance in the given document's sequence of videoplayback URLs.


Installation and Operational Considerations

PKUNZIP YTCRAC24.ZIP into an empty directory. Copy the resulting files to where you want them. You may as well put YTCRACK.EXE in a directory on your Path. There is nothing to configure.

YouTube and video.google seem to tailor their Web pages to their individual users. In particular, these Web pages seem to contain time stamps, which cause their videoplayback URLs to quit working after a matter of days or sooner. Also, the videoplayback URLs seem to contain information that has to match at least part of the downloading user's IP address. Therefore, we suggest that you get a fresh copy of each YouTube or video.google Web page you want to use immediately before running YTCrack and downloading from its result document.

YTCrack won't work with saved YouTube Web pages with URLs like:

  http://www.youtube.com/v/VideoIDCode

or

  http://www.youtube.com/embed/VideoIDCode

However, we have found that, for every valid VideoIDCode , there always seems to be a URL available with the required form.
So, substitute:

  /watch?v=   for   /v/   or   /embed/

in the URL you are attempting to use, and resave the page if necessary, when you run into an unusable YouTube URL.


Acknowledgements and Copyright Notices

YTCrack owes its existence to Glenn McCorkle and Ron Clarke, as well as to its author. First, Glenn informally published his technique for discovering videoplayback URL(s) in YouTube Web pages. Then, in his UNTUBE.EXE program, Ron provided a sample implementation of Glenn's technique.

YTCRACK.EXE v0.24b and the libraries, materials, and tools used to make it contain the following Copyright notices:

  Borland C++ - Copyright 1991 Borland Intl.
  LZEXE.EXE Version 0.91 (c) 1989 Fabrice BELLARD


Disclaimer

This software is published in the hope that it will be useful, but without any warranty; without even the implied warranty of merchantability or fitness for a particular purpose.

  Fred C. Macall   K8GIV
  1019 Pennfield Road
  Cleveland Heights, Ohio, 44121, U.S.A.
  (216) 382-3415

  For e-mail contact, run YTCrack .

  http://users.ohiohills.com/fmacall/

  24 April 2013