Signature algorithm for Toutiao video and Xigua Video


"This article analyzes the signature algorithm used when a short video or Xigua video shared from Toutiao is opened in a browser."

I once wrote an article about adding WeChat friends by wxid, and it accidentally hit a gap in WeChat search. It became the most searched, most read, and most followed article in this account's history. It seems that purely technical content reaches a limited audience, while topics that attract wide attention bring greater returns.


Today's article analyzes Toutiao's signature algorithm. It is purely technical content, and the knowledge applies widely to technical work involving Toutiao. I hope you enjoy learning it.

Toutiao is extremely popular right now, and many people have their eyes on it. Crawling its various kinds of data is naturally part of that. There are many articles online that analyze Toutiao's signature algorithm, but they are unsatisfying: some points are left unclear, and the results run into problems in practice. This article gives a more useful description of the signature.

In addition, I have to say that JavaScript is the worst language in the world, and I hope it leaves the stage of history as soon as possible.

01

What is a signature?

Toutiao's server verifies the data at certain key points. Part of that verification works by signing the data on the client and checking the signature on the server to decide whether the request is legitimate.

For example, when a video shared from Toutiao is opened, the browser automatically requests the following URL:

https://m.365yg.com/i6714101379172027656/info/?_signature=lyZkrxARynO8Pdz-QfDCYJcmZL&i=6714101379172027656

Here, the _signature value does not appear in any earlier or later message. It is computed from a few parameters by JavaScript that the Toutiao server delivers; this is the signature.

If this _signature is wrong, the server detects it and does not return the data we actually need.

The URL above requests key information about the video. If the _signature is wrong, the server returns one of Toutiao's own default videos instead.
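To make the request layout concrete, here is a minimal sketch of how the info URL above is put together from a video id and a signature. The helper name buildInfoUrl is hypothetical and not part of Toutiao's code; the host, path pattern, and parameter order are simply copied from the one captured URL, and the signature itself still has to come from the algorithm analyzed in the rest of this article.

```javascript
// Hypothetical helper: assembles the info URL seen in the capture.
// The path pattern and parameter order are copied from that single URL.
function buildInfoUrl(videoId, signature) {
  const id = encodeURIComponent(videoId);
  const sig = encodeURIComponent(signature);
  return `https://m.365yg.com/i${id}/info/?_signature=${sig}&i=${id}`;
}

console.log(buildInfoUrl("6714101379172027656", "lyZkrxARynO8Pdz-QfDCYJcmZL"));
```

The id is passed as a string because it is larger than the largest integer JavaScript can represent exactly.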

In short, the signature is a verification value that the client generates using the JavaScript algorithm delivered by Toutiao. The server checks it and responds differently depending on whether the check passes.

02

Signature algorithm

Now that we know what Toutiao's signature is, everyone will surely think: since the signature is generated by JavaScript on the client, the JavaScript is easy for us to obtain, and JavaScript code cannot truly be encrypted, shouldn't the signature be easy to work out?

I thought so too at first. Then I saw the JavaScript that generates the signature and lost heart. What on earth is this?! After obfuscation it is unreadable everywhere, with no formatting and a pile of inexplicable strings.

So let's analyze it bit by bit.

First, this signature algorithm was reverse engineered long ago; searching for "Toutiao signature" online turns up plenty of results. I have confirmed that the same algorithm is used for both Toutiao video and Xigua Video.

Here I walk through the analysis for Toutiao and Xigua videos so that the algorithm can be put to better use.

Capturing packets with Fiddler makes it easy to collect the various JavaScript files that Toutiao serves. The signature algorithm is in the file xigua_video.38p5Q8hF.js. Below is the signature portion of the file after formatting:

[Image: the formatted signature portion of xigua_video.38p5Q8hF.js]

It looks like a mess, full of unreadable characters. This is a typical trait of JavaScript: it creates needless obstacles and achieves nothing else. But we should be confident. As long as there is an environment that can run JavaScript, the code must ultimately run unprotected, and it will eventually be restored to code the browser can understand. That indeed turns out to be the case.

Let's unpack it step by step, starting with the first part. The code below prints the result directly:

var x =function(l){return'e(e,a,r){(b[e]||(b[e]=t("x,y","x "+e+" y")(r,a)}a(e,a,r){(k[r]||(k[r]=t("x,y","new x[y]("+Array(r+1).join(",x[y]")(1)+")")(e,a)}r(e,a,r){n,t,s={},b=s.d=r?r.d+1:0;for(s["$"+b]=s,t=0;t<b;t)s[n="$"+t]=r[n];for(t=0,b=s=a;t<b;t)s[t]=a[t];c(e,0,s)}c(t,b,k){u(e){v[x]=e}f{g=,ting(bg)}l{try{y=c(t,b,k)}catch(e){h=e,y=l}}for(h,y,d,g,v=[],x=0;;)switch(g=){case 1:u(!)4:f5:u((e){a=0,r=e;{c=a<r;c&&u(e[a]),c}}(6:y=,u((y8:if(g=,lg,g=,y===c)b+=g;else if(y!==l)y9:c10:u(s(11:y=,u(+y)12:for(y=f,d=[],g=0;g<y;g)d[g]=y.charCodeAt(g)^g+y;u(String.fromCharCode.apply(null,d13:y=,h=delete [y]14:59:u((g=)?(y=x,v.slice(x-=g,y:[])61:u([])62:g=,k[0]=65599*k[0]+k[1].charCodeAt(g)>>>065:h=,y=,[y]=h66:u(e(t[b],,67:y=,d=,u((g=).x===c?r(g.y,y,k):g.apply(d,y68:u(e((g=t[b])<"<"?(b--,f):g+g,,70:u(!1)71:n72:+f73:u(parseInt(f,3675:if(){bcase 74:g=<<16>>16g76:u(k[])77:y=,u([y])78:g=,u(a(v,x-=g+1,g79:g=,u(k["$"+g])81:h=,[f]=h82:u([f])83:h=,k[]=h84:!085:void 086:u(v[x-1])88:h=,y=,h,y89:u({e{r(e.y,arguments,k)}e.y=f,e.x=c,e})90:null91:h93:h=0:;default:u((g<<16>>16)-16)}}n=this,t=n.Function,s=Object.keys||(e){a={},r=0;for(c in e)a[r]=c;a=r,a},b={},k={};r'.replace(/[-]/g,function(e){return l[15&e.charCodeAt(0)]})}("v[x++]=v[--x]t.charCodeAt(b++)-32function return ))++.substrvar .length(),b+=;break;case ;break}".split(""));	
console.log(x);

The result is as follows:

function e(e,a,r){return (b[e]||(b[e]=t("x,y","return x "+e+" y")))(r,a)}function a(e,a,r){return (k[r]||(k[r]=t("x,y","return new x[y]("+Array(r+1).join(",x[++y]").substr(1)+")")))(e,a)}function r(e,a,r){var n,t,s={},b=s.d=r?r.d+1:0;for(s["$"+b]=s,t=0;t<b;t++)s[n="$"+t]=r[n];for(t=0,b=s.length=a.length;t<b;t++)s[t]=a[t];return c(e,0,s)}function c(t,b,k){function u(e){v[x++]=e}function f(){return g=t.charCodeAt(b++)-32,t.substring(b,b+=g)}function l(){try{y=c(t,b,k)}catch(e){h=e,y=l}}for(var h,y,d,g,v=[],x=0;;)switch(g=t.charCodeAt(b++)-32){case 1:u(!v[--x]);break;case 4:v[x++]=f();break;case 5:u(function (e){var a=0,r=e.length;return function (){var c=a<r;return c&&u(e[a++]),c}}(v[--x]));break;case 6:y=v[--x],u(v[--x](y));break;case 8:if(g=t.charCodeAt(b++)-32,l(),b+=g,g=t.charCodeAt(b++)-32,y===c)b+=g;else if(y!==l)return y;break;case 9:v[x++]=c;break;case 10:u(s(v[--x]));break;case 11:y=v[--x],u(v[--x]+y);break;case 12:for(y=f(),d=[],g=0;g<y.length;g++)d[g]=y.charCodeAt(g)^g+y.length;u(String.fromCharCode.apply(null,d));break;case 13:y=v[--x],h=delete v[--x][y];break;case 14:v[x++]=t.charCodeAt(b++)-32;break;case 59:u((g=t.charCodeAt(b++)-32)?(y=x,v.slice(x-=g,y)):[]);break;case 61:u(v[--x][t.charCodeAt(b++)-32]);break;case 62:g=v[--x],k[0]=65599*k[0]+k[1].charCodeAt(g)>>>0;break;case 65:h=v[--x],y=v[--x],v[--x][y]=h;break;case 66:u(e(t[b++],v[--x],v[--x]));break;case 67:y=v[--x],d=v[--x],u((g=v[--x]).x===c?r(g.y,y,k):g.apply(d,y));break;case 68:u(e((g=t[b++])<"<"?(b--,f()):g+g,v[--x],v[--x]));break;case 70:u(!1);break;case 71:v[x++]=n;break;case 72:v[x++]=+f();break;case 73:u(parseInt(f(),36));break;case 75:if(v[--x]){b++;break}case 74:g=t.charCodeAt(b++)-32<<16>>16,b+=g;break;case 76:u(k[t.charCodeAt(b++)-32]);break;case 77:y=v[--x],u(v[--x][y]);break;case 78:g=t.charCodeAt(b++)-32,u(a(v,x-=g+1,g));break;case 79:g=t.charCodeAt(b++)-32,u(k["$"+g]);break;case 81:h=v[--x],v[--x][f()]=h;break;case 82:u(v[--x][f()]);break;case 
83:h=v[--x],k[t.charCodeAt(b++)-32]=h;break;case 84:v[x++]=!0;break;case 85:v[x++]=void 0;break;case 86:u(v[x-1]);break;case 88:h=v[--x],y=v[--x],v[x++]=h,v[x++]=y;break;case 89:u(function (){function e(){return r(e.y,arguments,k)}return e.y=f(),e.x=c,e}());break;case 90:v[x++]=null;break;case 91:v[x++]=h;break;case 93:h=v[--x];break;case 0:return v[--x];default:u((g<<16>>16)-16)}}var n=this,t=n.Function,s=Object.keys||function (e){var a={},r=0;for(var c in e)a[r++]=c;return a.length=r,a},b={},k={};return r
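The trick in the snippet above is a dictionary substitution: the long string contains one-character placeholders, and replace() swaps each placeholder for an entry of the array produced by split(), indexed by the low four bits of the placeholder's character code. The placeholders appear to be unprintable characters, which would explain why they are invisible in the listing. The following minimal sketch shows the same idea on a toy input; the function name expand, the toy template, the three-entry dictionary, and the \x00-\x0f placeholder range are all illustrative assumptions, not values taken from Toutiao's file.

```javascript
// Toy version of the dictionary substitution (all values are illustrative).
function expand(template, dictionary) {
  // Assumed placeholder range: control characters 0x00-0x0f.
  // As in the original, the low four bits of the character code pick the entry.
  return template.replace(/[\x00-\x0f]/g, (ch) => dictionary[15 & ch.charCodeAt(0)]);
}

const dictionary = ["function ", "return ", "var "];
const template = "\x00add(a,b){\x02s=a+b;\x01s}";

console.log(expand(template, dictionary));
// function add(a,b){var s=a+b;return s}
```

Repeated fragments such as "function " and "return " are stored once in the dictionary, which both shortens the file and makes the raw text harder to read.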

Run the output through any online formatter, and you will find that it is essentially identical to part of get_as_cp_signature() in the cracked Toutiao signature code that is publicly available online.

In the get_as_cp_signature function we only need the TAC.sign part, so the code that generates the as and cp values can be deleted to simplify things.

What remains after deleting the as and cp code is the algorithm used to generate the signature, and the material from which that algorithm is built is the following content:

("v[x++]=v[--x]t.charCodeAt(b++)-32function return ))++.substrvar .length(),b+=;break;case ;break}".split("")))()('gr$Daten 袠b/s!l y蛼y墓g,(lfi~ah`{mv,-n|jqewVxp{rvmmx,&effkx[!cs"l".Pq%widthl"@q&heightl"vr*getContextx$"2d[!cs#l#,*;?|u.|uc{uq$fontl#vr(fillTextx$$榫樴笐喔犼步2<[#c}l#2q*shadowBlurl#1q-shadowOffsetXl#$$limeq+shadowColorl#vr#arcx88802[%c}l#vr&strokex[ c}l"v,)}eOmyoZB]mx[ cs!0s$l$Pb<k7l l!r&lengthb%^l$1+s$jl  s#i$1ek1s$gr#tack4)zgr#tac$! +0o![#cj?o ]!l$b%s"o ]!l"l$b*b^0d#>>>s!0s%yA0s"l"l!r&lengthb<k+l"^l"1+s"jl  s&l&z0l!$ +["cs\'(0l#i\'1ps9wxb&s() &{s)/s(gr&Stringr,fromCharCodes)0s*yWl ._b&s o!])l l Jb<k$.aj;l .Tb<k$.gj/l .^b<k&i"-4j!+& s+yPo!]+s!l!l Hd>&l!l Bd>&+l!l <d>&+l!l 6d>&+l!l &+ s,y=o!o!]/q"13o!l q"10o!],l 2d>& s.{s-yMo!o!]0q"13o!]*Ld<l 4d#>>>b|s!o!l q"10o!],l!& s/yIo!o!].q"13o!],o!]*Jd<l 6d#>>>b|&o!]+l &+ s0l-l!&l-l!i\'1z141z4b/@d<l"b|&+l-l(l!b^&+l-l&zl\'g,)gk}ejo{cm,)|yn~Lij~em["cl$b%@d<l&zl\'l $ +["cl$b%b|&+l-l%8d<@b|l!b^&+ q$sign ', [Object.defineProperty(r, "__esModule", {	
value: !0})])};

This is garbled again. The get_as_cp_signature() code published online converts it to URL encoding:

r(decodeURIComponent("gr%24Daten%20%D0%98b%2Fs!l%20y%CD%92y%C4%B9g%2C(lfi~ah%60%7Bmv%2C-n%7CjqewVxp%7Brvmmx%2C%26eff%7Fkx%5B!cs%22l%22.Pq%25widthl%22%40q%26heightl%22vr*getContextx%24%222d%5B!cs%23l%23%2C*%3B%3F%7Cu.%7Cuc%7Buq%24fontl%23vr(fillTextx%24%24%E9%BE%98%E0%B8%91%E0%B8%A0%EA%B2%BD2%3C%5B%23c%7Dl%232q*shadowBlurl%231q-shadowOffsetXl%23%24%24limeq%2BshadowColorl%23vr%23arcx88802%5B%25c%7Dl%23vr%26strokex%5B%20c%7Dl%22v%2C)%7DeOmyoZB%5Dmx%5B%20cs!0s%24l%24Pb%3Ck7l%20l!r%26lengthb%25%5El%241%2Bs%24j%02l%20%20s%23i%241ek1s%24gr%23tack4)zgr%23tac%24!%20%2B0o!%5B%23cj%3Fo%20%5D!l%24b%25s%22o%20%5D!l%22l%24b*b%5E0d%23%3E%3E%3Es!0s%25yA0s%22l%22l!r%26lengthb%3Ck%2Bl%22%5El%221%2Bs%22j%05l%20%20s%26l%26z0l!%24%20%2B%5B%22cs'(0l%23i'1ps9wxb%26s()%20%26%7Bs)%2Fs(gr%26Stringr%2CfromCharCodes)0s*yWl%20._b%26s%20o!%5D)l%20l%20Jb%3Ck%24.aj%3Bl%20.Tb%3Ck%24.gj%2Fl%20.%5Eb%3Ck%26i%22-4j!%1F%2B%26%20s%2ByPo!%5D%2Bs!l!l%20Hd%3E%26l!l%20Bd%3E%26%2Bl!l%20%3Cd%3E%26%2Bl!l%206d%3E%26%2Bl!l%20%26%2B%20s%2Cy%3Do!o!%5D%2Fq%2213o!l%20q%2210o!%5D%2Cl%202d%3E%26%20s.%7Bs-yMo!o!%5D0q%2213o!%5D*Ld%3Cl%204d%23%3E%3E%3Eb%7Cs!o!l%20q%2210o!%5D%2Cl!%26%20s%2FyIo!o!%5D.q%2213o!%5D%2Co!%5D*Jd%3Cl%206d%23%3E%3E%3Eb%7C%26o!%5D%2Bl%20%26%2B%20s0l-l!%26l-l!i'1z141z4b%2F%40d%3Cl%22b%7C%26%2Bl-l(l!b%5E%26%2Bl-l%26zl'g%2C)gk%7Dejo%7B%7Fcm%2C)%7Cyn~Lij~em%5B%22cl%24b%25%40d%3Cl%26zl'l%20%24%20%2B%5B%22cl%24b%25b%7C%26%2Bl-l%258d%3C%40b%7Cl!b%5E%26%2B%20q%24sign%20"), [TAC = {}]);

To put it simply, the signature algorithm has remained unchanged, and the name is TAC.sign.
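Reading the decoded function c shows why this string looks garbled: it is bytecode for a small stack machine. Each opcode is a character code minus 32, and the switch executes it against the stack v. Below is a minimal sketch of that idea that keeps only three of the cases from the decoded listing (14 pushes a small integer, 11 adds the top two values, 0 returns the top of the stack). The function name runTiny and the sample program are made up for illustration.

```javascript
// Tiny stack machine modeled on three cases of the decoded function c.
// runTiny and the sample program are illustrative, not Toutiao code.
function runTiny(program) {
  const stack = [];
  let pos = 0;
  for (;;) {
    const opcode = program.charCodeAt(pos++) - 32;
    switch (opcode) {
      case 14: // push the next character code minus 32
        stack.push(program.charCodeAt(pos++) - 32);
        break;
      case 11: { // pop two values and push their sum
        const right = stack.pop();
        stack.push(stack.pop() + right);
        break;
      }
      case 0: // return the top of the stack
        return stack.pop();
      default:
        throw new Error("opcode not handled in this sketch: " + opcode);
    }
  }
}

// '.' is opcode 14, '"' encodes 2, '#' encodes 3, '+' is opcode 11, ' ' is opcode 0.
console.log(runTiny('.".#+ ')); // 5
```

The real interpreter has many more cases (property reads and writes, function calls, jumps), but they all follow this fetch, decode, and execute pattern.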

Searching further for where the signature is generated shows that the signed value, that is, the input parameter, is the i from the URL shown earlier, namely the video id:

var _signature = (0, TAC.sign)("6714101379172027656");

Unfortunately, a signature generated this way does not work when you construct the request yourself. Readers who found this article have probably run into exactly this problem. Everything up to this point is solid but nothing new; what follows is exclusive, and it solves the problem of the signature being invalid in a constructed request.

The signature is invalid because the signature algorithm TAC.sign depends on several other values.

03

Signature Algorithm Analysis

The whole signature algorithm consists of computing sdbm hashes over various pieces of data. If the hashed content differs, the result naturally differs.
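The hash step is visible in case 62 of the decoded code: k[0] = 65599 * k[0] + k[1].charCodeAt(g) >>> 0, which is the sdbm update rule truncated to 32 bits. A minimal standalone sketch of that rule applied to a whole string follows. The function name sdbmHash and the seed parameter are my own, and the real code feeds characters in one at a time under bytecode control rather than in a simple loop.

```javascript
// Standalone sdbm hash using the same update as case 62 (names are illustrative).
function sdbmHash(text, seed = 0) {
  let hash = seed >>> 0;
  for (let i = 0; i < text.length; i++) {
    // The product stays below 2^53, so the arithmetic is exact
    // before >>> 0 truncates it to an unsigned 32-bit value.
    hash = (65599 * hash + text.charCodeAt(i)) >>> 0;
  }
  return hash;
}

console.log(sdbmHash("ab")); // 6363201
```

Because every character feeds into the running value, changing any one input (the video id, the tac value, the UA, or the image data) changes the resulting hash.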

First, the signature obviously depends on the input parameter. In requests for Toutiao video and Xigua Video, this input parameter is the i value in the URL, that is, the video id.

Second, in those requests the signature depends on a tac value, which is found in the page that initiates the URL request shown earlier:

<script data-from="toutiao">tac='i)69gg22apbs!i$13zns"0,<8~z|\x7f@QGNCJF[\\^D\\KFYSk~^WSZhg,(lfi~ah`{md"inb|1d<,%Dscafgd"in,8[xtm}nLzNEGQMKAdGG^NTY\x1ckgd"inb<b|1d<g,&TboLr{m,(\x02)!jx-2n&vr$testxg,%@tug{mn ,%vrfkbm[!cb|'</script>

It is garbled as well, of course. There is no need to decode it; just use it as is.
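Since the tac value has to be taken from the page, a small extraction step is useful. The sketch below pulls the raw text between the quotes of the tac='...' assignment out of an HTML string. The function name extractTac and the regular expression are assumptions based only on the single script tag shown above, and the sample page is shortened for illustration. The result is the literal source text, with escapes such as \x7f still unevaluated, so it can be pasted back into JavaScript source unchanged.

```javascript
// Hypothetical helper; the pattern is modeled on the one script tag shown above.
function extractTac(html) {
  const match = /<script[^>]*>\s*tac='([\s\S]*?)'\s*<\/script>/.exec(html);
  return match ? match[1] : null;
}

// Shortened sample page; the real tac value is much longer.
const page = '<html><script data-from="toutiao">tac=\'i)69gg\\x7f@Q\'</script></html>';
console.log(extractTac(page)); // i)69gg\x7f@Q
```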

Third, the signature also depends on the browser userAgent, that is, the UA of the browser that initiates the request. To imitate a request from a mobile phone, you must of course use a UA that a phone would actually send. This also appears at the top of the cracked code that is publicly available online:

navigator1 = {	
    userAgent: "Mozilla/5.0 (Linux; Android 8.8.2; xxxxx Build/g) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/65.0.3325.109 Mobile Safari/537.36"	
}, window = this, window.navigator = navigator1;
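Outside a browser there is no navigator object at all, so the UA has to be supplied by hand. As an alternative to assigning globals the way the snippet above does, a minimal sketch can pass stand-in window and navigator objects as parameters to the code being run. The function name runWithFakeNavigator is hypothetical, and the one-line script body merely stands in for real signing code, which may look up navigator differently.

```javascript
// Hypothetical helper: runs a script body with stand-in browser objects.
function runWithFakeNavigator(scriptBody, userAgent) {
  const navigator = { userAgent };
  const window = { navigator };
  // The script body sees `window` and `navigator` as ordinary parameters.
  return new Function("window", "navigator", scriptBody)(window, navigator);
}

const mobileUA = "Mozilla/5.0 (Linux; Android 8.8.2; xxxxx Build/g) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/65.0.3325.109 Mobile Safari/537.36";
console.log(runWithFakeNavigator("return window.navigator.userAgent;", mobileUA) === mobileUA); // true
```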

Finally, the signature also depends on the value of an image drawn from a string of garbled characters in the second garbled block. Judging from the pitfalls I hit while debugging, this image is likely to differ between a PC and a mobile phone. On a PC, the image generated from the garbled characters looks like this:

[Image: the picture rendered from the garbled characters on a PC]

The sdbm hash of this image on a PC does not equal the hash contained in the signature of a message generated on a mobile phone. I had to brute-force it: the hash on the phone I used for debugging is 723236945, while the hash that the algorithm above produces from this image on a PC is 3311753357.

Once these four inputs are under control, the correct signature can be generated. Based on my debugging, the final 26-byte signature can be divided into 5 parts:

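j5qI5 xAY0r xK.9Bq 6f.DU 4-aiP
5a   |5b   |6c    |5d   |5e
6c is related to the constructed garbled-character image
5d is related to the browser UA
tac is related to all of the content
The video id value is related to 5d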
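When comparing a generated signature against a captured one, it helps to split both into these five parts and see which part differs. A minimal sketch, assuming only the 5, 5, 6, 5, 5 layout given above; the function name splitSignature and the part names a to e are my own labels:

```javascript
// Hypothetical helper: splits a 26-character signature into the 5 parts above.
function splitSignature(signature) {
  const lengths = { a: 5, b: 5, c: 6, d: 5, e: 5 };
  if (signature.length !== 26) {
    throw new Error("expected a 26-character signature");
  }
  const parts = {};
  let offset = 0;
  for (const [name, length] of Object.entries(lengths)) {
    parts[name] = signature.slice(offset, offset + length);
    offset += length;
  }
  return parts;
}

console.log(splitSignature("j5qI5xAY0rxK.9Bq6f.DU4-aiP"));
```

If only part c differs from a captured signature, the image hash is the likely culprit; if only part d differs, check the UA first.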

04

Bonus

As mentioned above, four values must be controlled to generate a usable signature. The video id is the input parameter, and the tac value can be assigned directly in the code, so these two are easy to control. The UA value, however, is read by the JavaScript automatically from the current runtime environment, and the image of the garbled characters is also generated by built-in JavaScript functions whose output cannot be controlled, so those two are not easy to replace.

Here is the replacement code. The first replacement is for the UA, in case 77 of get_as_cp_signature:

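  case 77:
    y = v[--x];                // name of the property being read
    p00 = v[--x];              // object the property is read from
    if ("navigator" == y) {
        p = navigator1;        // substitute our own navigator, and with it the UA
    } else {
        p = p00[y];            // every other lookup behaves as before
    }
    u(p);
    // Original code:
    // y = v[--x],
    // u(v[--x][y]);
    break;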

The second replacement is for the hash of the garbled-character image, in case 62 of get_as_cp_signature:

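case 62:
   g = v[--x],
   k[0] = 65599 * k[0] + k[1].charCodeAt(g) >>> 0;   // original sdbm hash step
   // Swap the image hash produced on a PC for the one observed on the phone.
   if (k[0] == 3311753357)
      k[0] = 723236945;
   break;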

All of this code was hard-won, worked out one debugging step at a time.

Happy working, everyone.


Origin blog.csdn.net/yeyiqun/article/details/99311257