JavaScript Reverse Engineering: A Hands-On Walkthrough with Qimai Data

Posted by 徐培源 on 2024-3-18 14:06:00

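<h1 id="知识点">Background Concepts</h1>
<h2 id="promise对象">The Promise Object</h2>
The Promise object was introduced in ES6, mainly to solve the problem of callback hell. 
Start with this piece of code: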
<details>
<summary>Click to view code</summary>
<pre><code>function fn() {
  let username = "alex";
  let password = "123456";

  // Send a login request to the server
  console.log("Sending the request, trying to log in");
  setTimeout(function () {
    console.log("The server returned a result");
    let result_1 = true;
    if (result_1 === true) { // login succeeded
      // Load the menu
      console.log("About to load the menu");
      setTimeout(function () {
        console.log("Showing the menu");
        // Load the user info
        console.log("About to load the user info");
        setTimeout(function () {
          console.log("Showing the user info");
        }, 1000);
      }, 1000);
    }
  }, 1000);
}
</code></pre>
</details>
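This code is a demo of a site displaying information step by step after login. It contains a lot of nesting. To get rid of the multi-level nesting, use a Promise object, as in the following demo: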
<details>
<summary>Click to view code</summary>
<pre><code>function send(url) {
  // promise: a guarantee
  // resolve: the task was settled successfully
  // reject: the task was refused

  return new Promise(function (resolve, reject) {
    console.log("Sending a request for you to", url); // the thing promised to you
    let result = 123;
    if (result) {
      // The task succeeded.
      // What happens next is up to whoever called this function.
      resolve(result); // marks the current task as resolved
    } else {
      // The task failed.
      reject(result); // marks the current task as not resolved
    }
  });
}

function fn() {
  let username = "";
  let password = "";

  // Send the login request
  send("xxxxx").then(function (data) {
    console.log("Login result");
    console.log("The login request returned", data);
    return send("load menu");
  }).then(function (data) {
    // Load the menu
    console.log("Menu data received", data);
    return send("load user info")
  }).then(function (data) {
    // Load the user info
    console.log("User info received", data);
  })
}
</code></pre>
</details>
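<code>send</code> returns a Promise object. Calling <code>resolve</code> runs the success callback passed to <code>then</code>, and calling <code>reject</code> runs the failure callback. <code>fn</code> omits the failure callbacks because a <code>catch</code> is usually added at the end of the Promise chain: as soon as any step fails, execution jumps straight to the function in <code>catch</code>. Abstracted, the whole pattern looks like this: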
<details>
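<summary>Click to view code</summary>
<pre><code>// Chained logic
new Promise(function (a, b) {}).then(function () {
  return new Promise(function (a, b) {});
}, function () {
  // failure callback for the first step only
}).then(function () {
  return new Promise(function (a, b) {});
}).then(function () {

}).catch(function () {
  console.log("Something went wrong, please contact the administrator....")
})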
</code></pre>
</details>
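To make the role of the trailing <code>catch</code> concrete, here is a minimal runnable sketch. The <code>step</code> helper and the step names are made up for illustration; the point is only that a rejection skips the remaining <code>then</code> callbacks and lands in <code>catch</code>.

```javascript
// Hypothetical helper: a step that either resolves or rejects.
function step(name, shouldFail) {
  return new Promise(function (resolve, reject) {
    if (shouldFail) {
      reject(new Error(name + " failed"));
    } else {
      resolve(name + " ok");
    }
  });
}

const log = [];
const chain = step("login", false)
  .then(function (data) {
    log.push(data);
    return step("menu", true); // this step rejects
  })
  .then(function (data) {
    log.push(data); // skipped, because the previous step rejected
    return step("profile", false);
  })
  .catch(function (err) {
    log.push("caught: " + err.message);
  })
  .then(function () {
    return log;
  });

chain.then(function (result) {
  console.log(result);
});
```

Only the login step and the <code>catch</code> handler leave a trace in <code>log</code>; the profile step never runs.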
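<h2 id="axios拦截器">axios Interceptors</h2>
axios is a Promise-based HTTP request library. If a site sends its requests through axios, the encryption and decryption logic most likely lives in axios interceptors. 
Interceptors come in two kinds, request interceptors and response interceptors. Encryption is most likely in a request interceptor and decryption in a response interceptor. The following code shows how interceptors are used: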
<details>
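<summary>Click to view code</summary>
<pre><code>// Request interceptor
axios.interceptors.request.use(function (config) {
  console.log(config, "hello");
  // Try modifying the request parameters
  config.data['hehe'] = "i love you";
  return config;
}, function (err) {
  console.log(err);
  return Promise.reject(err); // keep the error flowing down the chain
});

// Response interceptor
axios.interceptors.response.use(function (response) {
  console.log(response);
  // What usually goes here? Decryption.

  return response.data; // whatever the interceptor returns is handed to the function in then
}, function (err) {
  console.log(err);
  return Promise.reject(err); // keep the error flowing down the chain
});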
</code></pre>
</details>
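With these two concepts covered, on to the main part.
<h1 id="七麦数据实战">Qimai Data in Practice</h1>
url: https://www.qimai.cn/rank 
Scroll the page and capture the traffic; as usual, look at the <code>Fetch/XHR</code> requests. 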
<img src="https://img2024.cnblogs.com/blog/3369335/202403/3369335-20240318093247084-1579460327.png"> 
There are three requests with the same shape; look at their request parameters and response data. 
<img src="https://img2024.cnblogs.com/blog/3369335/202403/3369335-20240318093809805-1007902101.png"> 
<img src="https://img2024.cnblogs.com/blog/3369335/202403/3369335-20240318093359402-54035031.png">
<img src="https://img2024.cnblogs.com/blog/3369335/202403/3369335-20240318093821327-314545434.png"> 
<img src="https://img2024.cnblogs.com/blog/3369335/202403/3369335-20240318093429162-1168486001.png">
<img src="https://img2024.cnblogs.com/blog/3369335/202403/3369335-20240318093832495-1708097969.png"> 
<img src="https://img2024.cnblogs.com/blog/3369335/202403/3369335-20240318093500242-2107588870.png"> 
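This shows that <code>0</code> is the paid chart, <code>1</code> the free chart, and <code>2</code> the top-grossing chart. Since all three requests take the same parameters, one of them is enough as an example. 
Among the request parameters, only the value of <code>analysis</code> is encrypted, so the goal is to find out how that value is generated. 
As usual, search for the URL first. 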
<img src="https://img2024.cnblogs.com/blog/3369335/202403/3369335-20240318094247631-1101051010.png"> 
<img src="https://img2024.cnblogs.com/blog/3369335/202403/3369335-20240318094303076-1107223312.png"> 
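There are three hits in total, but all three are plain assignments with no other code around them, so searching for the URL is a dead end. Next, search for the keyword <code>analysis</code>. 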
<img src="https://img2024.cnblogs.com/blog/3369335/202403/3369335-20240318094507803-630080743.png"> 
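Three hits again, but in each one <code>analysis</code> sits inside a URL string, so none of them can be where the <code>analysis</code> parameter is assigned. This search fails too, which leaves the Initiator as the only way in. 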
<img src="https://img2024.cnblogs.com/blog/3369335/202403/3369335-20240318094703147-213403977.png"> 
A Promise is clearly visible in the call stack, which suggests axios interceptors, so search for <code>interceptors</code>. 
<img src="https://img2024.cnblogs.com/blog/3369335/202403/3369335-20240318095010377-1752517783.png"> 
Three hits again. The first is an assignment and cannot be the encryption logic, so look at the full logic around the second and third.
<details>
<summary>Click to view code</summary>
<pre><code>l.prototype.request = function(e) {
  "string" == typeof e ? (e = arguments[1] || {}).url = arguments[0] : e = e || {},
  (e = s(this.defaults, e)).method ? e.method = e.method.toLowerCase() : this.defaults.method ? e.method = this.defaults.method.toLowerCase() : e.method = "get";
  var t = [dispatchRequest, void 0] // dispatchRequest: name taken from the unminified axios source
    , n = Promise.resolve(e);
  for (this.interceptors.request.forEach((function(e) {
    t.unshift(e.fulfilled, e.rejected)
  }
  )),
  this.interceptors.response.forEach((function(e) {
    t.push(e.fulfilled, e.rejected)
  }
  )); t.length; )
    n = n.then(t.shift(), t.shift());
  return n
}
</code></pre>
</details>
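The code first checks the type of <code>e</code> and reassigns it, then declares <code>t</code> as an array and <code>n</code> as a Promise. The <code>for</code> statement then runs two <code>forEach</code> traversals: the request interceptors are inserted at the head of <code>t</code> and the response interceptors are appended to its tail. After both traversals <code>t</code> holds 6 entries. Finally, the loop keeps shifting two entries off the head of <code>t</code> and passing them to the Promise's <code>then</code>. 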
<img src="https://img2024.cnblogs.com/blog/3369335/202403/3369335-20240318101004821-41060679.png"> 
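The queue-building logic described above can be reproduced outside axios. The sketch below is not the site's code: <code>signRequest</code>, <code>fakeDispatch</code>, and <code>unwrapResponse</code> are hypothetical stand-ins for the request interceptor, the network call, and the response interceptor. It only shows why one interceptor of each kind yields a 6-entry queue and in which order the handlers run.

```javascript
// Build the handler queue the same way the request method above does.
function buildQueue(requestInterceptors, responseInterceptors, dispatch) {
  const queue = [dispatch, undefined];
  requestInterceptors.forEach(function (h) {
    queue.unshift(h.fulfilled, h.rejected); // request handlers go to the front
  });
  responseInterceptors.forEach(function (h) {
    queue.push(h.fulfilled, h.rejected); // response handlers go to the back
  });
  return queue;
}

// Consume the queue two entries at a time: (onFulfilled, onRejected).
function runQueue(config, queue) {
  const pending = queue.slice();
  let promise = Promise.resolve(config);
  while (pending.length) {
    promise = promise.then(pending.shift(), pending.shift());
  }
  return promise;
}

// Hypothetical stand-ins for the site's interceptors and the network call.
function signRequest(config) {
  config.params.analysis = "signed";
  return config;
}
function fakeDispatch(config) {
  return { data: "payload for " + config.params.analysis };
}
function unwrapResponse(response) {
  return response.data;
}

const queue = buildQueue(
  [{ fulfilled: signRequest, rejected: undefined }],
  [{ fulfilled: unwrapResponse, rejected: undefined }],
  fakeDispatch
);
const outcome = runQueue({ params: {} }, queue);
outcome.then(function (data) {
  console.log(queue.length, data);
});
```

The request handler sits at index 0 of the queue, which is why the search for the signing logic starts with the first entry.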
<code>then</code> takes two arguments: the first runs on success, the second on failure. So if there is encryption logic here, it must be in the first entry of <code>t</code>. Jump to its definition. 
<img src="https://img2024.cnblogs.com/blog/3369335/202403/3369335-20240318101559471-123203974.png"> 
<img src="https://img2024.cnblogs.com/blog/3369335/202403/3369335-20240318101721570-1717398774.png"> 
The values of the following three variables also confirm this is the right place. 
<img src="https://img2024.cnblogs.com/blog/3369335/202403/3369335-20240318102216616-2133276278.png"> 
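This code is full of junk instructions (obfuscation) that have to be undone first. Set breakpoints and step through; the restored code is below. (The function passed to catch can be ignored.)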
<details>
<summary>Click to view code</summary>
<pre><code>function fn(t) {
  var n;
  n = i["ej"]("synct"),
  s = c["default"]["prototype"]["difftime"] = -i["ej"]("syncd") || +new z["Date"] - 1000 * n;
  var e, r = +new z["Date"] - (s || 0) - 1661224081041, a = [];
  return void 0 === t["params"] && (t["params"] = {}),
    z["Object"]["keys"](t["params"])["forEach"](function (n) {
      if (n == "analysis")
        return !1;
      t["params"]["hasOwnProperty"](n) && a["push"](t["params"][n])
    }),
    a = a["sort"]()["join"](""),
    a = i["cv"](a),
    a = (a += "@#" + t["url"]["replace"](t["baseURL"], "")) + ("@#" + r) + ("@#" + 3),
    e = i["cv"](i["oZ"](a, "xyz517cda96efgh")),
    -B == t["url"]["indexOf"]("analysis") && (t["url"] += (-B != t["url"]["indexOf"]("?") ? "&" : "?") + "analysis" + "=" + z["encodeURIComponent"](e)),
    t
}
</code></pre>
</details>
Now analyze this code. 
<code>n = i["ej"]("synct")</code> reads the value of <code>synct</code> from the cookie. 
<img src="https://img2024.cnblogs.com/blog/3369335/202403/3369335-20240318104127563-626254237.png"> 
<code>s = c["default"]["prototype"]["difftime"] = -i["ej"]("syncd") || +new z["Date"] - 1000 * n;</code> reads <code>syncd</code> from the cookie and negates it; if the cookie has no <code>syncd</code>, then <code>s = new Date() - 1000 * n</code>. 
<img src="https://img2024.cnblogs.com/blog/3369335/202403/3369335-20240318104521612-965415977.png"> 
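<code>r = +new z["Date"] - (s || 0) - 1661224081041</code> computes a time offset. Because <code>s</code> is not a fixed value, it can simply be hard-coded, which makes the two lines above unnecessary. 
<code>void 0 === t["params"] && (t["params"] = {})</code> evaluates to false here, because <code>t["params"]</code> is already defined. 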
<img src="https://img2024.cnblogs.com/blog/3369335/202403/3369335-20240318104945446-906324797.png"> 
<code>z["Object"]["keys"](t["params"])["forEach"](function (n) { if (n == "analysis") return !1; t["params"]["hasOwnProperty"](n) && a["push"](t["params"][n]) })</code> iterates over all keys of <code>t["params"]</code> and pushes the corresponding values into the array <code>a</code>. <code>z</code> is the window object, so <code>z["Object"]["keys"]</code> is the same as <code>Object["keys"]</code>. 
<code>a = a["sort"]()["join"]("")</code> sorts the array <code>a</code> and joins it with an empty string. 
<code>a = i["cv"](a)</code> requires knowing what <code>i["cv"]</code> is; its source will simply be copied in later. 
<code>a = (a += "@#" + t["url"]["replace"](t["baseURL"], "")) + ("@#" + r) + ("@#" + 3)</code> concatenates more pieces onto <code>a</code>. 
<code>e = i["cv"](i["oZ"](a, "xyz517cda96efgh"))</code> is handled the same way: copy the source. 
<code>-B == t["url"]["indexOf"]("analysis") && (t["url"] += (-B != t["url"]["indexOf"]("?") ? "&" : "?") + "analysis" + "=" + z["encodeURIComponent"](e))</code> decides whether the parameter is attached to the URL with <code>&</code> or <code>?</code>. What matters here is the value of <code>e</code>, so the function can simply return <code>e</code>. 
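The last expression is easier to follow when written out as a plain function. This is a sketch under one assumption: that <code>-B</code> in the restored code evaluates to <code>-1</code>, the value <code>indexOf</code> returns when nothing is found. The name <code>appendAnalysis</code> is made up for illustration.

```javascript
// Append the analysis token to a URL unless the URL already carries one.
function appendAnalysis(url, token) {
  if (url.indexOf("analysis") !== -1) {
    return url; // already signed, leave untouched
  }
  // Use "&" when a query string already exists, otherwise start one with "?".
  const separator = url.indexOf("?") !== -1 ? "&" : "?";
  return url + separator + "analysis=" + encodeURIComponent(token);
}

console.log(appendAnalysis("/rank", "a+b="));      // /rank?analysis=a%2Bb%3D
console.log(appendAnalysis("/rank?page=2", "x"));  // /rank?page=2&analysis=x
```

Note the <code>encodeURIComponent</code> call: a Base64 token can contain <code>+</code>, <code>/</code>, and <code>=</code>, which would otherwise be misread inside a query string.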
The simplified code so far:
<details>
<summary>Click to view code</summary>
<pre><code>function fn(t) {
  // The unary + turns the Date into a number; without it, "+ 226" would concatenate strings
  var e, r = +new Date() + 226 - 1661224081041, a = [];
  return false,
    Object["keys"](t["params"])["forEach"](function (n) {
      if (n == "analysis")
        return !1;
      t["params"]["hasOwnProperty"](n) && a["push"](t["params"][n])
    }),
    a = a["sort"]()["join"](""),
    a = i["cv"](a),
    a = (a += "@#" + t["url"]["replace"](t["baseURL"], "")) + ("@#" + r) + ("@#" + 3),
    e = i["cv"](i["oZ"](a, "xyz517cda96efgh")),
    e;
}
</code></pre>
</details>
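The next step is to fill in <code>i["cv"]</code>, <code>i["oZ"]</code>, and the other variables those two functions use, undoing junk instructions where needed. 
The completed and restored code: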
<details>
<summary>Click to view code</summary>
<pre><code>function o(n) {
  t = "",
  ['66', '72', '6f', '6d', '43', '68', '61', '72', '43', '6f', '64', '65']["forEach"](function(n) {
    t += unescape("%u00" + n)
  });
  var t, e = t; // e === "fromCharCode"
  return String[e](n)
}

function u() {
  return unescape("861831832863830866861836861862839831831839862863839830865834861863837837830830837839836861835833"["replace"](/8/g, "%u00"))
}

var i = {
  // UTF-8 encode, then Base64
  cv: function v(t) {
    t = encodeURIComponent(t)["replace"](/%([0-9A-F]{2})/g, function(n, t) {
      return o("0x" + t)
    });
    try {
      return btoa(t)
    } catch (n) {
      return Buffer["from"](t)["toString"]("base64")
    }
  },
  // XOR every character with the key, starting 10 positions into the key
  oZ: function h(n, t) {
    t = t || u();
    for (var e = (n = n["split"](""))["length"], r = t["length"], a = "charCodeAt", i = 0; i < e; i++)
      n[i] = o(n[i][a](0) ^ t[(i + 10) % r][a](0));
    return n["join"]("")
  }
};

function fn(t) {
  var e, r = +new Date() + 226 - 1661224081041, a = [];
  return false,
    Object["keys"](t["params"])["forEach"](function (n) {
      if (n == "analysis")
        return !1;
      t["params"]["hasOwnProperty"](n) && a["push"](t["params"][n])
    }),
    a = a["sort"]()["join"](""),
    a = i["cv"](a),
    a = (a += "@#" + t["url"]["replace"](t["baseURL"], "")) + ("@#" + r) + ("@#" + 3),
    e = i["cv"](i["oZ"](a, "xyz517cda96efgh")),
    e;
}
</code></pre>
</details>
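The two obfuscated helpers <code>o</code> and <code>u</code> can be checked in isolation. The sketch below decodes their embedded constants without <code>unescape</code>; the variable names are made up for illustration, and the two strings are copied from the code above.

```javascript
// Hex codes from o(): each pair is one character code.
const hexCodes = ['66', '72', '6f', '6d', '43', '68', '61', '72', '43', '6f', '64', '65'];
const methodName = hexCodes
  .map(function (h) { return String.fromCharCode(parseInt(h, 16)); })
  .join("");

// Digit string from u(): every group is "8" followed by a two-digit hex code,
// which is why replacing each "8" with "%u00" yields an unescape-able string.
const packed = "861831832863830866861836861862839831831839862863839830865834861863837837830830837839836861835833";
const defaultKey = packed
  .match(/.{3}/g)
  .map(function (g) { return String.fromCharCode(parseInt(g.slice(1), 16)); })
  .join("");

console.log(methodName); // fromCharCode
console.log(defaultKey); // a12c0fa6ab9119bc90e4ac7700796a53
```

So <code>o(n)</code> is just <code>String.fromCharCode(n)</code>, and <code>u()</code> returns a fixed 32-character default key that is only used when no key is passed in.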
Test it.
<details>
<summary>Click to view code</summary>
<pre><code>var t = {
"url": "/rank/indexPlus/brand_id/1",
"method": "get",
"headers": {
 "common": {
 "Accept": "application/json, text/plain, */*"
 },
 "delete": {},
 "get": {},
 "head": {},
 "post": {
 "Content-Type": "application/x-www-form-urlencoded"
 },
 "put": {
 "Content-Type": "application/x-www-form-urlencoded"
 },
 "patch": {
 "Content-Type": "application/x-www-form-urlencoded"
 }
},
"params": {},
"baseURL": "https://api.qimai.cn",
"transformRequest": [
 null
],
"transformResponse": [
 null
],
"timeout": 15000,
"withCredentials": true,
"xsrfCookieName": "XSRF-TOKEN",
"xsrfHeaderName": "X-XSRF-TOKEN",
"maxContentLength": -1,
"maxBodyLength": -1
}

console.log(fn(t));
</code></pre>
</details>
The output: 
<img src="https://img2024.cnblogs.com/blog/3369335/202403/3369335-20240318120625131-1486515163.png"> 
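The result is a string similar to the value of the <code>analysis</code> parameter, which indicates this is the encryption logic. The data can now be fetched with Python. The complete Python and JavaScript code follows. 
JavaScript code: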
<details>
<summary>Click to view code</summary>
<pre><code>function o(n) {
  t = "",
  ['66', '72', '6f', '6d', '43', '68', '61', '72', '43', '6f', '64', '65']["forEach"](function(n) {
    t += unescape("%u00" + n)
  });
  var t, e = t; // e === "fromCharCode"
  return String[e](n)
}

function u() {
  return unescape("861831832863830866861836861862839831831839862863839830865834861863837837830830837839836861835833"["replace"](/8/g, "%u00"))
}

var i = {
  // UTF-8 encode, then Base64
  cv: function v(t) {
    t = encodeURIComponent(t)["replace"](/%([0-9A-F]{2})/g, function(n, t) {
      return o("0x" + t)
    });
    try {
      return btoa(t)
    } catch (n) {
      return Buffer["from"](t)["toString"]("base64")
    }
  },
  // XOR every character with the key, starting 10 positions into the key
  oZ: function h(n, t) {
    t = t || u();
    for (var e = (n = n["split"](""))["length"], r = t["length"], a = "charCodeAt", i = 0; i < e; i++)
      n[i] = o(n[i][a](0) ^ t[(i + 10) % r][a](0));
    return n["join"]("")
  }
};

function fn(t) {
  var e, r = +new Date() + 226 - 1661224081041, a = [];
  return false,
    Object["keys"](t["params"])["forEach"](function (n) {
      if (n == "analysis")
        return !1;
      t["params"]["hasOwnProperty"](n) && a["push"](t["params"][n])
    }),
    a = a["sort"]()["join"](""),
    a = i["cv"](a),
    a = (a += "@#" + t["url"]["replace"](t["baseURL"], "")) + ("@#" + r) + ("@#" + 3),
    e = i["cv"](i["oZ"](a, "xyz517cda96efgh")),
    e;
}

// Entry point called from Python
function final(url, pm) {
  var params = {
    "url": url,
    "baseURL": "https://api.qimai.cn",
    "params": pm,
  };
  return fn(params);
}
</code></pre>
</details>
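For readers who find the restored code hard to scan, the same steps can be written with descriptive names. This is a sketch, not the site's code: <code>utf8ToBase64</code>, <code>xorWithKey</code>, and <code>buildAnalysis</code> are names chosen here, the path is passed in directly instead of being derived from <code>url</code> and <code>baseURL</code>, and the time offset <code>r</code> is a parameter so the result is reproducible. It assumes Node.js, where <code>Buffer</code> is available.

```javascript
// Base64 of the UTF-8 bytes of a string (what i["cv"] does).
function utf8ToBase64(text) {
  return Buffer.from(String(text), "utf8").toString("base64");
}

// XOR every character with the key, starting 10 positions into the key
// (what i["oZ"] does). Applying it twice restores the input.
function xorWithKey(text, key) {
  let out = "";
  for (let i = 0; i < text.length; i++) {
    out += String.fromCharCode(text.charCodeAt(i) ^ key.charCodeAt((i + 10) % key.length));
  }
  return out;
}

// params: query parameters, path: URL without the base URL, timeOffset: the value r.
function buildAnalysis(params, path, timeOffset) {
  const values = Object.keys(params)
    .filter(function (k) { return k !== "analysis"; })
    .map(function (k) { return params[k]; });
  const head = utf8ToBase64(values.sort().join(""));
  const plain = [head, path, timeOffset, 3].join("@#");
  return utf8ToBase64(xorWithKey(plain, "xyz517cda96efgh"));
}

const token = buildAnalysis({ b: "2", a: "1" }, "/rank", 100);
// Undo the two outer steps to inspect what was signed.
const recovered = xorWithKey(Buffer.from(token, "base64").toString("utf8"), "xyz517cda96efgh");
console.log(recovered); // MTI=@#/rank@#100@#3
```

Because XOR with the same key is its own inverse, any captured <code>analysis</code> value can be unpacked the same way to see which parameters and which time offset went into it.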
Python code:
<details>
<summary>Click to view code</summary>
<pre><code>import subprocess
from functools import partial

# Force UTF-8 on the subprocess that execjs spawns; must run before importing execjs
subprocess.Popen = partial(subprocess.Popen, encoding="utf-8")

import execjs
import json
import requests

with open("拦截器逻辑二.js", mode="r", encoding="utf-8") as f:
    js = execjs.compile(f.read())

data = {
    "brand": "all",
    "country": "cn",
    "date": "2024-03-18",
    "device": "iphone",
    "genre": "36",
    "page": 2,
}

host = "https://api.qimai.cn"
url = "/rank/indexPlus/brand_id/1"
analysis = js.call("final", url, data)
final_url = host+url+"?analysis=" + analysis
# print(final_url)

session = requests.session()
session.headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/122.0.0.0 "
                  "Safari/537.36",
}
# Load the initial cookies
session.get("https://www.qimai.cn/rank")

# Testing showed this cookie makes no difference
session.cookies["qm_check"] = "A1sdRUIQChtxen8pI0dAMRcOUFseEHBeQF0JTjVBWCwycRd1QlhAXFEGFUdeS0laHQdKAAkABAsgXyVBWD0TR1JRRAp0BQlFEBQ3TSZKFUdBbwxvBBRFIlQsSUhTFxsQU1FVV1NHXEVYVElWBRsCHAkSSQ%3D%3D"

# Send the request
resp = session.get(final_url)
# Turn \uXXXX escapes in the body into readable characters
decoded_text = bytes(resp.text, 'utf-8').decode('unicode_escape')
print(decoded_text)
</details>
Running the Python code gives: 
<img src="https://img2024.cnblogs.com/blog/3369335/202403/3369335-20240318121436037-91554695.png"> 
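The data was retrieved successfully.
<h1 id="补充">Addendum</h1>
The Python code sets a cookie. Testing showed this cookie is not needed, but for learning purposes its encryption logic is worth a look. 
Search globally for <code>qm_check</code>. 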
<img src="https://img2024.cnblogs.com/blog/3369335/202403/3369335-20240318133813382-2132141647.png"> 
Nothing is found. This calls for the webhook tool covered in the previous installment: choose <code>Hook Setcookie</code> and refresh the page. 
<img src="https://img2024.cnblogs.com/blog/3369335/202403/3369335-20240318134028762-1201446886.png"> 
<code>val</code> does not contain <code>qm_check</code> yet; keep resuming execution until <code>qm_check</code> shows up. 
<img src="https://img2024.cnblogs.com/blog/3369335/202403/3369335-20240318134131155-386712871.png"> 
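A cookie hook of this kind is commonly built by redefining the <code>cookie</code> property with an accessor. The sketch below is an assumption about how such a tool works, not the tool's actual source; <code>hookCookie</code> is a made-up name, and a plain object stands in for <code>document</code> so the snippet also runs outside a browser.

```javascript
// Replace the cookie property on `doc` with an accessor that reports matching writes.
function hookCookie(doc, keyword, onHit) {
  let stored = "";
  Object.defineProperty(doc, "cookie", {
    configurable: true,
    get: function () {
      return stored;
    },
    set: function (value) {
      if (String(value).indexOf(keyword) !== -1) {
        // In DevTools, a `debugger;` statement here would pause with the
        // code that wrote the cookie one frame up on the Call Stack.
        onHit(value);
      }
      stored = value; // simplification: a real cookie jar merges entries
    }
  });
}

// Demo on a plain object standing in for `document`.
const fakeDocument = {};
const hits = [];
hookCookie(fakeDocument, "qm_check", function (value) {
  hits.push(value);
});
fakeDocument.cookie = "synct=1710727000";
fakeDocument.cookie = "qm_check=abc";
console.log(hits); // [ 'qm_check=abc' ]
```

Writes that do not mention the keyword pass through silently, which matches the experience of resuming several times before the interesting write appears.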
The <code>qm_check</code> here is already encrypted. To find the encryption logic, walk back up the Call Stack. 
<img src="https://img2024.cnblogs.com/blog/3369335/202403/3369335-20240318134250401-752570423.png">
<img src="https://img2024.cnblogs.com/blog/3369335/202403/3369335-20240318134340064-660968428.png"> 
The result of <code>v(p(z(n), s))</code> is the encrypted string, and <code>n</code> and <code>s</code> are both plaintext, so the encryption logic must involve <code>v</code>, <code>p</code>, and <code>z</code>. 
<img src="https://img2024.cnblogs.com/blog/3369335/202403/3369335-20240318134550557-216476847.png"> 
Find the implementations of these functions. 
<code>z</code> is equivalent to <code>JSON.stringify</code>. 
<code>p</code> is implemented as follows. 
<img src="https://img2024.cnblogs.com/blog/3369335/202403/3369335-20240318134902898-368800085.png"> 
After removing the junk instructions: (u() produces a fixed string)
<details>
<summary>Click to view code</summary>
<pre><code>function p(n, t) {
  t = t || 'a12c0fa6ab9119bc90e4ac7700796a53';
  for (var e = (n = n["split"](""))["length"], r = t["length"], a = "charCodeAt", i = 0; i < e; i++)
    n[i] = o(n[i][a](0) ^ t[(i + r) % r][a](0));
  return n["join"]("")
}
}
</code></pre>
</details>
<code>v</code> is implemented as follows. 
<img src="https://img2024.cnblogs.com/blog/3369335/202403/3369335-20240318135214664-1953984148.png"> 
After removing the junk instructions:
<details>
<summary>Click to view code</summary>
<pre><code>function v(t) {
  t = encodeURIComponent(t)["replace"](/%([0-9A-F]{2})/g, function(n, t) {
    return o("0x" + t)
  });
  try {
    return btoa(t)
  } catch (n) {
    return Buffer["from"](t)["toString"]("base64")
  }
}
</code></pre>
</details>
Add the other functions these two depend on; the complete code:
<details>
<summary>Click to view code</summary>
<pre><code>function p(n, t) {
  t = t || 'a12c0fa6ab9119bc90e4ac7700796a53';
  for (var e = (n = n["split"](""))["length"], r = t["length"], a = "charCodeAt", i = 0; i < e; i++)
    n[i] = o(n[i][a](0) ^ t[(i + r) % r][a](0));
  return n["join"]("")
}

function v(t) {
  t = encodeURIComponent(t)["replace"](/%([0-9A-F]{2})/g, function(n, t) {
    return o("0x" + t)
  });
  try {
    return btoa(t)
  } catch (n) {
    return Buffer["from"](t)["toString"]("base64")
  }
}

function o(n) {
  t = "",
  ['66', '72', '6f', '6d', '43', '68', '61', '72', '43', '6f', '64', '65']["forEach"](function(n) {
    t += unescape("%u00" + n)
  });
  var t, e = t; // e === "fromCharCode"
  return String[e](n)
}

var n = {
  "gpu": "ANGLE (Intel, Intel(R) UHD Graphics 630 (0x00003E9B) Direct3D11 vs_5_0 ps_5_0, D3D11)",
  "check": "0,0,0,0,0"
};
var s = "xyz57209048efgh";
console.log(v(p(JSON.stringify(n),s)));
</code></pre>
</details>
Running it gives: 
<img src="https://img2024.cnblogs.com/blog/3369335/202403/3369335-20240318140533845-811327157.png"> 
The value stored by the page: 
<img src="https://img2024.cnblogs.com/blog/3369335/202403/3369335-20240318140601157-718243304.png"> 
The two values are identical, so the encryption logic is correct.
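The match can also be checked in the other direction, by decoding a stored value back to plaintext. The sketch below assumes Node.js and uses made-up names (<code>xorWithKey</code>, <code>decodeCheck</code>). Since <code>(i + r) % r</code> equals <code>i % r</code>, the key is applied from its first character. The input must be URL-decoded first; the eight characters used here are the start of the <code>qm_check</code> value from the Python script.

```javascript
// XOR each character with the key; applying it twice restores the input.
function xorWithKey(text, key) {
  let out = "";
  for (let i = 0; i < text.length; i++) {
    out += String.fromCharCode(text.charCodeAt(i) ^ key.charCodeAt(i % key.length));
  }
  return out;
}

// Inverse of v(p(...)): Base64-decode first, then XOR with the same key.
function decodeCheck(value, key) {
  const raw = Buffer.from(value, "base64").toString("utf8");
  return xorWithKey(raw, key);
}

const key = "xyz57209048efgh";
// First eight characters of the qm_check value used in the Python script.
const prefix = decodeCheck("A1sdRUIQ", key);
console.log(prefix); // {"gpu"
```

The decoded prefix is the opening of the JSON object passed to <code>v(p(...))</code>, which is consistent with the key <code>s</code> and the plaintext <code>n</code> shown above.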
<h1 id="完结">The End</h1> 
Source: https://www.cnblogs.com/sbhglqy/p/18080090
