A Detailed Guide to httpx, a Python HTTP Request Library

Posted by 古巷幼猫 on 2021-9-10 09:59:00

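<h1 id="简介">Introduction</h1>
<p>httpx is a next-generation HTTP client library for Python. Its main features are:</p>
<ul>
<li>A full-featured HTTP request module built for Python 3</li>
<li>Support for both synchronous and asynchronous requests</li>
<li>Support for HTTP/1.1 and HTTP/2</li>
<li>The ability to send requests directly to WSGI or ASGI applications</li>
</ul>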
<h1 id="安装">Installation</h1>
<p>httpx requires Python 3.6+ (asynchronous requests require Python 3.8+).</p>
<pre><code class="language-shell">pip3 install httpx
# or
python3 -m pip install httpx
</code></pre>
<p>To use HTTP/2, install httpx together with its optional HTTP/2 dependencies:</p>
<pre><code class="language-shell">pip3 install 'httpx[http2]'
# or
python3 -m pip install 'httpx[http2]'
</code></pre>
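<p>As a rough sketch of how the optional dependency might be checked before enabling HTTP/2: the <code>http2</code> extra pulls in the <code>h2</code> package, and <code>httpx.Client</code> accepts an <code>http2</code> flag. The helper names <code>http2_available</code> and <code>make_client</code> are hypothetical and not part of httpx.</p>

```python
import importlib.util


def http2_available() -> bool:
    """Return True if the optional `h2` package is importable."""
    return importlib.util.find_spec("h2") is not None


def make_client():
    """Build a client that enables HTTP/2 only when its dependency is present."""
    import httpx  # imported lazily so http2_available() also works without httpx

    return httpx.Client(http2=http2_available())


if __name__ == "__main__":
    print("HTTP/2 support installed:", http2_available())
```

<p>After a request made through such a client, <code>r.http_version</code> reports which protocol version was actually negotiated with the server.</p>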
<h1 id="使用">Usage</h1>
<h2 id="简单使用">Basic usage</h2>
<p>The basic httpx API is almost identical to that of the <code>requests</code> library:</p>
<pre><code class="language-python">import httpx

r = httpx.get('https://httpbin.org/get')
print(r)  # <Response [200 OK]>
</code></pre>
<p>Similarly, the <code>POST</code>, <code>PUT</code>, <code>DELETE</code>, <code>HEAD</code> and <code>OPTIONS</code> request methods are available:</p>
<pre><code class="language-python">r = httpx.post('https://httpbin.org/post', data={'key': 'value'})
r = httpx.put('https://httpbin.org/put', data={'key': 'value'})
r = httpx.delete('https://httpbin.org/delete')
r = httpx.head('https://httpbin.org/get')
r = httpx.options('https://httpbin.org/get')
</code></pre>
<p>A request with custom headers and query parameters:</p>
<pre><code class="language-python">import httpx

headers = {'user-agent': 'my-app/1.0.0'}
params = {'key1': 'value1', 'key2': 'value2'}
url = 'https://httpbin.org/get'
r = httpx.get(url, headers=headers, params=params)
print(r)
print(r.status_code)  # status code
print(r.encoding)  # text encoding
print(r.text)
print(r.json())
</code></pre>
<p>The output is:</p>
<pre><code><Response [200 OK]>
200
ascii
{
  "args": {
    "key1": "value1",
    "key2": "value2"
  },
  "headers": {
    "Accept": "*/*",
    "Accept-Encoding": "gzip, deflate",
    "Host": "httpbin.org",
    "User-Agent": "my-app/1.0.0",
    "X-Amzn-Trace-Id": "Root=1-6139b788-2fd67d5627a5f6de346e154a"
  },
  "origin": "113.110.227.200",
  "url": "https://httpbin.org/get?key1=value1&key2=value2"
}

{'args': {'key1': 'value1', 'key2': 'value2'}, 'headers': {'Accept': '*/*', 'Accept-Encoding': 'gzip, deflate', 'Host': 'httpbin.org', 'User-Agent': 'my-app/1.0.0', 'X-Amzn-Trace-Id': 'Root=1-6139b788-2fd67d5627a5f6de346e154a'}, 'origin': '113.110.227.200', 'url': 'https://httpbin.org/get?key1=value1&key2=value2'}
</code></pre>
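<p>To see where the <code>url</code> field in the output comes from, the query string can be reproduced with the standard library. This is only an illustration built on <code>urllib.parse</code>, not httpx's internal code, and <code>build_url</code> is a hypothetical helper name.</p>

```python
from urllib.parse import urlencode


def build_url(base: str, params: dict) -> str:
    """Append params to base as a URL-encoded query string."""
    if not params:
        return base
    # Use '&' if the base URL already carries a query string.
    separator = "&" if "?" in base else "?"
    return base + separator + urlencode(params)


url = build_url("https://httpbin.org/get", {"key1": "value1", "key2": "value2"})
print(url)  # https://httpbin.org/get?key1=value1&key2=value2
```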
<p>A request with cookies:</p>
<pre><code class="language-python">import httpx

url = 'http://httpbin.org/cookies'
cookies = {'color': 'green'}
r = httpx.get(url, cookies=cookies)
print(r.json())# {'cookies': {'color': 'green'}}
</code></pre>
<p>Setting a timeout:</p>
<pre><code class="language-python">import httpx

r = httpx.get('http://httpbin.org', timeout=0.001)
print(r)
</code></pre>
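<p>If the request takes longer than the given time, httpx raises a timeout exception; with the tiny value above it is <code>httpx.ConnectTimeout: timed out</code>.</p>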
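<p>One way a caller might react to such a timeout is to retry a few times before giving up. The sketch below is a generic helper, not an httpx feature: <code>call_with_retries</code> is a hypothetical name, and a simulated <code>TimeoutError</code> stands in for the network so the example runs offline.</p>

```python
import time


def call_with_retries(call, exceptions, retries=3, delay=0.0):
    """Run call(); on one of `exceptions`, try again, up to `retries` attempts in total."""
    for attempt in range(1, retries + 1):
        try:
            return call()
        except exceptions:
            if attempt == retries:
                raise  # out of attempts: let the caller see the error
            time.sleep(delay)


# Simulated flaky call: times out twice, then succeeds.
attempts = []


def flaky():
    attempts.append(1)
    if len(attempts) < 3:
        raise TimeoutError("simulated timeout")
    return "ok"


print(call_with_retries(flaky, (TimeoutError,)))  # ok
```

<p>With httpx, the same helper could presumably be used as <code>call_with_retries(lambda: httpx.get(url, timeout=1.0), (httpx.TimeoutException,))</code>, since <code>httpx.TimeoutException</code> is the base class of <code>httpx.ConnectTimeout</code>.</p>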
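<h2 id="高级用法">Advanced usage</h2>
<p>With the top-level functions shown above, httpx has to establish a new connection for every request. As the number of requests grows, this becomes increasingly inefficient.</p>
<p>httpx provides <code>Client</code> to address this. <code>Client</code> is built on an HTTP connection pool, so when you send several requests to the same host it reuses the underlying TCP connection instead of opening a new one each time, which makes the program faster.</p>
<h3 id="使用client发送请求">Sending requests with a Client</h3>
<p>Create a client object and use it to send requests:</p>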
<pre><code class="language-python">import httpx

with httpx.Client() as client:
    headers = {'X-Custom': 'value'}
    r = client.get('https://example.com', headers=headers)
    print(r.text)
</code></pre>
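<h3 id="跨请求共享配置">Sharing configuration across requests</h3>
<p>Parameters such as <code>headers</code>, <code>cookies</code> and <code>params</code> can be passed to <code>httpx.Client()</code>; every request made through that <code>Client</code> then shares them.</p>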
<pre><code class="language-python">import httpx

headers1 = {'x-auth': 'from-client'}
params1 = {'client_id': '1234'}
url = 'https://example.com'
with httpx.Client(headers=headers1, params=params1) as client:
    headers2 = {'x-custom': 'from-request'}
    params2 = {'request_id': '4321'}
    r1 = client.get(url)
    print(r1.request.headers)
    r2 = client.get(url, headers=headers2, params=params2)
    print(r2.request.headers)
</code></pre>
<p>The output is:</p>
<pre><code>Headers({'host': 'example.com', 'accept': '*/*', 'accept-encoding': 'gzip, deflate', 'connection': 'keep-alive', 'user-agent': 'python-httpx/0.19.0', 'x-auth': 'from-client'})
Headers({'host': 'example.com', 'accept': '*/*', 'accept-encoding': 'gzip, deflate', 'connection': 'keep-alive', 'user-agent': 'python-httpx/0.19.0', 'x-auth': 'from-client', 'x-custom': 'from-request'})
</code></pre>
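<p>As the output shows, r1's request headers contain <code>{'x-auth': 'from-client'}</code>. For r2, which also passes headers2, the keys in headers1 and headers2 differ, so <code>Client</code> merges the two into a single set of headers. If both defined the same key, the value from headers2 would override the one from headers1.</p>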
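<p>The merge rule described above can be modelled with plain dictionaries. This is a simplified sketch, not httpx's actual implementation: <code>merge_headers</code> is a hypothetical function, and lower-casing the names is an assumption made here to mimic the fact that HTTP header names are case-insensitive.</p>

```python
def merge_headers(client_headers: dict, request_headers: dict) -> dict:
    """Merge two header dicts; request-level values win on a name clash."""
    merged = {name.lower(): value for name, value in client_headers.items()}
    merged.update({name.lower(): value for name, value in request_headers.items()})
    return merged


client_level = {"x-auth": "from-client", "user-agent": "my-app/1.0.0"}
request_level = {"X-Custom": "from-request", "User-Agent": "my-app/2.0.0"}
print(merge_headers(client_level, request_level))
# {'x-auth': 'from-client', 'user-agent': 'my-app/2.0.0', 'x-custom': 'from-request'}
```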
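<h3 id="http代理">HTTP proxies</h3>
<p>httpx can route requests through an HTTP proxy via the <code>proxies</code> parameter. Different proxies can be used for http and https requests. Suppose we have the following two proxies:</p>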
<pre><code class="language-python">import httpx

proxies = {
    'http://': 'http://localhost:8080',  # proxy 1
    'https://': 'http://localhost:8081',  # proxy 2
}
url = 'https://example.com'
with httpx.Client(proxies=proxies) as client:
    r1 = client.get(url)
    print(r1)
</code></pre>
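<p>The proxies above are placeholders; in practice, replace them with working proxy addresses.</p>
<p>Note also that in the httpx version used here (0.19.0, per the user-agent shown earlier), <code>proxies</code> is accepted by <code>httpx.Client()</code> but not by <code>client.get()</code>. Later httpx releases changed how proxies are configured, so check the documentation for the version you have installed.</p>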
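<h3 id="超时处理">Timeouts</h3>
<p>By default httpx enforces strict timeouts everywhere. The default is 5 seconds; if there is no response within that time, a <code>TimeoutException</code> is raised. The timeout can be changed per request:</p>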
<pre><code class="language-python">import httpx

# top-level function:
httpx.get('http://example.com/api/v1/example', timeout=10.0)

# client instance:
with httpx.Client() as client:
    client.get("http://example.com/api/v1/example", timeout=10.0)
</code></pre>
<p>Timeouts can also be disabled:</p>
<pre><code class="language-python">import httpx

# top-level function:
httpx.get('http://example.com/api/v1/example', timeout=None)

# client instance:
with httpx.Client() as client:
    client.get("http://example.com/api/v1/example", timeout=None)
</code></pre>
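<h3 id="ssl验证">SSL verification</h3>
<p>When requesting an https URL, httpx has to verify the identity of the host it connects to, which it does by checking the server's certificate against a set of trusted CA certificates.</p>
<p>To use a custom CA bundle, pass its path in the <code>verify</code> parameter:</p>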
<pre><code class="language-python">import httpx

r = httpx.get("https://example.org", verify="path/to/client.pem")
</code></pre>
<p>Alternatively, SSL verification can be disabled entirely (not recommended):</p>
<pre><code class="language-python">import httpx

r = httpx.get("https://example.org", verify=False)
</code></pre>
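<p>Besides a file path or <code>False</code>, <code>verify</code> can also be given a standard-library <code>ssl.SSLContext</code>. The sketch below builds one; <code>build_ssl_context</code> and its <code>cafile</code> argument are hypothetical names chosen for illustration.</p>

```python
import ssl
from typing import Optional


def build_ssl_context(cafile: Optional[str] = None) -> ssl.SSLContext:
    """Create a verifying TLS context, optionally trusting an extra CA bundle."""
    context = ssl.create_default_context()  # verifies certificates and hostnames
    if cafile is not None:
        # Trust the given CA bundle in addition to the system defaults.
        context.load_verify_locations(cafile=cafile)
    return context


context = build_ssl_context()
print(context.verify_mode == ssl.CERT_REQUIRED, context.check_hostname)  # True True
```

<p>The resulting object would then be passed as <code>httpx.Client(verify=context)</code>, which keeps certificate checking switched on while still allowing a private CA.</p>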
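<h2 id="异步支持">Async support</h2>
<p>By default httpx uses the standard synchronous API. When needed, it also offers an asynchronous client.</p>
<p>For many concurrent requests, async is generally more efficient than multithreading and can bring a clear performance gain. The async model also makes long-lived network connections such as WebSockets practical.</p>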
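<p>Where the gain comes from can be shown without any network access. In this sketch <code>asyncio.sleep</code> stands in for the time spent waiting on a server; <code>fake_request</code> and the 0.05 second latency are assumptions made for illustration, not measurements of httpx.</p>

```python
import asyncio
import time


async def fake_request(delay: float = 0.05) -> int:
    """Stand-in for one HTTP request: wait without blocking, then return a status."""
    await asyncio.sleep(delay)
    return 200


async def run_sequential(n: int) -> float:
    """Await each request before starting the next one."""
    start = time.perf_counter()
    for _ in range(n):
        await fake_request()
    return time.perf_counter() - start


async def run_concurrent(n: int) -> float:
    """Start all requests at once and wait for them together."""
    start = time.perf_counter()
    await asyncio.gather(*(fake_request() for _ in range(n)))
    return time.perf_counter() - start


async def compare(n: int = 10):
    return await run_sequential(n), await run_concurrent(n)


if __name__ == "__main__":
    sequential, concurrent = asyncio.run(compare())
    print(f"sequential: {sequential:.2f}s, concurrent: {concurrent:.2f}s")
```

<p>The sequential run takes about n times the latency, while the concurrent run takes about one latency, because the waiting periods overlap.</p>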
<h3 id="异步请求">Async requests</h3>
<p>Asynchronous code uses the async/await syntax together with an <code>httpx.AsyncClient()</code> object:</p>
<pre><code class="language-python">import asyncio
import httpx

async def main():
    async with httpx.AsyncClient() as client:  # create an async client
        r = await client.get('https://www.example.com/')
        print(r)

if __name__ == '__main__':
    asyncio.run(main())

</code></pre>
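<h3 id="同步请求与异步请求的比较">Comparing synchronous and asynchronous requests</h3>
<p>Let's send the same set of requests synchronously and asynchronously and compare how long each takes.</p>
<p><strong>Synchronous requests</strong></p>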
<pre><code class="language-python">import time
import httpx

def main():
    with httpx.Client() as client:
        for i in range(300):
            res = client.get('https://www.example.com')
            print(f'Request {i + 1}: status_code = {res.status_code}')

if __name__ == '__main__':
    start = time.time()
    main()
    end = time.time()
    print(f'300 synchronous requests took {end - start} seconds')
</code></pre>
<p>300 synchronous requests took 49.65340781211853 seconds</p>
<p><strong>Asynchronous requests</strong></p>
<pre><code class="language-python">import asyncio
import time
import httpx

async def req(client, i):
    res = await client.get('https://www.example.com')
    print(f'Request {i + 1}: status_code = {res.status_code}')
    return res

async def main():
    async with httpx.AsyncClient() as client:
        task_list = []  # list of tasks
        for i in range(300):
            res = req(client, i)
            task = asyncio.create_task(res)  # schedule the coroutine as a task
            task_list.append(task)
        await asyncio.gather(*task_list)  # wait for all tasks to finish

if __name__ == '__main__':
    start = time.time()
    asyncio.run(main())
    end = time.time()
    print(f'300 asynchronous requests took {end - start} seconds')
</code></pre>
<p>300 asynchronous requests took 2.5227813720703125 seconds. (Because the requests run concurrently, the printed values of i appear out of order.)</p>
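<p>Although the tasks finish in an unpredictable order, <code>asyncio.gather</code> returns their results in the order the tasks were passed in. The sketch below shows this with simulated delays; <code>fake_request</code> and the delay values are assumptions for illustration.</p>

```python
import asyncio


async def fake_request(i: int, delay: float, finished: list) -> int:
    """Simulate a request that takes `delay` seconds and record when it finishes."""
    await asyncio.sleep(delay)
    finished.append(i)
    return i


async def main():
    finished = []  # indices in the order the tasks actually completed
    delays = [0.15, 0.05, 0.10]
    tasks = [
        asyncio.create_task(fake_request(i, delay, finished))
        for i, delay in enumerate(delays)
    ]
    results = await asyncio.gather(*tasks)  # results follow the order of `tasks`
    return results, finished


if __name__ == "__main__":
    results, finished = asyncio.run(main())
    print("gather results:  ", results)
    print("completion order:", finished)
```

<p>So if the responses need to be matched to their requests, the return value of <code>gather</code> can be used directly rather than relying on the print order.</p>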
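<p>In these two runs the asynchronous version finished roughly 20 times faster (about 2.5 s versus 49.7 s), because it does not wait for one response before sending the next request.</p>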
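<p>Starting hundreds of requests at once can put a heavy load on the target server. One common way to cap the number of requests in flight is an <code>asyncio.Semaphore</code>. The sketch below uses a simulated request so that it runs offline; <code>bounded_fetch</code>, the limit of 5 and the <code>stats</code> dictionary are assumptions for illustration.</p>

```python
import asyncio


async def bounded_fetch(i: int, semaphore: asyncio.Semaphore, stats: dict) -> int:
    """Run one simulated request while holding a semaphore slot."""
    async with semaphore:
        stats["in_flight"] += 1
        stats["peak"] = max(stats["peak"], stats["in_flight"])
        await asyncio.sleep(0.01)  # stand-in for `await client.get(url)`
        stats["in_flight"] -= 1
    return i


async def main(total: int = 30, limit: int = 5):
    semaphore = asyncio.Semaphore(limit)  # at most `limit` requests at a time
    stats = {"in_flight": 0, "peak": 0}
    results = await asyncio.gather(
        *(bounded_fetch(i, semaphore, stats) for i in range(total))
    )
    return results, stats["peak"]


if __name__ == "__main__":
    results, peak = asyncio.run(main())
    print(f"{len(results)} requests done, peak concurrency: {peak}")
```

<p>httpx also has its own connection-pool settings (<code>httpx.Limits</code>), which may be the better tool when the goal is to limit connections rather than tasks.</p>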
<p>That covers the basics of httpx. For more, see the official httpx documentation.</p>
Source: https://www.cnblogs.com/blueberry-mint/p/15250125.html
