
Understanding the HTTP Protocol

一只要飞的猪 · Juejin · 2019-04-20 09:44

What is the HTTP protocol?

  • HTTP stands for HyperText Transfer Protocol.
  • The Hypertext Transfer Protocol (HTTP) is a stateless application-level protocol for distributed, collaborative, hypertext information systems (source: Wikipedia).

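Chrome Developer Tools

Press Ctrl+Shift+I to open Chrome DevTools; the Network panel lists every request and response the page makes.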

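Accessing a website with curl

The -v flag prints the request and response headers to stderr, while the response body is redirected to tmp.txt: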

curl -v http://baidu.com > tmp.txt

* Rebuilt URL to: http://baidu.com/
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0*   Trying 123.125.114.144...
* TCP_NODELAY set
* Connected to baidu.com (123.125.114.144) port 80 (#0)
> GET / HTTP/1.1
> Host: baidu.com
> User-Agent: curl/7.55.1
> Accept: */*
>
< HTTP/1.1 200 OK
< Date: Sat, 20 Apr 2019 08:15:07 GMT
< Server: Apache
< Last-Modified: Tue, 12 Jan 2010 13:48:00 GMT
< ETag: "51-47cf7e6ee8400"
< Accept-Ranges: bytes
< Content-Length: 81
< Cache-Control: max-age=86400
< Expires: Sun, 21 Apr 2019 08:15:07 GMT
< Connection: Keep-Alive
< Content-Type: text/html
<
{ [81 bytes data]
100    81  100    81    0     0     81      0  0:00:01 --:--:--  0:00:01   470
* Connection #0 to host baidu.com left intact

Request

> GET / HTTP/1.1
# Start line: method, path, protocol version
> Host: baidu.com
> User-Agent: curl/7.55.1
> Accept: */*
# Headers: key: value
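To make the request layout concrete, here is a minimal Python sketch that assembles the same bytes curl sent: a start line, the headers, then an empty line. The helper name build_request is hypothetical, and the sketch only builds the message; it does not open a connection.

```python
def build_request(method, path, host, headers=None):
    """Serialize an HTTP/1.1 request head: start line, headers, blank line."""
    lines = ["{} {} HTTP/1.1".format(method, path), "Host: {}".format(host)]
    for key, value in (headers or {}).items():
        lines.append("{}: {}".format(key, value))
    # Every line ends with CRLF; an empty line marks the end of the headers.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


raw = build_request("GET", "/", "baidu.com",
                    {"User-Agent": "curl/7.55.1", "Accept": "*/*"})
print(raw.decode("ascii"))
```

Writing these bytes to a TCP socket connected to port 80 of the host would be one way to send the request by hand.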

Response

< HTTP/1.1 200 OK
# Start line: protocol version, status code, reason phrase
< Date: Sat, 20 Apr 2019 08:15:07 GMT
< Server: Apache
< Last-Modified: Tue, 12 Jan 2010 13:48:00 GMT
< ETag: "51-47cf7e6ee8400"
< Accept-Ranges: bytes
< Content-Length: 81
< Cache-Control: max-age=86400
< Expires: Sun, 21 Apr 2019 08:15:07 GMT
< Connection: Keep-Alive
< Content-Type: text/html
# Headers: key: value
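The response head can be taken apart the same way. The sketch below is only an illustration: parse_response_head is a hypothetical helper, the sample is a shortened copy of the response above, and storing headers in a plain dict assumes no header name repeats.

```python
def parse_response_head(raw):
    """Split a response head into (version, status, reason, headers)."""
    head = raw.split(b"\r\n\r\n", 1)[0].decode("iso-8859-1")
    start_line, *header_lines = head.split("\r\n")
    version, status, reason = start_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        key, _, value = line.partition(":")
        # Header names are case-insensitive, so normalize them to lowercase.
        headers[key.strip().lower()] = value.strip()
    return version, int(status), reason, headers


sample = (b"HTTP/1.1 200 OK\r\n"
          b"Server: Apache\r\n"
          b"Content-Length: 81\r\n"
          b"Content-Type: text/html\r\n"
          b"\r\n")
version, status, reason, headers = parse_response_head(sample)
print(version, status, reason)
print(headers["content-length"])
```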

Message Body

<html>
<meta http-equiv="refresh" content="0;url=http://www.baidu.com/">
</html>
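Two things about this body can be checked with a few lines of Python: its size should equal the Content-Length header, and the meta refresh tag names the page the browser is sent to. The sketch assumes the body uses LF line endings with a trailing newline, which is consistent with the 81 bytes curl reported; the class name RefreshFinder is made up for this example.

```python
from html.parser import HTMLParser

body = (b'<html>\n'
        b'<meta http-equiv="refresh" content="0;url=http://www.baidu.com/">\n'
        b'</html>\n')


class RefreshFinder(HTMLParser):
    """Record the target URL of the first <meta http-equiv="refresh"> tag."""

    def __init__(self):
        super().__init__()
        self.target = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("http-equiv") or "").lower() == "refresh":
            # The content attribute looks like "<seconds>;url=<target>".
            _, _, rest = (attrs.get("content") or "").partition(";")
            name, _, url = rest.strip().partition("=")
            if name.strip().lower() == "url" and self.target is None:
                self.target = url.strip()


finder = RefreshFinder()
finder.feed(body.decode("ascii"))
print(len(body))      # 81, matching Content-Length in the response
print(finder.target)  # http://www.baidu.com/
```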

Simple example programs

  • urllib
  • requests
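  1. In Python 2, urllib and urllib2 are separate modules. Python 3 has no urllib2; use urllib.request (and urllib.error) instead.
  2. The requests library is built on urllib3, which can reuse a single socket across multiple requests.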
  • urllib
import urllib.request as urllib2


def use_simple_urllib2():
    url = 'http://httpbin.org/ip'
    response = urllib2.urlopen(url)
    print('>>>Response Headers')
    print(response.info())
    print('>>>Response Body')
    # read() returns bytes, so decode them to a string before printing
    print(response.read().decode())


use_simple_urllib2()

Output:
>>>Response Headers
Access-Control-Allow-Credentials: true
Access-Control-Allow-Origin: *
Content-Type: application/json
Date: Sat, 20 Apr 2019 08:38:52 GMT
Referrer-Policy: no-referrer-when-downgrade
Server: nginx
X-Content-Type-Options: nosniff
X-Frame-Options: DENY
X-XSS-Protection: 1; mode=block
Content-Length: 51
Connection: Close

>>>Response Body
{
  "origin": "122.205.61.100, 122.205.61.100"
}
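The body printed above is JSON, so it can be turned into a Python dict with the standard json module. This is a small sketch on a copy of that sample body; inside use_simple_urllib2 the string would come from response.read().decode().

```python
import json

# Sample body copied from the output above.
body_text = '{\n  "origin": "122.205.61.100, 122.205.61.100"\n}\n'

data = json.loads(body_text)
# In this sample the address is listed twice, separated by a comma.
addresses = [part.strip() for part in data["origin"].split(",")]
print(addresses)
```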

Passing query parameters with urllib:

import urllib.parse


def use_param_urllib2():
    url_get = 'http://httpbin.org/get'
    param = {'param1': 'hello', 'param2': 'world'}
    # urlencode() turns the dict into a query string of key=value pairs
    param = urllib.parse.urlencode(param)
    print('>>>Request Params')
    print(param)
    response = urllib2.urlopen(url_get + '?' + param)
    print('>>>Response Headers')
    print(response.info())
    print('>>>Status Code')
    print(response.getcode())
    print('>>>Response Body')
    # read() returns bytes, so decode them to a string before printing
    print(response.read().decode())


use_param_urllib2()

Output:
>>>Request Params
param2=world&param1=hello
>>>Response Headers
Access-Control-Allow-Credentials: true
Access-Control-Allow-Origin: *
Content-Type: application/json
Date: Sat, 20 Apr 2019 09:04:11 GMT
Referrer-Policy: no-referrer-when-downgrade
Server: nginx
X-Content-Type-Options: nosniff
X-Frame-Options: DENY
X-XSS-Protection: 1; mode=block
Content-Length: 299
Connection: Close

>>>Status Code
200
>>>Response Body
{
  "args": {
    "param1": "hello", 
    "param2": "world"
  }, 
  "headers": {
    "Accept-Encoding": "identity", 
    "Host": "httpbin.org", 
    "User-Agent": "Python-urllib/3.5"
  }, 
  "origin": "122.205.61.100, 122.205.61.100", 
  "url": "https://httpbin.org/get?param2=world&param1=hello"
}
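urlencode() does more than join keys and values: it also escapes characters that are not allowed in a query string. The sketch below uses a made-up value containing a space and an ampersand, then reverses the process with urlsplit() and parse_qs(). Nothing is sent over the network.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

params = {"param1": "hello", "param2": "hello world & more"}
query = urlencode(params)
url = "http://httpbin.org/get?" + query
print(url)

# Reverse the process: split the URL and decode the query string.
decoded = parse_qs(urlsplit(url).query)
print(decoded)
```

Note that parse_qs() returns a list for each key, because a key may appear more than once in a query string.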

  • requests
import requests


def use_simple_request():
    url = 'http://httpbin.org/ip'
    response = requests.get(url)
    print('>>>Response Headers')
    print(response.headers)
    print('>>>Response Body')
    # response.text is the body already decoded to a string
    print(response.text)


use_simple_request()
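
Passing query parameters with requests:

def use_param_request():
    # /ip ignores the query parameters; /get would echo them back under "args"
    url_get = 'http://httpbin.org/ip'
    param = {'param1': 'hello', 'param2': 'world'}
    print('>>>Request Params')
    print(param)
    # requests encodes the dict into the query string itself
    response = requests.get(url_get, params=param)
    print('>>>Response Headers')
    print(response.headers)
    print('>>>Status Code')
    print(response.status_code)
    print(response.reason)
    print('>>>Response Body')
    # json() parses the JSON body into a Python dict
    print(response.json())


use_param_request()

Output:
>>>Request Params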
{'param2': 'world', 'param1': 'hello'}
>>>Response Headers
{'Access-Control-Allow-Origin': '*', 'X-XSS-Protection': '1; mode=block', 'Content-Type': 'application/json', 'Access-Control-Allow-Credentials': 'true', 'X-Content-Type-Options': 'nosniff', 'Content-Length': '58', 'X-Frame-Options': 'DENY', 'Server': 'nginx', 'Date': 'Sat, 20 Apr 2019 09:13:01 GMT', 'Connection': 'keep-alive', 'Content-Encoding': 'gzip', 'Referrer-Policy': 'no-referrer-when-downgrade'}
>>>Status Code
200
OK
>>>Response Body
{'origin': '115.156.141.224, 115.156.141.224'}
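The note earlier that urllib3 lets several requests share one socket can be illustrated without any third-party package. The sketch below is not how requests is implemented; it is a standard-library stand-in that starts a throwaway local server (the Handler class is made up for the demo) and sends two requests over one http.client connection. With requests itself, this kind of reuse applies when the calls go through a single requests.Session.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class Handler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # enables keep-alive on the server side

    def do_GET(self):
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo output quiet


server = HTTPServer(("127.0.0.1", 0), Handler)  # port 0: pick any free port
threading.Thread(target=server.serve_forever, daemon=True).start()

conn = http.client.HTTPConnection("127.0.0.1", server.server_port)
local_ports = []
for _ in range(2):
    conn.request("GET", "/")
    response = conn.getresponse()
    response.read()  # drain the body before reusing the connection
    # The client-side port stays the same while the socket is reused.
    local_ports.append(conn.sock.getsockname()[1])

print(local_ports[0] == local_ports[1])
conn.close()
server.shutdown()
server.server_close()
```

Both requests report the same local port, which means the second one travelled over the socket opened for the first instead of paying for a new TCP handshake.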
