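Problem
I have a temperature and humidity sensor connected over Ethernet, with a built-in HTTP server. Opening it in a browser works without any issue, but Python's HTTP libraries (http, requests) all raise an error, and Apipost fails as well.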
```python
import requests

response = requests.get('http://192.168.31.188')
print(response.content)
```
```
ConnectionError: ('Connection aborted.', BadStatusLine('HTTP/1.1 0 -\r\n',))
```
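Approach
At first I assumed it was a request-header problem, but there is no reason for a small sensor to implement an elaborate security policy. Capturing the traffic with Wireshark then showed that its HTTP response looks like this. It already contains the data I need, but it cannot be processed correctly: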
```html
<!DOCTYPE HTML><html>
<head>
  <title>TH</title>
</head>
<body>
<div align="center" >
  <table style="width: 960px;" border="1" cellpadding="2" cellspacing="2">
    <tbody>
      <tr>
        <td width="250" style="font-family: Verdana; font-weight: bold; font-style: italic; background-color: rgb(51, 255, 255); text-align: center;">
          <span style="font-weight: bold;">
            <h4 align="center">
              <big style="font-weight: bold;"><strong>
                <span style="font-style: italic;"><br/>Air Conditioning Controller</span>
              </strong></big>
              <small style="font-family: Verdana;">
                <big style="font-weight: bold;">
                  <span style="color: rgb(51, 51, 255);"><br/>Ethernet IP Type</span>
                </big>
              </small>
            </h4>
          </span>
        </td>
      </tr>
    </tbody>
  </table>
  <form action="menu.html" method="GET" name="sysconfig">
    <table style="width: 960px; " border="1" cellpadding="2" cellspacing="2">
      <tbody>
        <tr>
          <td width="250" style="font-family: Verdana; font-weight: bold; font-style: italic; background-color: rgb(51, 255, 255); text-align: center;">
            <big>
              <p>Temprature: 26.7 ℃</p>
              <p>Humidity: 42.9 %</p>
              <p>Lastcmd: 0</p>
              <p>Device ID: 1482184792</p>
              <p>Bus Addr: 1</p>
              <input id="submit" type="submit" name="checkinSubmit" value="Click to enter the configuration menu list" />
              <small><br/> </small>
            </big>
          </td>
        </tr>
      </tbody>
    </table>
  </form>
  <hr/>
  <font class="footmsg">
    <span style="color: silver;">All Rights Reserved@2023</span>
  </font>
</div>
</body>
</html>
```
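For comparison, below is the HTTP response from another website. Comparing the two shows that a well-formed HTTP response should begin with a status line such as HTTP/1.1 200 OK. With that in mind the error message makes sense: BadStatusLine simply means the status line is invalid.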
```
HTTP/1.1 200 OK
Date: Thu, 01 Aug 2024 08:58:32 GMT
Content-Type: text/plain; charset=utf-8
Content-Length: 2417
Connection: keep-alive
Set-Cookie: tgw_l7_route=a699e39024d0403234b82455ae41cef1; Expires=Thu, 01-Aug-2024 09:28:32 GMT; Path=/

<sg type="0" au="https://www.sogou.com/web?bddn=9027558936136805&brnd=6014570413749830&" eu="&lxoq=baidu">
<e>
  <l h="1" d="0" vr="0" r="0" s="1" t="1">
    <t t="0" u="query=" sw="m.baidu.com" r="0-11" tp="0">
      <w s=";1;;;;;;"><![CDATA[m.baidu.com]]></w>
    </t>
  </l>
</e>
<e>
  <l h="1" d="0" vr="0" r="0" s="1" t="1">
    <t t="0" u="query=" sw="baidu.com" r="0-9" tp="0">
      <w s=";1;;;;;;"><![CDATA[baidu.com]]></w>
    </t>
  </l>
</e>
<e>
  <l h="1" d="0" vr="0" r="0" s="1" t="1">
    <t t="0" u="query=" sw="%B0%D9%B6%C8" r="0-2" tp="0">
      <w s=";1;;;;;;"><![CDATA[....]]></w>
    </t>
  </l>
</e>
<e>
  <l h="1" d="0" vr="0" r="0" s="1" t="1">
    <t t="0" u="query=" sw="%B0%D9%B6%C8%D2%BB%CF%C2" r="0-4" tp="0">
      <w s=";1;;;;;;"><![CDATA[........]]></w>
    </t>
  </l>
</e>
<e>
  <l h="1" d="0" vr="0" r="0" s="1" t="1">
    <t t="0" u="query=" sw="baidu%B0%D9%B6%C8%CA%D7%D2%B3" r="0-9" tp="0">
      <w s=";1;;;;;;"><![CDATA[baidu........]]></w>
    </t>
  </l>
</e>
<e>
  <l h="1" d="0" vr="0" r="0" s="1" t="1">
    <t t="0" u="query=" sw="%B0%D9%B6%C8%B7%AD%D2%EB" r="0-4" tp="0">
      <w s=";1;;;;;;"><![CDATA[........]]></w>
    </t>
  </l>
</e>
<e>
  <l h="1" d="0" vr="0" r="0" s="1" t="1">
    <t t="0" u="query=" sw="188baidu" r="0-8" tp="0">
      <w s=";1;;;;;;"><![CDATA[188baidu]]></w>
    </t>
  </l>
</e>
<e>
  <l h="1" d="0" vr="0" r="0" s="1" t="1">
    <t t="0" u="query=" sw="%B0%D9%B6%C8%B5%D8%CD%BC" r="0-4" tp="0">
      <w s=";1;;;;;;"><![CDATA[........]]></w>
    </t>
  </l>
</e>
<e>
  <l h="1" d="0" vr="0" r="0" s="1" t="1">
    <t t="0" u="query=" sw="baidu%B0%D9%B6%C8%D2%BB%CF%C2%B9%D9%CD%F8" r="0-11" tp="0">
      <w s=";1;;;;;;"><![CDATA[baidu............]]></w>
    </t>
  </l>
</e>
<e>
  <l h="1" d="0" vr="0" r="0" s="1" t="1">
    <t t="0" u="query=" sw="%B0%D9%B6%C8%CE%C4%BF%E2" r="0-4" tp="0">
      <w s=";1;;;;;;"><![CDATA[........]]></w>
    </t>
  </l>
</e>
</sg>
```
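To make the comparison concrete, here is a minimal sketch of a status-line check. It is a simplified illustration, not the actual parsing code inside requests or http.client; the helper name `is_valid_status_line` and the regular expression are assumptions made for this example. It only checks the general shape "HTTP version, space, three-digit status code, space, reason phrase".

```python
import re

# Simplified shape of an HTTP/1.x status line (an assumption for illustration):
# HTTP-version SP three-digit-status-code SP reason-phrase
STATUS_LINE_RE = re.compile(r"^HTTP/\d\.\d \d{3} .*$")


def is_valid_status_line(line: str) -> bool:
    """Return True if `line` looks like a well-formed HTTP/1.x status line."""
    # Strip the trailing CRLF before matching
    return STATUS_LINE_RE.match(line.rstrip("\r\n")) is not None


if __name__ == "__main__":
    # Status line taken from the other website's response
    print(is_valid_status_line("HTTP/1.1 200 OK\r\n"))
    # Status line quoted in the BadStatusLine error message
    print(is_valid_status_line("HTTP/1.1 0 -\r\n"))
```

The line quoted in the error message fails this check because `0` is not a three-digit status code, which is consistent with the library refusing to parse the response.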
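Because the HTTP response is not standards-compliant, the higher-level HTTP libraries cannot parse it. The fix is to fall back to a more primitive approach: ignore whether the HTTP is well-formed, read the raw bytes directly from the TCP connection, and parse them yourself: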
```python
import socket

host = '192.168.31.188'
port = 80

# Open a plain TCP connection to the sensor
client_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client_socket.connect((host, port))

# Send a minimal HTTP GET request by hand
request = f"GET / HTTP/1.1\r\nHost: {host}\r\nConnection: close\r\n\r\n"
client_socket.sendall(request.encode())

# Keep reading until the device closes the connection
response = b""
buffer_size = 1024
while True:
    buffer = client_socket.recv(buffer_size)
    if not buffer:
        break
    response += buffer

client_socket.close()

# Print the raw response, skipping any bytes that are not valid UTF-8
print(response.decode('utf-8', errors='ignore'))
```
With this, the HTTP response content I wanted is retrieved correctly.
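Once the raw text is available, the readings still have to be pulled out of the HTML. The following is a minimal sketch of that parsing step, assuming the page keeps the layout shown in the Wireshark capture. The function name `parse_readings` and the regular expressions are illustrative assumptions, and `SAMPLE` is a fragment copied from the captured page.

```python
import re

# Fragment copied from the captured sensor page (spelling as sent by the device)
SAMPLE = "<p>Temprature: 26.7 ℃</p> <p>Humidity: 42.9 %</p>"

# Hypothetical patterns; the device spells the label "Temprature",
# so that exact spelling is matched here.
PATTERNS = {
    "temperature": r"Temprature:\s*(-?\d+(?:\.\d+)?)",
    "humidity": r"Humidity:\s*(-?\d+(?:\.\d+)?)",
}


def parse_readings(text: str) -> dict:
    """Extract temperature and humidity as floats from the sensor page text."""
    readings = {}
    for name, pattern in PATTERNS.items():
        match = re.search(pattern, text)
        if match:
            readings[name] = float(match.group(1))
    return readings


if __name__ == "__main__":
    print(parse_readings(SAMPLE))  # {'temperature': 26.7, 'humidity': 42.9}
```

In the socket script, the same function could be called on `response.decode('utf-8', errors='ignore')`. Fields missing from the text are simply left out of the returned dictionary.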