通过Python套接字/WebSocket客户端发送/接收WebSocket消息

8
我写了一个简单的WebSocket客户端。我使用了在SO上找到的代码,链接如下:如何在服务器端发送和接收WebSocket消息?
我使用的是Python 2.7,并且我的服务器是echo.websocket.org,使用TCP 80端口。基本上,我认为我在接收消息时存在问题。(或者发送也有问题?)
至少我确定握手阶段没有问题,因为我收到了良好的握手响应:
HTTP/1.1 101 Web Socket Protocol Handshake
Access-Control-Allow-Credentials: true
Access-Control-Allow-Headers: content-type
Access-Control-Allow-Headers: authorization
Access-Control-Allow-Headers: x-websocket-extensions
Access-Control-Allow-Headers: x-websocket-version
Access-Control-Allow-Headers: x-websocket-protocol
Access-Control-Allow-Origin: http://example.com
Connection: Upgrade
Date: Tue, 02 May 2017 21:54:31 GMT
Sec-WebSocket-Accept: s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
Server: Kaazing Gateway
Upgrade: websocket

我的代码:

#!/usr/bin/env python
import socket

def encode_text_msg_websocket(data):
    bytesFormatted = []
    bytesFormatted.append(129)

    bytesRaw = data.encode()
    bytesLength = len(bytesRaw)

    if bytesLength <= 125:
        bytesFormatted.append(bytesLength)
    elif 126 <= bytesLength <= 65535:
        bytesFormatted.append(126)
        bytesFormatted.append((bytesLength >> 8) & 255)
        bytesFormatted.append(bytesLength & 255)
    else:
        bytesFormatted.append(127)
        bytesFormatted.append((bytesLength >> 56) & 255)
        bytesFormatted.append((bytesLength >> 48) & 255)
        bytesFormatted.append((bytesLength >> 40) & 255)
        bytesFormatted.append((bytesLength >> 32) & 255)
        bytesFormatted.append((bytesLength >> 24) & 255)
        bytesFormatted.append((bytesLength >> 16) & 255)
        bytesFormatted.append((bytesLength >> 8) & 255)
        bytesFormatted.append(bytesLength & 255)

    bytesFormatted = bytes(bytesFormatted)
    bytesFormatted = bytesFormatted + bytesRaw
    return bytesFormatted


def dencode_text_msg_websocket(stringStreamIn):
    byteArray = [ord(character) for character in stringStreamIn]
    datalength = byteArray[1] & 127
    indexFirstMask = 2
    if datalength == 126:
        indexFirstMask = 4
    elif datalength == 127:
        indexFirstMask = 10
    masks = [m for m in byteArray[indexFirstMask: indexFirstMask + 4]]
    indexFirstDataByte = indexFirstMask + 4
    decodedChars = []
    i = indexFirstDataByte
    j = 0
    while i < len(byteArray):
        decodedChars.append(chr(byteArray[i] ^ masks[j % 4]))
        i += 1
        j += 1
    return ''.join(decodedChars)

# connect 
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.connect((socket.gethostbyname('echo.websocket.org'), 80))

# handshake
handshake = 'GET / HTTP/1.1\r\nHost: echo.websocket.org\r\nUpgrade: websocket\r\nConnection: Upgrade\r\nSec-WebSocket-Key: gfhjgfhjfj\r\nOrigin: http://example.com\r\nSec-WebSocket-Protocol: echo\r\n' \
        'Sec-WebSocket-Version: 13\r\n\r\n'
sock.send(handshake)
print sock.recv(1024)

# send test msg
msg = encode_text_msg_websocket('hello world!')
sock.sendall(msg)

# receive it back
response = dencode_text_msg_websocket(sock.recv(1024))
print '--%s--' % response

sock.close()

这里有什么问题?握手后变得复杂起来。 dencode_text_msg_websocket 方法返回一个空字符串,但它应该返回我发送到服务器的相同字符串,即 hello world!
我不想使用库(我知道如何使用它们)。问题是在没有使用库的情况下,只使用套接字实现相同的功能。
我只想向 echo.websocket.org 服务器 发送一条消息并接收响应。我不想修改标头,只需构建与此服务器使用的标头相同的标头。我使用 Wireshark 检查了它们的外观,并尝试使用 Python 构建相同的数据包。
对于下面的测试,我使用了我的浏览器:
未屏蔽的数据,从服务器到客户端: enter image description here 屏蔽的数据,从客户端到服务器: enter image description here

您的套接字层级很高,无法访问所有标题以进行重新配置。只能在准备好使用的套接字连接上选择TCP或UDP。 - dsgdfg
@dsgdfg:我想我没有理解你的意思。我只想向echo.websocket.org服务器发送一条消息,仅此而已。我不想修改头文件,只是构建与该服务器使用的相同头文件。我使用Wireshark检查了它们应该看起来像什么,并尝试使用Python构建相同的数据包。请查看我的编辑。 - yak
你的代码和你基于的代码在解码定义上有一个根本性的区别。你没有将输入的 byteArray = stringStreamIn 转换,而是仅依赖于转换单个字符来获取长度。原始代码将整个输入字符串转换为 byteArray = [ord(character) for character in stringStreamIn] - Rolf of Saxony
@RolfofSaxony:即使我按照你的建议更改了代码,最后一个print输出一个空字符串作为响应。 - yak
我怀疑是数据编码问题,看一下websocket-client中的Python代码。可以使用命令sudo pip install websocket-client安装该库。 - Rolf of Saxony
2个回答

3
我已经将您的代码修改,使其至少能发送回复并接收答案,通过改变编码方式使用chr()插入字节字符串而不是十进制数到头部。我没有改变解码方式,但这里的另一个答案有解决方案。
真正的核心细节在这里详细说明:https://www.rfc-editor.org/rfc/rfc6455.txt
其中详细说明了你需要做什么。
#!/usr/bin/env python
import socket
def encode_text_msg_websocket(data):
    bytesFormatted = []
    bytesFormatted.append(chr(129))
    bytesRaw = data.encode()
    bytesLength = len(bytesRaw)
    if bytesLength <= 125:
        bytesFormatted.append(chr(bytesLength))
    elif 126 <= bytesLength <= 65535:
        bytesFormatted.append(chr(126))
        bytesFormatted.append((chr(bytesLength >> 8)) & 255)
        bytesFormatted.append(chr(bytesLength) & 255)
    else:
        bytesFormatted.append(chr(127))
        bytesFormatted.append(chr((bytesLength >> 56)) & 255)
        bytesFormatted.append(chr((bytesLength >> 48)) & 255)
        bytesFormatted.append(chr((bytesLength >> 40)) & 255)
        bytesFormatted.append(chr((bytesLength >> 32)) & 255)
        bytesFormatted.append(chr((bytesLength >> 24)) & 255)
        bytesFormatted.append(chr((bytesLength >> 16)) & 255)
        bytesFormatted.append(chr((bytesLength >> 8)) & 255)
        bytesFormatted.append(chr(bytesLength) & 255)
    send_str = ""
    for i in bytesFormatted:
        send_str+=i
    send_str += bytesRaw
    return send_str

# connect 
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.settimeout(5.0)
try:
    sock.connect((socket.gethostbyname('ws.websocket.org'), 80))
except:
    print "Connection failed"
handshake = '\
GET /echo HTTP/1.1\r\n\
Host: echo.websocket.org\r\n\
Upgrade: websocket\r\n\
Connection: Upgrade\r\n\
Sec-WebSocket-Key: x3JJHMbDL1EzLkh9GBhXDw==\r\n\
Origin: http://example.com\r\n\
WebSocket-Protocol: echo\r\n\
Sec-WebSocket-Version: 13\r\n\r\n\
'
sock.send(bytes(handshake))
data = sock.recv(1024).decode('UTF-8')
print data

# send test msg
msg = encode_text_msg_websocket('Now is the winter of our discontent, made glorious Summer by this son of York')
print "Sent: ",repr(msg)
sock.sendall(bytes(msg))
# receive it back
response = sock.recv(1024)
#decode not sorted so ignore the first 2 bytes
print "\nReceived: ", response[2:].decode()
sock.close()

结果:

HTTP/1.1 101 Web Socket Protocol Handshake
Access-Control-Allow-Credentials: true
Access-Control-Allow-Headers: content-type
Access-Control-Allow-Headers: authorization
Access-Control-Allow-Headers: x-websocket-extensions
Access-Control-Allow-Headers: x-websocket-version
Access-Control-Allow-Headers: x-websocket-protocol
Access-Control-Allow-Origin: http://example.com
Connection: Upgrade
Date: Mon, 08 May 2017 15:08:33 GMT
Sec-WebSocket-Accept: HSmrc0sMlYUkAGmm5OPpG2HaGWk=
Server: Kaazing Gateway
Upgrade: websocket


Sent:  '\x81MNow is the winter of our discontent, made glorious Summer by this son of York'

Received:  Now is the winter of our discontent, made glorious Summer by this son of York

我应该在这里说明,如果没有引入一些额外的库(如@gushitong所做的那样),这将是一个很难编写的代码。


1

根据https://www.rfc-editor.org/rfc/rfc6455#section-5.1

客户端应该对其帧进行掩码处理。(而服务器的帧则不需要进行掩码处理。)

  • 客户端必须对其发送到服务器的所有帧进行掩码处理 (有关详细信息,请参见第5.3节)。(请注意,无论WebSocket协议是否运行在TLS上,都要进行掩码处理。) 服务器必须在接收到未经过掩码处理的帧时关闭连接。 在这种情况下,服务器可以发送一个状态码为1002(协议错误)的Close帧,如第7.4.1节所定义的。 服务器不得对其发送到客户端的任何帧进行掩码处理。 如果客户端检测到掩码帧,则必须关闭连接。

这是一个可工作的版本:

import os
import array
import six
import socket
import struct

OPCODE_TEXT = 0x1

try:
    # If wsaccel is available we use compiled routines to mask data.
    from wsaccel.xormask import XorMaskerSimple
    
    def _mask(_m, _d):
        return XorMaskerSimple(_m).process(_d)

except ImportError:
    # wsaccel is not available, we rely on python implementations.
    def _mask(_m, _d):
        for i in range(len(_d)):
            _d[i] ^= _m[i % 4]

        if six.PY3:
            return _d.tobytes()
        else:
            return _d.tostring()


def get_masked(data):
    mask_key = os.urandom(4)
    if data is None:
        data = ""

    bin_mask_key = mask_key
    if isinstance(mask_key, six.text_type):
        bin_mask_key = six.b(mask_key)

    if isinstance(data, six.text_type):
        data = six.b(data)

    _m = array.array("B", bin_mask_key)
    _d = array.array("B", data)
    s = _mask(_m, _d)

    if isinstance(mask_key, six.text_type):
        mask_key = mask_key.encode('utf-8')
    return mask_key + s


def ws_encode(data="", opcode=OPCODE_TEXT, mask=1):
    if opcode == OPCODE_TEXT and isinstance(data, six.text_type):
        data = data.encode('utf-8')

    length = len(data)
    fin, rsv1, rsv2, rsv3, opcode = 1, 0, 0, 0, opcode

    frame_header = chr(fin << 7 | rsv1 << 6 | rsv2 << 5 | rsv3 << 4 | opcode)

    if length < 0x7e:
        frame_header += chr(mask << 7 | length)
        frame_header = six.b(frame_header)
    elif length < 1 << 16:
        frame_header += chr(mask << 7 | 0x7e)
        frame_header = six.b(frame_header)
        frame_header += struct.pack("!H", length)
    else:
        frame_header += chr(mask << 7 | 0x7f)
        frame_header = six.b(frame_header)
        frame_header += struct.pack("!Q", length)

    if not mask:
        return frame_header + data
    return frame_header + get_masked(data)


def ws_decode(data):
    """
    ws frame decode.
    :param data:
    :return:
    """
    _data = [ord(character) for character in data]
    length = _data[1] & 127
    index = 2
    if length < 126:
        index = 2
    if length == 126:
        index = 4
    elif length == 127:
        index = 10
    return array.array('B', _data[index:]).tostring()


# connect
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.connect((socket.gethostbyname('echo.websocket.org'), 80))

# handshake
handshake = 'GET / HTTP/1.1\r\nHost: echo.websocket.org\r\nUpgrade: websocket\r\nConnection: ' \
            'Upgrade\r\nSec-WebSocket-Key: gfhjgfhjfj\r\nOrigin: http://example.com\r\nSec-WebSocket-Protocol: ' \
            'echo\r\n' \
            'Sec-WebSocket-Version: 13\r\n\r\n'

sock.send(handshake)
print(sock.recv(1024))

sock.sendall(ws_encode(data='Hello, China!', opcode=OPCODE_TEXT))

# receive it back
response = ws_decode(sock.recv(1024))
print('--%s--' % response)

sock.close()

谢谢。现在它完美运作了。但是,我不理解一些代码部分。你能帮忙吗?比如这个:frame_header = chr(fin << 7 | rsv1 << 6 | rsv2 << 5 | rsv3 << 4 | opcode) 和这个:chr(mask << 7 | length),它们是做什么的? - yak
@yak 请查看我提供的链接中的5.2基础框架协议。 - Rolf of Saxony
@RolfofSaxony:我知道它们应该长什么样子,问题是我不理解这段代码。 - yak

网页内容由stack overflow 提供, 点击上面的
可以查看英文原文,
原文链接