http server occasionally experiences high latency #3419

@shaxiaozz

1. Environment Information
Tornado Version: TornadoServer/6.1
Python Version: 3.7
Docker Version: 20.10.7

My Tornado web server runs in Kubernetes, and traffic is forwarded to it by Envoy. Clients use HTTP/1.1 by default, so the connections between Envoy and the Tornado Pod use HTTP keep-alive:
Envoy ---> Tornado Pod

The Tornado Pod has a CPU limit of 4. A sample of the Tornado startup code follows:

if __name__ == "__main__":
    # Build the application; the extra settings are applied only when conf.IS_BACKUP is set.
    if conf.IS_BACKUP:
        application = Application([
            (r"/(\w+)", TestHandlerBase)], **settings)
    else:
        application = Application([
            (r"/(\w+)", TestHandlerBase)])

    num_processes = conf.TORNADO_PROCESS_NUM
    _initialization(num_processes)

    # Bind the listening socket once, then start num_processes worker processes.
    server = httpserver.HTTPServer(application)
    server.bind('9880', '0.0.0.0', backlog=2048)
    server.start(num_processes)

    # Everything below runs in each worker process.
    skywalking_agent()
    print('start service...')
    ioloop.IOLoop.instance().start()

2. Problems encountered
When I stress-tested the Tornado interface, the response latency should have stayed within 100 ms. After about one minute of stress testing, however, it rose above 300 ms, which is unacceptable for our business.

We captured packets with tcpdump to inspect the traffic and found the following.
172.29.222.1 is the Envoy IP; 172.29.86.34 is the Tornado Pod IP.

Packet 20498: Envoy forwards the request to the Tornado Pod (time: 11:17:57.3055)
Packet 20499: the Tornado Pod responds with an ACK (time: 11:17:57.3056)
Packet 22235: the Tornado Pod responds with the HTTP data (time: 11:17:57.799)
The entire HTTP exchange takes 494 ms.
image
The request reached the Tornado Pod at 11:17:57.3056, but the Tornado access log shows that Tornado did not start processing it until 11:17:57.751. Processing finished at 11:17:57.799, when the HTTP response was sent. In other words, about 445 ms passed before the handler started, while the processing itself took only about 48 ms.
image
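
To make the timeline explicit, the arithmetic on the timestamps above can be sketched as follows. The variable and function names are only illustrative; the timestamps are the ones quoted from the capture and the access log.

```python
from datetime import datetime

FMT = "%H:%M:%S.%f"  # %f accepts 1 to 6 fractional digits


def ms_between(start: str, end: str) -> float:
    """Return the elapsed milliseconds between two HH:MM:SS.fraction timestamps."""
    delta = datetime.strptime(end, FMT) - datetime.strptime(start, FMT)
    return delta.total_seconds() * 1000


request_sent = "11:17:57.3055"     # packet 20498: Envoy -> Tornado Pod
request_acked = "11:17:57.3056"    # packet 20499: TCP ACK from the Pod
handler_started = "11:17:57.751"   # start of processing per the access log
response_sent = "11:17:57.799"     # packet 22235: HTTP response

# Time between the kernel acknowledging the request and the handler starting.
queue_ms = ms_between(request_acked, handler_started)
# Time spent inside the handler.
handle_ms = ms_between(handler_started, response_sent)
# End-to-end time as seen on the wire.
total_ms = ms_between(request_sent, response_sent)

print(f"waiting before handler: {queue_ms:.1f} ms")
print(f"handler processing:     {handle_ms:.1f} ms")
print(f"total on the wire:      {total_ms:.1f} ms")
```

Roughly nine tenths of the observed latency therefore falls between the TCP ACK and the start of processing, not inside the handler.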

We don't know where this time is being spent. Is there any tool that can show where the time goes during this stage?
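
One possible way to observe this gap, sketched under an unconfirmed assumption: if the delay comes from the event loop being busy with other work (for example, synchronous code in a handler), a small monitor coroutine can measure how late its own wake-ups are. Tornado 6 runs on the asyncio event loop, so the sketch uses only `asyncio`. `monitor_loop_lag`, `demo`, and the thresholds are hypothetical names and values, not part of Tornado.

```python
import asyncio
import time


async def monitor_loop_lag(interval, threshold, report, rounds=None):
    """Sleep `interval` seconds repeatedly; report wake-ups later than `threshold`.

    A late wake-up means the event loop was busy and could not run this
    coroutine (or any pending request) on time.
    """
    count = 0
    while rounds is None or count < rounds:
        start = time.monotonic()
        await asyncio.sleep(interval)
        lag = time.monotonic() - start - interval
        if lag > threshold:
            report(lag)
        count += 1


async def demo():
    """Simulate a blocking call and collect the lag the monitor observes."""
    lags = []
    task = asyncio.ensure_future(
        monitor_loop_lag(0.01, 0.05, lags.append, rounds=5)
    )
    await asyncio.sleep(0)  # let the monitor begin its first sleep
    time.sleep(0.2)         # stand-in for synchronous work that blocks the loop
    await task
    return lags


def run_demo():
    return asyncio.run(demo())


if __name__ == "__main__":
    for lag in run_demo():
        print(f"event loop was blocked for about {lag * 1000:.0f} ms")
```

In a Tornado application, a coroutine like this could be started in each worker process with `IOLoop.current().spawn_callback(monitor_loop_lag, 0.1, 0.05, print)` before the loop starts. Alternatively, asyncio's debug mode (for example `PYTHONASYNCIODEBUG=1`) logs callbacks that run longer than `loop.slow_callback_duration`. If neither shows any lag, the time is more likely spent elsewhere, such as in the socket accept queue or in CPU throttling of the Pod.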

I hope to get your reply. This problem has troubled us for a long time. Thank you~
