Python Async异步编程：以Tornado为例附生产者消费者模式

开发运维 2023-10-09 泡泡手机阅读

本文首发于我的个人博客(21年)：高性能Tornado处理逻辑实现

随着fastapi和其生态环境的飞速发展，文中的例子稍作修改（甚至不做修改）即可运用于使用uvicorn的fastapi框架中

前言

大多数人都知道Tornado是一个协程异步框架，但是大多数人都没有很好的理解协程编程的相关原理，网上也缺乏相关的教程，往往浅尝辄止。

这篇文章将试着从盘古开天说起，将一个hello world服务器变成一个海量吞吐服务器，适合协程编程入门的新手，对协程有兴趣，但是对协程原理一知半解的同学阅读；也适合使用Django等线程模型服务器的开发同学了解Tornado是如何同时获得协程和多进程优势的。

TL;DR;你可以直接跳到最后面的生产者消费者模型阅读代码，省去前面的简单内容。

当然，Tornado多进程模式需要依赖fork函数，在windows上是行不通的，但这并不意味着本篇文章的代码都无法运行，相反，你只需要注释掉 http_server.start(0)，就可以运行本篇文章的所有代码。在最终版本中，本文实现了一个全异步的服务，即使你无法启动多进程的Tornado，相信我，这也不会成为你的性能瓶颈！

完成这篇文章主要靠笔者的阅读经验和实际项目经验，《流畅的Python》一书对如何改造线程模型为协程模型有详细的介绍，如果想要深入学习Python，建议阅读此书。本文借用其原则：从某个函数进行改造时，首先将其定义为 async的，其次将其中的耗时操作利用 run_in_executor封装，最后层层改进其调用函数，使用 await调用，在这里听不懂没关系，后面会有实际讲解

从Hello World!开始

首先我们从Hello world开始，稍稍修改了官方给出的例子，得到了一个接受不是很规范的get实现，解析请求的 json body对象，从中读取 job_id并输出的服务

import logging

import tornado.httpserver
import tornado.ioloop
import tornado.options
import tornado.web
import json

from tornado.options import define, options

define("port", default=8888, help="run on the given port", type=int)


def get_logger():
    return logging.getLogger("tornado.general")


class MainHandler(tornado.web.RequestHandler):
    logger = get_logger()

    def get(self):
        job_id = json.loads(self.request.body.decode()).get('job_id')
        self.do_something(job_id)
        self.write(f"{job_id} done")

    def do_something(self, job_id):
        self.logger.info(f'do job:{job_id}')


def main():
    tornado.options.parse_command_line()
    application = tornado.web.Application([(r"/", MainHandler)])
    http_server = tornado.httpserver.HTTPServer(application)
    http_server.listen(options.port)
    # 多进程
    # http_server.start(0)
    tornado.ioloop.IOLoop.current().start()


if __name__ == "__main__":
    main()

启动后，他会监听你的8888端口

写个简单的请求脚本吧

下面我们简单的写个请求脚本验证一下

import requests
import json

from requests import Timeout

def api_request(job_id):
    try:
        response = requests.get('http://localhost:8888', data=json.dumps({'job_id': job_id}), timeout=3) # 特意设置了3秒超时
    except Timeout:
        return False
    return response.status_code == 200
if __name__ == '__main__':
    api_request(1)

运行这个脚本，你就向你的服务器发送了一个get请求啦

之后我们会针对这个脚本进行扩展，以达到并发测试的目的~

如果任务时间比较长怎么办？

实际的开发中，不可能简单的打个日志就结束，万一是一个需要一些时间（比如1s？0.5s？）才能完成的任务，那会发生什么呢？

... 
#省略了一些你已经知道的依赖引入
import time

class MainHandler(tornado.web.RequestHandler):
    logger = get_logger()

    def get(self):
        job_id = json.loads(self.request.body.decode()).get('job_id')
        self.do_something(job_id)
        self.write(f"{job_id} done")

    def do_something(self, job_id):
        # 模拟一个任务需要1秒的时间完成
        time.sleep(1) 
        self.logger.info(f'job done:{job_id}')
... 
#省略了服务器启动的代码

在以上的代码中，time.sleep(1)将阻塞服务器，这并不意味着无法建立连接，但是会导致已经建立的连接无法收到消息，形成 ReadTimeout

模拟并发测试

让我们把之前的请求脚本改一改，变为一个并发测试的脚本

import requests
import json
from multiprocessing import Pool

from requests import Timeout

# 总请求数
REQUEST_NUM = 10
# 进程数
PROCESSOR_NUM = 10


def api_request(job_id):
    try:
        response = requests.get('http://localhost:8888', data=json.dumps({'job_id': job_id}), timeout=3)
    except Timeout:
        return False
    return response.status_code == 200


if __name__ == '__main__':
    with Pool(PROCESSOR_NUM) as p:
        result = p.map(api_request, range(REQUEST_NUM))
    succeed = result.count(True)
    failed = result.count(False)
    print(f"{succeed / (failed + succeed) * 100}% request success!")

运行脚本，你会得到以下输出：

20.0% request success!

而在服务器端的日志，你会看到实际上是有10个请求的

[I 211217 23:45:52 hello_world:29] job done:0
[I 211217 23:45:52 web:2239] 200 GET / (::1) 1012.00ms
[I 211217 23:45:53 hello_world:29] job done:1
[I 211217 23:45:53 web:2239] 200 GET / (::1) 2025.00ms
[I 211217 23:45:54 hello_world:29] job done:2
[I 211217 23:45:54 web:2239] 200 GET / (::1) 3032.00ms
[I 211217 23:45:55 hello_world:29] job done:3
[I 211217 23:45:55 web:2239] 200 GET / (::1) 1009.00ms
[I 211217 23:45:56 hello_world:29] job done:4
[I 211217 23:45:56 web:2239] 200 GET / (::1) 2020.00ms
[I 211217 23:45:57 hello_world:29] job done:5
[I 211217 23:45:57 web:2239] 200 GET / (::1) 3029.00ms
[I 211217 23:45:58 hello_world:29] job done:6
[I 211217 23:45:58 web:2239] 200 GET / (::1) 4044.00ms
[I 211217 23:45:59 hello_world:29] job done:8
[I 211217 23:45:59 web:2239] 200 GET / (::1) 5056.00ms
[I 211217 23:46:00 hello_world:29] job done:7
[I 211217 23:46:00 web:2239] 200 GET / (::1) 6068.00ms
[I 211217 23:46:01 hello_world:29] job done:9
[I 211217 23:46:01 web:2239] 200 GET / (::1) 7076.00ms

这说明，这十个连接被Tornado”接住“了，建立了连接，但是client端设置了超时时间，超时后client端断开了连接，而从根据3秒超时，1秒处理时间来看，只有前两个请求有可能完成，第三个请求大概率超时，第三个之后的请求根本不用想，必定超时

使用多进程接收请求

接触过django、flask这类线程模型的web框架的你可能会想到使用多线程或者多进程来处理，Tornado作为协程框架，提供有多进程的接口，只需要打开 http_server.start(0)注释，你就会得到多进程的Tornado服务

def main():
    tornado.options.parse_command_line()
    application = tornado.web.Application([(r"/", MainHandler)])
    http_server = tornado.httpserver.HTTPServer(application)
    http_server.listen(options.port)
    # 多进程，根据你的CPU核数决定
    http_server.start(0)
    tornado.ioloop.IOLoop.current().start()

然后我们再进行一次请求，得到的服务器日志如下：

[I 211218 00:03:09 process:123] Starting 8 processes
[I 211218 00:03:12 hello:29] job done:0
[I 211218 00:03:12 web:2239] 200 GET / (127.0.0.1) 1002.04ms
[I 211218 00:03:12 hello:29] job done:5
[I 211218 00:03:12 web:2239] 200 GET / (127.0.0.1) 1002.93ms
[I 211218 00:03:12 hello:29] job done:8
[I 211218 00:03:12 hello:29] job done:4
[I 211218 00:03:12 web:2239] 200 GET / (127.0.0.1) 1002.39ms
[I 211218 00:03:12 web:2239] 200 GET / (127.0.0.1) 1002.52ms
[I 211218 00:03:12 hello:29] job done:9
[I 211218 00:03:12 hello:29] job done:7
[I 211218 00:03:12 web:2239] 200 GET / (127.0.0.1) 1002.65ms
[I 211218 00:03:12 web:2239] 200 GET / (127.0.0.1) 1002.69ms
[I 211218 00:03:12 hello:29] job done:6
[I 211218 00:03:12 web:2239] 200 GET / (127.0.0.1) 1002.06ms
[I 211218 00:03:13 hello:29] job done:2
[I 211218 00:03:13 web:2239] 200 GET / (127.0.0.1) 2002.96ms
[I 211218 00:03:14 hello:29] job done:1
[I 211218 00:03:14 web:2239] 200 GET / (127.0.0.1) 3005.87ms
[I 211218 00:03:15 hello:29] job done:3
[I 211218 00:03:15 web:2239] 200 GET / (127.0.0.1) 4007.62ms

这次所有请求都成功了！（如果有失败的，你当然可以多试几次）

100.0% request success!

当然，这种简单粗暴的方式有其缺点：

资源消耗大，每个连接都需要一个进程保持

stupid，就像知乎上的入门教程一样，没听说过什么是协程

多进程需要考虑竞争，加锁，可能1核有难7核围观

像个聪明人：使用协程

实际上，我们只要使用Tornado的异步特性，不需要多进程，就可以搞定这个问题

我把解决问题的步骤都标注在注释里，希望你能理解自底向上异步改造的流程

# 0.异步实现的库
import asyncio 
class MainHandler(tornado.web.RequestHandler):
    logger = get_logger()

    # 4.把这个函数也变成异步的，然后继续向上变更，上级Tornado Handler支持异步的get请求，修改到此为止
    async def get(self): 
        job_id = json.loads(self.request.body.decode()).get('job_id')
        # 3.把do_something的调用变成异步调用
        await  self.do_something(job_id) 
        self.write(f"{job_id} done")

    # 2.把这个函数编程异步的
    async def do_something(self, job_id): 
        # 1.使用异步实现的库替换耗时操作
        await asyncio.sleep(1)  
        self.logger.info(f'job done:{job_id}')

现在让我们注释掉多进程模式

# 关掉多进程，像个男子汉
# http_server.start(0)

然后测试一下刚才的脚本

[I 211218 00:08:17 hello:29] job done:2
[I 211218 00:08:17 web:2239] 200 GET / (127.0.0.1) 1003.32ms
[I 211218 00:08:17 hello:29] job done:0
[I 211218 00:08:17 web:2239] 200 GET / (127.0.0.1) 1002.62ms
[I 211218 00:08:17 hello:29] job done:3
[I 211218 00:08:17 web:2239] 200 GET / (127.0.0.1) 1002.47ms
[I 211218 00:08:17 hello:29] job done:1
[I 211218 00:08:17 web:2239] 200 GET / (127.0.0.1) 1002.51ms
[I 211218 00:08:17 hello:29] job done:4
[I 211218 00:08:17 web:2239] 200 GET / (127.0.0.1) 1002.49ms
[I 211218 00:08:17 hello:29] job done:7
[I 211218 00:08:17 web:2239] 200 GET / (127.0.0.1) 1002.37ms
[I 211218 00:08:17 hello:29] job done:8
[I 211218 00:08:17 web:2239] 200 GET / (127.0.0.1) 1002.94ms
[I 211218 00:08:17 hello:29] job done:6
[I 211218 00:08:17 web:2239] 200 GET / (127.0.0.1) 1002.41ms
[I 211218 00:08:17 hello:29] job done:9
[I 211218 00:08:17 web:2239] 200 GET / (127.0.0.1) 1003.31ms
[I 211218 00:08:17 hello:29] job done:5
[I 211218 00:08:17 web:2239] 200 GET / (127.0.0.1) 1003.20ms

芜湖，所有连接都在1秒左右搞定了！

协程的原理

对于初学者而言，你首先需要了解用户级线程和内核级线程的区别。协程实际上是一个单进程单线程模型，对于内核而言，它是1而非N，协程程序自己控制各个协程之间的运行顺序，这就是用户级线程。不谈内核是如何调度线程的，对于协程而言，每个 await都代表着让出程序控制（让出CPU），并将结果加入到等待队列，协程调度器将从等待队列中找到一个已经完成的任务，恢复其上下文环境，让这个任务能够继续执行下去。在本例中，1秒之后，asyncio.sleep(1)的任务完成了，这时如果有好心人能够让出CPU（调用 await），那么原来暂停的程序就有可能被选中，得以继续完成。

协程就是这样，在单线程中循环搜索那些已经完成的任务并加以推进，同时等待、管理那些未完成的任务

这样一说，希望你能理解 IOLoop中 Loop这四个字母的含义

协程的问题

你也看到了，协程最重要的是等待任务完成，但没有告诉我们任务如何完成

如果任务是一个网络请求，那么等待他完成是一件挺不错的事，但如果任务是打印一行日志，那么等待他完成就显得有点蠢

其实对于程序员来说，最重要的事有库可以异步地做事

否则，你就得参考下一章，使用executor封装了

使用executor封装协程

如何在不耗费CPU的情况下做一件耗费CPU的事？这本身就是一个悖论。

因此，对于一些需要计算，或者没有异步实现的任务来说，想要像 asyncio.sleep()一样轻松异步执行是做不到的，这就需要我们借助线程或进程的力量（当然，线程安全就是避不开的话题）。

首先，让我们假装忘记 sleep的异步实现，换回 time.sleep()然后你就会发现，async并不能让你获得异步的能力，而是像普通函数一样卡死在这里

import time
class MainHandler(tornado.web.RequestHandler):
    async def do_something(self, job_id):
        # 哦，我们在异步函数里写了一个长阻塞，这太糟了
        time.sleep(1)
        self.logger.info(f'job done:{job_id}')

有两种方案可以搞定，一种是Tornado提供的装饰器，有点偷懒但是好用，run_on_executor装饰器将自动地把同步函数（do_something）放进 self.executor执行，并把它封装成一个 async函数（其实称为 awaitable对象比较好）

from tornado.concurrent import run_on_executor
from concurrent.futures import ThreadPoolExecutor


class MainHandler(tornado.web.RequestHandler):
    logger = get_logger()
    executor = ThreadPoolExecutor(20)

    async def get(self):
        job_id = json.loads(self.request.body.decode()).get('job_id')
        await self.do_something(job_id)
        self.write(f"{job_id} done")

    # 没关系，我们把它放在executor里执行就好了
    # 注意：这里改成了同步函数
    @run_on_executor 
    def do_something(self, job_id): 
        # 哦，这会阻塞服务器！
        time.sleep(1) 
        self.logger.info(f'job done:{job_id}')

另一种是常见的异步写法，是标准 IOLoop支持的

from tornado import ioloop
from concurrent.futures import ThreadPoolExecutor
class MainHandler(tornado.web.RequestHandler):
    logger = get_logger()
    executor = ThreadPoolExecutor(20)

    async def get(self):
        job_id = json.loads(self.request.body.decode()).get('job_id')
        # 没关系，我们把它放在executor里执行就好了
        await ioloop.IOLoop.current().run_in_executor(self.executor, self.do_something, job_id)
        self.write(f"{job_id} done")

     # 注意：这里改成了同步函数
    def do_something(self, job_id):
        # 哦，这会阻塞服务器！
        time.sleep(1)
        self.logger.info(f'job done:{job_id}')

实践中，由于GIL锁限制，线程并不能发挥机器地全部实力，在CPU密集时推荐将 ThreadPoolExecutor改为 ProcessPoolExecutor，但是由于 pickle不能封装自定义类发送给子进程执行，所以需要把CPU密集型操作单独写成一个函数，这里用第二种方式做示范，因为第二种方式更通用，也更好写

from concurrent.futures import ProcessPoolExecutor

def real_work():
    time.sleep(1)


class MainHandler(tornado.web.RequestHandler):
    logger = get_logger()
    executor = ProcessPoolExecutor(20)

    async def get(self):
        job_id = json.loads(self.request.body.decode()).get('job_id')
        await self.do_something(job_id)
        self.write(f"{job_id} done")

    async def do_something(self, job_id):
        #根据最小原则封装
        await ioloop.IOLoop.current().run_in_executor(self.executor, real_work)
        self.logger.info(f'job done:{job_id}')

好了，经过以上操作，我们已经明白了如何封装同步为异步，化腐朽为神奇，有一点千万记住，在协程中，任何阻塞都有可能是致命的！任何executor封装的操作都需要是线程安全的！

以及，仔细分析压力点，是流量顶不住还是计算太慢，如果是前者，就采用Tornado多进程模式，如果是后者，就使用 executor承压

最后，executor的承载数量是有限的，你可以尝试调大测试脚本并发数量，看是否还能保持之前的成功率

# 总请求数
REQUEST_NUM=3000
# 100并发
PROCESSOR_NUM=100

空间换时间：生产者消费者模型

在上一章中，如果你确实调大了并发量和请求数，你就会发现，在服务器可用线程被耗尽的情况下（当然你可以设置几百上千个），你的连接仍然会失败。其实在任何web框架中都是一样的，资源耗尽就只有死路一条。对于这类任务，程序员应该提前预判到，并将其转换为异步任务，利用生产者消费者模型对请求进行处理。

接下来我将展示一个异步模式下的消费者模型，利用 IOLoop.add_callback()函数，将消费者的消费函数注册为任务，同时依靠 ThreadPoolExecutor执行阻塞操作

async def async_period_job(func, period):
    while True:
        try:
            await func()
        except Exception as e:
            get_logger().exception(e)
        await asyncio.sleep(period)


# 定时任务的常用写法
start_period_job = functools.partial(ioloop.IOLoop.current().add_callback, async_period_job)


def sleep(t, job_id):
    # 耗时操作 兼容ProcessPoolExecutor
    # 如果使用自定义的ORM类，使用ThreadPoolExecutor就足够了，不需要大费周章将其写成外部函数
    time.sleep(t)
    return job_id


class SimpleConsumer:
    def __init__(self):
        self.buffer = Queue()
        self.futures = Queue()
        self.started = False
        self.log = get_logger()
        # 2秒收集一次结果
        self.collect_period = 2
        # 这里控制消费者数量
        self.executor = PoolExecutor(20)

    async def put(self, job_id):
        await self.buffer.put(job_id)

    async def real_work(self):
        # 由于pickle的原因，不能放在executor里
        data = await self.buffer.get()
        self.log.info(f'scheduling {data}')
        future = self.executor.submit(sleep, 1, data)
        return future, data

    def start(self):
        if self.started:
            return
        self.started = True
        self.log.info('Consumer Started!')
        ioloop.IOLoop.current().add_callback(self._start_real_work)
        start_period_job(self._collect, self.collect_period)

    async def _collect(self):
        # 定时收集结果
        not_done = Queue()
        while not self.futures.empty():
            future, data = await self.futures.get()
            if future.done():
                try:
                    result = future.result()
                    await self.collect_result(result)
                except Exception as e:
                    self.log.exception(e)
                    await self.buffer.put(data)  # retry
            else:
                await not_done.put((future, data))
        while not not_done.empty():
            await self.futures.put(await not_done.get())

    async def _start_real_work(self):
        while True:
            future, data = await self.real_work()
            await self.futures.put((future, data))

    async def collect_result(self, result):
        self.log.info(f'Collected job: {result}')


# 最简单的单例
consumer = SimpleConsumer()  
consumer.start()


class MainHandler(tornado.web.RequestHandler):
    logger = get_logger()
    consumer = consumer

    async def get(self):
        job_id = json.loads(self.request.body.decode()).get('job_id')
        await self.do_something(job_id)
        self.write(f"{job_id} add")

    async def do_something(self, job_id):
        await self.consumer.put(job_id)
        self.logger.info(f'job add:{job_id}')

使用并发脚本测试，将得到类似以下日志：

[I 211218 02:14:10 hello:112] job add:2
[I 211218 02:14:10 web:2239] 200 GET / (127.0.0.1) 1.54ms
[I 211218 02:14:10 hello:112] job add:1
[I 211218 02:14:10 web:2239] 200 GET / (127.0.0.1) 1.53ms
[I 211218 02:14:10 hello:112] job add:3
[I 211218 02:14:10 web:2239] 200 GET / (127.0.0.1) 1.58ms
[I 211218 02:14:10 hello:112] job add:5
[I 211218 02:14:10 web:2239] 200 GET / (127.0.0.1) 1.67ms
[I 211218 02:14:10 hello:112] job add:4
[I 211218 02:14:10 web:2239] 200 GET / (127.0.0.1) 1.73ms
[I 211218 02:14:10 hello:112] job add:0
[I 211218 02:14:10 web:2239] 200 GET / (127.0.0.1) 1.76ms
[I 211218 02:14:10 hello:59] scheduling 2
[I 211218 02:14:10 hello:59] scheduling 1
[I 211218 02:14:10 hello:59] scheduling 3
[I 211218 02:14:10 hello:59] scheduling 5
[I 211218 02:14:10 hello:59] scheduling 4
[I 211218 02:14:10 hello:59] scheduling 0
[I 211218 02:14:10 hello:112] job add:8
[I 211218 02:14:10 web:2239] 200 GET / (127.0.0.1) 0.70ms
[I 211218 02:14:10 hello:112] job add:7
[I 211218 02:14:10 web:2239] 200 GET / (127.0.0.1) 0.69ms
[I 211218 02:14:10 hello:112] job add:9
[I 211218 02:14:10 web:2239] 200 GET / (127.0.0.1) 0.74ms
[I 211218 02:14:10 hello:59] scheduling 8
[I 211218 02:14:10 hello:59] scheduling 7
[I 211218 02:14:10 hello:59] scheduling 9
[I 211218 02:14:10 hello:112] job add:6
[I 211218 02:14:10 web:2239] 200 GET / (127.0.0.1) 1.15ms
[I 211218 02:14:10 hello:59] scheduling 6
[I 211218 02:14:13 hello:94] Collected job: 2
[I 211218 02:14:13 hello:94] Collected job: 1
[I 211218 02:14:13 hello:94] Collected job: 3
[I 211218 02:14:13 hello:94] Collected job: 5
[I 211218 02:14:13 hello:94] Collected job: 4
[I 211218 02:14:13 hello:94] Collected job: 0
[I 211218 02:14:13 hello:94] Collected job: 8
[I 211218 02:14:13 hello:94] Collected job: 7
[I 211218 02:14:13 hello:94] Collected job: 9
[I 211218 02:14:13 hello:94] Collected job: 6

我们会看到，任务在一边被加入队列，一边进行，就如我前面所说，协程会在 await的时候释放CPU并切换到准备好的协程继续执行，这里体现为忙时一直在接收请求，闲时对buffer里的内容进行处理。

通过这种方式，我们可以通过Tornado的多进程模式轻松拓展生产者，通过空间换时间，保证请求不会失败，将任务轻松转换为后台任务，通过控制 PoolExecutor的 worker数量，控制消费者数量，达到性能平衡

和刚才用 ProcessPoolExecutor一样，在CPU密集的情况下，多进程消费者显然更具优势

from concurrent.futures import ProcessPoolExecutor as PoolExecutor

现在请你试着加大总请求数，看看效果，你会发现在加入了队列之后，即使是单线程也可以瞬间搞定所有请求，这是生产者消费者模型给我们带来的便利。

REQUEST_NUM = 1000

总结

本文介绍了如何利用Tornado的异步特性，打造高性能Tornado服务器，有几点需要课后复习

自底向上的异步改造

使用 executor封装异步操作

生产者消费者模型的异步实现

还有几点需要额外注意：

使用 executor时的线程安全问题

ProcessPoolExecutor的 pickle问题

executor是有并行数量限制的

附录

生产者消费者完整代码

import asyncio
import functools
import logging
import time
from asyncio import Queue
from concurrent.futures import ThreadPoolExecutor as PoolExecutor

import tornado.httpserver
import tornado.ioloop
import tornado.options
import tornado.web
import json

from tornado import ioloop
from tornado.concurrent import run_on_executor
from tornado.options import define, options

define("port", default=8888, help="run on the given port", type=int)


def get_logger():
    return logging.getLogger('tornado.general')


async def async_period_job(func, period):
    while True:
        try:
            await func()
        except Exception as e:
            get_logger().exception(e)
        await asyncio.sleep(period)


# 定时任务的常用写法
start_period_job = functools.partial(ioloop.IOLoop.current().add_callback, async_period_job)


def sleep(t, job_id):
    # 耗时操作 兼容ProcessPoolExecutor
    # 如果使用自定义的ORM类，使用ThreadPoolExecutor就足够了，不需要大费周章将其写成外部函数
    time.sleep(t)
    return job_id


class SimpleConsumer:
    def __init__(self):
        self.buffer = Queue()
        self.futures = Queue()
        self.started = False
        self.log = get_logger()
        self.collect_period = 2
        self.executor = PoolExecutor(20)

    async def put(self, job_id):
        await self.buffer.put(job_id)

    async def real_work(self):
        data = await self.buffer.get()
        self.log.info(f'scheduling {data}')
        future = self.executor.submit(sleep, 1, data)
        return future, data

    def start(self):
        if self.started:
            return
        self.started = True
        self.log.info('Consumer Started!')
        ioloop.IOLoop.current().add_callback(self._start_real_work)
        start_period_job(self._collect, self.collect_period)

    async def _collect(self):
        # 收集结果，应当在定时任务里面做
        not_done = Queue()
        while not self.futures.empty():
            future, data = await self.futures.get()
            if future.done():
                try:
                    result = future.result()
                    await self.collect_result(result)
                except Exception as e:
                    self.log.exception(e)
                    await self.buffer.put(data)  # retry
            else:
                await not_done.put((future, data))
        while not not_done.empty():
            await self.futures.put(await not_done.get())

    async def _start_real_work(self):
        while True:
            future, data = await self.real_work()
            await self.futures.put((future, data))

    async def collect_result(self, result):
        self.log.info(f'Collected job: {result}')


consumer = SimpleConsumer()
consumer.start()


class MainHandler(tornado.web.RequestHandler):
    logger = get_logger()
    consumer = consumer

    async def get(self):
        job_id = json.loads(self.request.body.decode()).get('job_id')
        await self.do_something(job_id)
        self.write(f"{job_id} add")

    async def do_something(self, job_id):
        await self.consumer.put(job_id)
        self.logger.info(f'job add:{job_id}')


def main():
    tornado.options.parse_command_line()
    application = tornado.web.Application([(r"/", MainHandler)])
    http_server = tornado.httpserver.HTTPServer(application)
    http_server.listen(options.port)
    # http_server.start(0)
    tornado.ioloop.IOLoop.current().start()


if __name__ == "__main__":
    main()

测试脚本代码

import requests
import json
from multiprocessing import Pool

from requests import Timeout

REQUEST_NUM = 1000
PROCESSOR_NUM = 10


def api_request(job_id):
    try:
        response = requests.get('http://localhost:8888', data=json.dumps({'job_id': job_id}), timeout=3)
    except Timeout:
        return False
    return response.status_code == 200


if __name__ == '__main__':
    with Pool(PROCESSOR_NUM) as p:
        result = p.map(api_request, range(REQUEST_NUM))
    succeed = result.count(True)
    failed = result.count(False)
    print(f"{succeed / (failed + succeed) * 100}% request success!")