An often encountered problem when writing any server application is running background tasks. There are many solutions: on the heavier end there are distributed task schedulers like Celery, and there are lighter-weight alternatives like dramatiq. In this post I will describe some ways that uWSGI itself can be used to solve these problems. Assuming you are already using uWSGI to serve your application, using it for task scheduling as well can reduce the number of dependencies and the overall complexity of your software.
The argument I make is that if you're already making use of uWSGI, or can switch to it, you get an advanced and powerful piece of software from that one dependency. So why not make full use of it, rather than install and operate additional software if you can avoid it?
So how does uWSGI deal with task queues and scheduling?
The Spooler is a queue manager built into uWSGI that works like a printing/mail system.
You can enqueue massive sending of emails, image processing, video encoding, etc. and let the spooler do the hard work in the background while your users get their requests served by normal workers.
Another important task for today’s rich/advanced web apps is to respond to different events. An event could be a file modification, a new cluster node popping up, another one (sadly) dying, a timer having elapsed... whatever you can imagine.
I have put together a minimal example of how this stuff works. The rest of this post will walk through that example and highlight some of the features. The code for the example can be found here.
```python
def tick_handler(signum):
    print("tick..")

uwsgi.register_signal(99, "worker", tick_handler)
uwsgi.add_timer(99, 1)
```
First, we create a function called `tick_handler` that accepts `signum` as an argument. Then we call `uwsgi.register_signal` to register signal 99 with `tick_handler`. This means that any time signal 99 is raised, the `tick_handler` function will be executed (see `uwsgi.signal`). Then we call the `uwsgi.add_timer` API, which will raise signal 99 every second.
This illustrates a way to get your server to do work by signalling a particular event, or by scheduling work to happen on an interval. uWSGI also provides a cron-like API for scheduling work this way (see `uwsgi.add_cron`).
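As a sketch of that cron-like API (the signal number 98 and the cleanup task here are my own invention, not from the example app): `uwsgi.add_cron` takes a signal number followed by minute, hour, day, month and weekday fields, where -1 acts as a wildcard.

```python
try:
    import uwsgi
except ImportError:
    uwsgi = None  # not running under uWSGI; treat this as a sketch

def nightly_cleanup(signum):
    print("running nightly cleanup")

# (signum, minute, hour, day, month, weekday); -1 means "any"
CRON_SPEC = (98, 30, 3, -1, -1, -1)  # every day at 03:30

if uwsgi is not None:
    uwsgi.register_signal(CRON_SPEC[0], "worker", nightly_cleanup)
    uwsgi.add_cron(*CRON_SPEC)
```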
```python
def spool_handler(env):
    if env[b"task"] == b"send_email":
        print("sending email to: ", env[b"email"])
    return uwsgi.SPOOL_OK

uwsgi.spooler = spool_handler
```
This code sets up a handler for our spooler. Any time a task gets added to the spooler, this function will be executed. The advantage of using a spooler is that the tasks are serialized and stored on the filesystem. This is helpful for situations where you have tasks that need to happen but don't need to happen right away. You also gain the resiliency of having the tasks stored on the filesystem. If you have a large number of tasks, you will want to make use of the spooler rather than the signal framework.
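The handler's return value tells the spooler what to do with the task. Besides `uwsgi.SPOOL_OK`, a handler can return `uwsgi.SPOOL_RETRY` to have the task re-run on a later spooler pass, or `uwsgi.SPOOL_IGNORE` to skip it. A sketch (the `send_email` helper is hypothetical, and the fallback constant values exist only so the sketch runs outside uWSGI):

```python
try:
    import uwsgi
    SPOOL_OK, SPOOL_RETRY = uwsgi.SPOOL_OK, uwsgi.SPOOL_RETRY
except ImportError:
    # assumed fallback values for running the sketch outside uWSGI
    SPOOL_OK, SPOOL_RETRY = -2, -1

def send_email(address):
    # hypothetical helper; a real one would talk to an SMTP server
    print("sending email to:", address)

def spool_handler(env):
    if env[b"task"] == b"send_email":
        try:
            send_email(env[b"email"].decode())
        except ConnectionError:
            # transient failure: keep the spool file and try again later
            return SPOOL_RETRY
    # done (or unknown task): remove the spool file
    return SPOOL_OK
```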
```python
uwsgi.send_to_spooler(
    {
        b"task": b"send_email",
        b"email": b"wgwz@pm.me",
        b"at": str(time.time() + 10).encode("utf-8"),
    }
)
```
The snippet above shows how to enqueue work to the spooler. You call the `uwsgi.send_to_spooler` API with a dictionary whose keys and values are all bytes. In this case I also made use of the `at` parameter, which signifies when a particular task should be run, based on UNIX time.
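Since everything in the task dict has to be bytes, a small helper (my own, not part of uWSGI) can do the encoding and compute the `at` timestamp from a relative delay:

```python
import time

def to_spool_args(task, delay=0, **fields):
    """Encode a plain dict into the bytes-only mapping the spooler expects."""
    args = {b"task": task.encode()}
    if delay:
        # "at" is an absolute UNIX timestamp, not a relative delay
        args[b"at"] = str(int(time.time() + delay)).encode()
    for key, value in fields.items():
        args[key.encode()] = str(value).encode()
    return args

# usage:
# uwsgi.send_to_spooler(to_spool_args("send_email", delay=10, email="wgwz@pm.me"))
```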
There are some important configuration values to be aware of when making use of these features. They are thoroughly documented in the uWSGI documentation, but I will point out some important ones.
```ini
[uwsgi]
http = :9090
wsgi-file = app.py
processes = 4
threads = 2
stats = :9191
stats-http
spooler = spool1
spooler-processes = 4
spooler-frequency = 1
mule
```
The `processes` config value is synonymous with `workers`. These processes handle web requests, and in the signal example I shared earlier, these are the processes that will do the work for `tick_handler`. This is important to keep in mind as you are scaling your application. uWSGI also provides a concept of mule processes, dedicated to doing background task work.
```python
uwsgi.register_signal(99, "mule", tick_handler)
```
The above line shows how we could have routed the work for `tick_handler` to a mule instead of a worker. Note that your configuration needs to specify mules (see the `mule` line in the config above; this just means to start one mule process). Mules can also enable some interesting patterns for doing server work (see "Giving a brain to mules" in the uWSGI docs).
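Beyond signals, workers can also pass raw byte messages to mules with `uwsgi.mule_msg`, which a mule program receives with `uwsgi.mule_get_msg`. A sketch, using a `task:payload` message convention of my own:

```python
def parse_mule_msg(msg):
    # my own convention for this sketch: b"task:payload"
    task, _, payload = msg.partition(b":")
    return task.decode(), payload

# In a worker:
#     uwsgi.mule_msg(b"resize:image-123", 1)  # send to mule 1
#
# In a mule program (configured with: mule = mule.py):
#     while True:
#         task, payload = parse_mule_msg(uwsgi.mule_get_msg())
#         ...dispatch on task...
```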
The `spooler-processes` config value specifies how many processes are available for the spoolers to do their work. This is also an important value to keep in mind as your application grows. The `spooler` option configures one spooler, whose spool files will be placed in a directory called `spool1`.
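You can also specify the `spooler` option multiple times to run several independent spoolers; a task can then be routed to one of them by adding a `b"spooler"` field (the absolute path of the target spool directory) to the task dict. A hypothetical sketch:

```ini
[uwsgi]
; hypothetical: two independent spool directories,
; e.g. one for quick jobs and one for slow ones
spooler = fastjobs
spooler = slowjobs
spooler-processes = 4
```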
Again, the source code for this example is available here. Here is a demo session of the app: