An often encountered problem when writing any server application is running background tasks. There are many solutions: on the heavier end there are distributed task schedulers like Celery, and there are lighter-weight alternatives like dramatiq. In this post I will describe some ways that uWSGI itself can be used to solve these problems. Assuming you are already using uWSGI to serve your application, using it for task scheduling as well can reduce the number of dependencies and the overall complexity of your software.
The argument I make is that if you're already making use of uWSGI, or can switch to it, you get an advanced and powerful piece of software from that one dependency. So why not make full use of it, rather than install and operate additional software if you can avoid it?
So how does uWSGI deal with task queues and scheduling?
The Spooler is a queue manager built into uWSGI that works like a printing/mail system.
You can enqueue massive sending of emails, image processing, video encoding, etc. and let the spooler do the hard work in the background while your users get their requests served by normal workers.
Another important task for today’s rich/advanced web apps is to respond to different events. An event could be a file modification, a new cluster node popping up, another one (sadly) dying, a timer having elapsed... whatever you can imagine.
I have put together a minimal example of how this stuff works. The rest of this post will walk through that example and highlight some of the features. The code for the example can be found here.
```python
def tick_handler(signum):
    print("tick..")

uwsgi.register_signal(99, "worker", tick_handler)
uwsgi.add_timer(99, 1)
```
First, we create a function called `tick_handler` that accepts `signum` as an argument. Then we call `uwsgi.register_signal` to register signal 99 with `tick_handler`. This means that any time signal 99 is raised, the `tick_handler` function will be executed (see `uwsgi.signal`). Then we call the `uwsgi.add_timer` API, which will raise signal 99 every second.
This illustrates a way to get your server to do work by signalling a particular event, or by scheduling work to happen on an interval. uWSGI also provides a cron-like API for scheduling work this way (see `uwsgi.add_cron`).
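As a sketch of that cron-like API (the signal number 98 and the cleanup task here are my own invention, not from the example app): `uwsgi.add_cron` takes a signal number followed by minute, hour, day, month and weekday fields, where -1 acts as a wildcard.

```python
try:
    import uwsgi
except ImportError:
    uwsgi = None  # not running under uWSGI; treat this as a sketch

def nightly_cleanup(signum):
    print("running nightly cleanup")

# (signum, minute, hour, day, month, weekday); -1 means "any"
CRON_SPEC = (98, 30, 3, -1, -1, -1)  # every day at 03:30

if uwsgi is not None:
    uwsgi.register_signal(CRON_SPEC[0], "worker", nightly_cleanup)
    uwsgi.add_cron(*CRON_SPEC)
```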
```python
def spool_handler(env):
    if env[b"task"] == b"send_email":
        print("sending email to: ", env[b"email"])
    return uwsgi.SPOOL_OK

uwsgi.spooler = spool_handler
```
This code sets up a handler for our spooler. Any time a task gets added to the spooler, this function will be executed. The advantage of using a spooler is that the tasks are serialized and stored on the filesystem. This is helpful for situations where you have tasks that need to happen but don't need to happen right away. You also gain the resiliency of having the tasks stored on the filesystem. If you have a large number of tasks, you will want to make use of the spooler rather than the signal framework.
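The handler's return value tells the spooler what to do with the task. Besides `uwsgi.SPOOL_OK`, a handler can return `uwsgi.SPOOL_RETRY` to have the task re-run on a later spooler pass, or `uwsgi.SPOOL_IGNORE` to skip it. A sketch (the `send_email` helper is hypothetical, and the fallback constant values exist only so the sketch runs outside uWSGI):

```python
try:
    import uwsgi
    SPOOL_OK, SPOOL_RETRY = uwsgi.SPOOL_OK, uwsgi.SPOOL_RETRY
except ImportError:
    # assumed fallback values for running the sketch outside uWSGI
    SPOOL_OK, SPOOL_RETRY = -2, -1

def send_email(address):
    # hypothetical helper; a real one would talk to an SMTP server
    print("sending email to:", address)

def spool_handler(env):
    if env[b"task"] == b"send_email":
        try:
            send_email(env[b"email"].decode())
        except ConnectionError:
            # transient failure: keep the spool file and try again later
            return SPOOL_RETRY
    # done (or unknown task): remove the spool file
    return SPOOL_OK
```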
```python
uwsgi.send_to_spooler(
    {
        b"task": b"send_email",
        b"email": b"wgwz@pm.me",
        b"at": str(time.time() + 10).encode("utf-8"),
    }
)
```
The snippet above shows how to enqueue work to the spooler. You call the `uwsgi.send_to_spooler` API with a dictionary whose keys and values are all bytes. In this case I also made use of the `at` parameter, which signifies when a particular task should be run, based on UNIX time.
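Since everything in the task dict has to be bytes, a small helper (my own, not part of uWSGI) can do the encoding and compute the `at` timestamp from a relative delay:

```python
import time

def to_spool_args(task, delay=0, **fields):
    """Encode a plain dict into the bytes-only mapping the spooler expects."""
    args = {b"task": task.encode()}
    if delay:
        # "at" is an absolute UNIX timestamp, not a relative delay
        args[b"at"] = str(int(time.time() + delay)).encode()
    for key, value in fields.items():
        args[key.encode()] = str(value).encode()
    return args

# usage:
# uwsgi.send_to_spooler(to_spool_args("send_email", delay=10, email="wgwz@pm.me"))
```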
There are some important configuration values to be aware of when making use of these features. They are thoroughly documented in the uWSGI documentation, but I will point out some important ones.
```ini
[uwsgi]
http = :9090
wsgi-file = app.py
processes = 4
threads = 2
stats = :9191
stats-http
spooler = spool1
spooler-processes = 4
spooler-frequency = 1
mule
```
The `processes` config value is synonymous with `workers`. These processes handle web requests, and in the signal example I shared earlier, these are the processes that will do the work for `tick_handler`. This is important to keep in mind as you are scaling your application. uWSGI also provides a concept of mule processes, dedicated to doing background task work.
```python
uwsgi.register_signal(99, "mule", tick_handler)
```
The above line shows how we could have routed the work for `tick_handler` to a mule instead of a worker. Note that your configuration needs to specify mules (see the `mule` line in the config above; this just means to start one mule process). Mules can also enable some interesting patterns for doing server work (see "Giving a brain to mules" in the uWSGI docs).
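Beyond signals, workers can also pass raw byte messages to mules with `uwsgi.mule_msg`, which a mule program receives with `uwsgi.mule_get_msg`. A sketch, using a `task:payload` message convention of my own:

```python
def parse_mule_msg(msg):
    # my own convention for this sketch: b"task:payload"
    task, _, payload = msg.partition(b":")
    return task.decode(), payload

# In a worker:
#     uwsgi.mule_msg(b"resize:image-123", 1)  # send to mule 1
#
# In a mule program (configured with: mule = mule.py):
#     while True:
#         task, payload = parse_mule_msg(uwsgi.mule_get_msg())
#         ...dispatch on task...
```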
The `spooler-processes` config value specifies how many processes are available for the spoolers to do their work. This is also an important value to keep in mind as your application grows. The `spooler` option configures one spooler, whose spool files will be placed in a directory called `spool1`.
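You can also specify the `spooler` option multiple times to run several independent spoolers; a task can then be routed to one of them by adding a `b"spooler"` field (the absolute path of the target spool directory) to the task dict. A hypothetical sketch:

```ini
[uwsgi]
; hypothetical: two independent spool directories,
; e.g. one for quick jobs and one for slow ones
spooler = fastjobs
spooler = slowjobs
spooler-processes = 4
```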
Again, the source code for this example is available here. Here is a demo session of the app: