Hosting multiple Flask apps using Apache/mod_wsgi | Oxford Protein Informatics Group

A common way of deploying a Flask web application in a production environment is to use an Apache server with the mod_wsgi module, which allows Apache to host any application that supports Python’s Web Server Gateway Interface (WSGI), making it quick and easy to get an application up and running. In this post, we’ll go through configuring your Apache server to host multiple Python apps in a stable manner, including how to run apps in daemon mode and how to avoid hanging processes caused by Python C extensions that don’t work well with Python sub-interpreters (I’m looking at you, numpy).

Basic Configuration

First, let’s set up a simple Apache configuration file to host a single app. Suppose we have an app with a wsgi script located at /path/to/myapp/wsgi/main.wsgi. A minimal virtual host configuration file to let Apache host this app might look like:

<VirtualHost *:80>

    ServerName mysite.com
    ServerAlias www.mysite.com

    WSGIScriptAlias /myapp /path/to/myapp/wsgi/main.wsgi
    <Directory /path/to/myapp/wsgi>
        Require all granted
    </Directory>

</VirtualHost>

The WSGIScriptAlias directive takes two arguments. The first is the URL path to serve the application from, relative to the root URL, and the second is the path to the wsgi script of the app we want to serve. This tells Apache that /path/to/myapp/wsgi/main.wsgi is a wsgi script, and that it should host this app at mysite.com/myapp. Defining an alias like this is a useful security feature, as it avoids exposing your directory structure to an external user.

The <Directory> directive is used to grant Apache the necessary permissions to access the wsgi script located in the directory /path/to/myapp/wsgi.
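For reference, the wsgi script itself is an ordinary Python file that exposes a callable named application, which is the name mod_wsgi looks for by default. The sketch below uses a bare WSGI callable so that it runs without Flask installed. For a Flask app, the script would typically just import the Flask object under that name; the module name myapp in the comment is a placeholder, not something taken from a real project.

```python
# main.wsgi: a minimal sketch of a WSGI entry point.
#
# For a Flask app, the whole file would usually be something like:
#     from myapp import app as application   # "myapp" is a hypothetical module
#
# The plain callable below stands in for the Flask object so that the
# example is self-contained.

def application(environ, start_response):
    """Return a plain-text greeting for every request."""
    body = b"Hello from myapp\n"
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]
```

Any object that follows the WSGI calling convention will do here, which is why a Flask application object can simply be imported as application.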

Hosting multiple applications

With this configuration, myapp should now be hosted at mysite.com/myapp. So far, so good. But what if we want to host more than one Flask app? One solution, if you don’t mind juggling multiple sites, is to simply use additional virtual hosts:

<VirtualHost *:80>

    ServerName site-1.mysite.com
    ServerAlias www.site-1.mysite.com

    WSGIScriptAlias /myapp1 /path/to/myapp1/wsgi/main.wsgi
    <Directory /path/to/myapp1/wsgi>
        Require all granted
    </Directory>

</VirtualHost>

<VirtualHost *:80>

    ServerName site-2.mysite.com
    ServerAlias www.site-2.mysite.com

    WSGIScriptAlias /myapp2 /path/to/myapp2/wsgi/main.wsgi
    <Directory /path/to/myapp2/wsgi>
        Require all granted
    </Directory>

</VirtualHost>

Note that we’ve defined a ServerAlias for each virtual host. If a host name maps onto the server IP but does not match the name or alias of any virtual host, the request will be handled by the first virtual host Apache finds when reading the config file. In this case, if we omitted the ServerAlias directive from the second virtual host, a request for www.site-2.mysite.com would actually end up being handled by the first virtual host.
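The fallback behaviour can be illustrated with a small Python model. This is only a sketch of the rule described above, assuming exact name matching on a single IP and port; it is not how Apache is implemented, and the function name select_vhost is made up for the example.

```python
def select_vhost(host, vhosts):
    """Pick the virtual host for a request's Host header.

    `vhosts` is a list of (server_name, aliases) tuples in config-file order.
    Simplified model: exact, case-insensitive name matching only.
    """
    host = host.lower()
    for server_name, aliases in vhosts:
        names = {server_name.lower(), *(alias.lower() for alias in aliases)}
        if host in names:
            return server_name
    # No name matched: fall back to the first virtual host in the file.
    return vhosts[0][0]


with_alias = [
    ("site-1.mysite.com", ["www.site-1.mysite.com"]),
    ("site-2.mysite.com", ["www.site-2.mysite.com"]),
]
without_alias = [
    ("site-1.mysite.com", ["www.site-1.mysite.com"]),
    ("site-2.mysite.com", []),  # ServerAlias omitted from the second host
]

print(select_vhost("www.site-2.mysite.com", with_alias))     # site-2.mysite.com
print(select_vhost("www.site-2.mysite.com", without_alias))  # site-1.mysite.com
```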

Suppose, however, that we want to host our apps on a single site using a single virtual host. The naive approach here is to simply include multiple WSGIScriptAlias directives under a single virtual host, one for each app:

<VirtualHost *:80>

    ServerName mysite.com
    ServerAlias www.mysite.com

    WSGIScriptAlias /myapp1 /path/to/myapp1/wsgi/main.wsgi
    <Directory /path/to/myapp1/wsgi>
        Require all granted
    </Directory>

    WSGIScriptAlias /myapp2 /path/to/myapp2/wsgi/main.wsgi
    <Directory /path/to/myapp2/wsgi>
        Require all granted
    </Directory>

</VirtualHost>

This will let Apache host our apps at mysite.com/myapp1 and mysite.com/myapp2 respectively.
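From each app’s point of view, the mount point arrives in the WSGI environment as SCRIPT_NAME and the rest of the URL path as PATH_INFO, which is what lets a framework such as Flask generate URLs that include the prefix. The sketch below models that split in plain Python. It is a simplified illustration (the helper name split_mount is made up, and mount points are checked in the order given), not a reproduction of Apache’s own matching code.

```python
def split_mount(path, mounts):
    """Split a request path into (SCRIPT_NAME, PATH_INFO).

    `mounts` lists the URL prefixes in configuration order. A prefix matches
    only on a whole path segment, so "/myapp1" does not match "/myapp10".
    Returns None when no mount point matches.
    """
    for mount in mounts:
        if path == mount or path.startswith(mount + "/"):
            return mount, path[len(mount):]
    return None


mounts = ["/myapp1", "/myapp2"]
print(split_mount("/myapp1/users/42", mounts))  # ('/myapp1', '/users/42')
print(split_mount("/myapp2", mounts))           # ('/myapp2', '')
print(split_mount("/other", mounts))            # None
```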

Running applications in daemon mode

By default, Apache runs our apps inside its own child processes using mod_wsgi’s embedded mode. This is fine if we have a single app and can configure Apache to host it exactly the way we want, but it becomes problematic when we have multiple apps or want to use Apache’s default configuration: we have no control over the resources used when Apache creates each child process, and we have to restart the entire server to reload code whenever we modify an app. It can also get messy when we want to install the dependencies for each app into a different Python virtual environment, as the ordering of the WSGIScriptAlias directives influences the process in which an app is run.

To get around this, we can instead run our apps in ‘daemon mode’. In daemon mode, we define a set of processes which are created solely for the purpose of running WSGI applications, and automatically hand any WSGI applications to these processes. In addition to allowing us to define the number of processes and threads available to each application, this also allows us to reload the code for an application by simply touching the application’s wsgi script.

We can set up a daemon process group for our virtual host using:

WSGIDaemonProcess myproc processes=2 threads=15
WSGIProcessGroup myproc

WSGIDaemonProcess defines a group of daemon processes called ‘myproc’. In this case, the group contains two daemon processes, each of which uses up to 15 threads to handle requests. The WSGIProcessGroup directive tells Apache that any WSGI applications in this context should be handled by the specified daemon process group.

One of the useful things we can do when creating a daemon process group is specifying a Python virtual environment to be used by processes in the group. For example, to use a virtual environment located at /path/to/myenv/venv:

WSGIDaemonProcess myproc processes=2 threads=15 python-home=/path/to/myenv/venv
WSGIProcessGroup myproc

The full virtual host configuration would look like:

<VirtualHost *:80>

    ServerName mysite.com
    ServerAlias www.mysite.com

    WSGIDaemonProcess myproc processes=2 threads=15 python-home=/path/to/myenv/venv
    WSGIProcessGroup myproc

    WSGIScriptAlias /myapp1 /path/to/myapp1/wsgi/main.wsgi
    <Directory /path/to/myapp1/wsgi>
        Require all granted
    </Directory>

    WSGIScriptAlias /myapp2 /path/to/myapp2/wsgi/main.wsgi
    <Directory /path/to/myapp2/wsgi>
        Require all granted
    </Directory>

</VirtualHost>

Now, both myapp1 and myapp2 will be run as daemon processes, using the Python virtual environment located at /path/to/myenv/venv, and we can reload the code for an app by simply touching its WSGI script at /path/to/myapp*/wsgi/main.wsgi, without restarting Apache.
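Touching a wsgi script just means updating its modification time, as the touch command does. If you would rather trigger the reload from a deployment script, a minimal Python equivalent might look like the following; the function name touch_wsgi_script is invented for this sketch, and the path in the usage comment is the placeholder path used throughout this post.

```python
import os
from pathlib import Path


def touch_wsgi_script(path):
    """Update a wsgi script's modification time, like `touch path`.

    Raises FileNotFoundError rather than creating a new file, so a typo in
    the path does not silently succeed.
    """
    script = Path(path)
    if not script.is_file():
        raise FileNotFoundError(f"No wsgi script at {script}")
    os.utime(script, None)  # set access and modification times to now
    return script.stat().st_mtime


# Example usage (placeholder path):
#     touch_wsgi_script("/path/to/myapp1/wsgi/main.wsgi")
```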

Using different process groups for each application

Next, suppose we want to run our applications in daemon mode, using a different Python virtual environment to handle each application’s dependencies. To do this, we simply define a process group for each app using the WSGIDaemonProcess directive, then use the WSGIProcessGroup directive within the context of the app to tell Apache to run that app using that process group. The new configuration file will look like this:

<VirtualHost *:80>

    ServerName mysite.com
    ServerAlias www.mysite.com

    WSGIDaemonProcess myproc1 processes=1 python-home=/path/to/myenv1/venv
    WSGIScriptAlias /myapp1 /path/to/myapp1/wsgi/main.wsgi
    <Directory /path/to/myapp1/wsgi>
        WSGIProcessGroup myproc1
        Require all granted
    </Directory>

    WSGIDaemonProcess myproc2 processes=1 python-home=/path/to/myenv2/venv
    WSGIScriptAlias /myapp2 /path/to/myapp2/wsgi/main.wsgi
    <Directory /path/to/myapp2/wsgi>
        WSGIProcessGroup myproc2
        Require all granted
    </Directory>

</VirtualHost>

By placing the WSGIProcessGroup directive within the Directory directive for an app, we tell Apache that the app should be handled by the specified process group, overriding any global configuration. We can extend this configuration to any number of apps and process groups, allowing us to control precisely the processes and dependencies used to handle each app. For example, suppose we have two apps with the same dependencies that we want to run using a single process pool, and a third app with its own dependencies that we wish to run using a single dedicated process. We could configure this as follows:

<VirtualHost *:80>

    ServerName mysite.com
    ServerAlias www.mysite.com

    WSGIDaemonProcess myproc1 processes=2 python-home=/path/to/myenv1/venv
    WSGIScriptAlias /myapp1 /path/to/myapp1/wsgi/main.wsgi
    <Directory /path/to/myapp1/wsgi>
        WSGIProcessGroup myproc1
        Require all granted
    </Directory>

    WSGIScriptAlias /myapp2 /path/to/myapp2/wsgi/main.wsgi
    <Directory /path/to/myapp2/wsgi>
        WSGIProcessGroup myproc1
        Require all granted
    </Directory>

    WSGIDaemonProcess myproc3 processes=1 python-home=/path/to/myenv3/venv
    WSGIScriptAlias /myapp3 /path/to/myapp3/wsgi/main.wsgi
    <Directory /path/to/myapp3/wsgi>
        WSGIProcessGroup myproc3
        Require all granted
    </Directory>

</VirtualHost>

Now myapp1 and myapp2 will both be handled by daemon processes in the process group myproc1 using one Python virtual environment, while myapp3 will be handled by a daemon process in the process group myproc3, using a different Python virtual environment. By using daemon mode in this way, we have full control over how each application is handled.

Preventing daemons hanging when a sub-interpreter is created

When running a Flask application in daemon mode, mod_wsgi creates multiple Python sub-interpreters to handle requests. This can cause processes to hang unexpectedly when an application makes use of Python C extension modules to bypass Python’s Global Interpreter Lock (GIL), as the extension may not run correctly when a new sub-interpreter is spawned to handle it. If you’ve ever encountered a cryptic ‘Timeout when reading response headers from daemon process’ error in your Apache logs when running a Flask (or Django!) app, this is likely the culprit. To prevent this from happening, we can force each WSGI application to run within the first interpreter created by the process handling the request using the WSGIApplicationGroup directive:

WSGIApplicationGroup %{GLOBAL}

All WSGI applications within the same application group will be handled by the same Python sub-interpreter. The GLOBAL application group is just the empty string, which tells Apache to always run the application using the first interpreter created. Of course, this will cause interference when multiple apps are handled by the same process group, so the safest thing to do is to put each application that relies on C extensions in its own process group. Assuming all of our applications fall into this category (e.g. they all use numpy), our final configuration file might look like this:

<VirtualHost *:80>

    ServerName mysite.com
    ServerAlias www.mysite.com

    WSGIDaemonProcess myproc1 processes=1 python-home=/path/to/myenv1/venv
    WSGIScriptAlias /myapp1 /path/to/myapp1/wsgi/main.wsgi
    <Directory /path/to/myapp1/wsgi>
        WSGIProcessGroup myproc1
        WSGIApplicationGroup %{GLOBAL}
        Require all granted
    </Directory>

    WSGIDaemonProcess myproc2 processes=1 python-home=/path/to/myenv2/venv
    WSGIScriptAlias /myapp2 /path/to/myapp2/wsgi/main.wsgi
    <Directory /path/to/myapp2/wsgi>
        WSGIProcessGroup myproc2
        WSGIApplicationGroup %{GLOBAL}
        Require all granted
    </Directory>

    WSGIDaemonProcess myproc3 processes=1 python-home=/path/to/myenv3/venv
    WSGIScriptAlias /myapp3 /path/to/myapp3/wsgi/main.wsgi
    <Directory /path/to/myapp3/wsgi>
        WSGIProcessGroup myproc3
        WSGIApplicationGroup %{GLOBAL}
        Require all granted
    </Directory>

</VirtualHost>

Now, every application will run in daemon mode using a different Python virtual environment, and all requests for an app will be handled in the context of the same Python interpreter, preventing processes from hanging due to C extensions being unable to run in sub-interpreters.
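To check that a configuration like this has taken effect, one option is to temporarily serve a small diagnostic WSGI script. The sketch below relies on the mod_wsgi.process_group and mod_wsgi.application_group keys that mod_wsgi adds to the WSGI environment, both of which are empty strings for embedded mode and the main interpreter respectively. The helper name describe_mod_wsgi is made up for this example, and it is worth confirming the keys against the documentation for your mod_wsgi version.

```python
import json


def describe_mod_wsgi(environ):
    """Summarise where mod_wsgi is running the current request."""
    if "mod_wsgi.process_group" not in environ:
        # Not running under mod_wsgi (e.g. a local development server).
        return {"mod_wsgi": False}
    process_group = environ["mod_wsgi.process_group"]
    application_group = environ.get("mod_wsgi.application_group", "")
    return {
        "mod_wsgi": True,
        # An empty process group name means embedded mode.
        "mode": "daemon" if process_group else "embedded",
        "process_group": process_group,
        # An empty application group name means the first (main) interpreter.
        "main_interpreter": application_group == "",
    }


def application(environ, start_response):
    """Diagnostic WSGI app: report the summary above as JSON."""
    body = json.dumps(describe_mod_wsgi(environ), indent=2).encode("utf-8")
    headers = [
        ("Content-Type", "application/json"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]
```

With the final configuration above, a request to such a script mounted alongside myapp1 would be expected to report daemon mode, the process group myproc1, and the main interpreter.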

Although we’ve focused on Flask in this post, the same principles apply when using mod_wsgi with other frameworks such as Django. Hopefully this will help you get your apps up and running on an Apache server with minimal fuss and no cryptic mod_wsgi errors!

Author

Fergus Boyles
