Since I’ve changed institutions this year, I am in the process of migrating Stemmaweb from its current home (on my family’s personal virtual server) to the academic cloud service being piloted by SWITCH. Along the way, I ran into a Perl Catalyst configuration issue that I thought would be useful to write about here, in case others run into a similar problem.
I have several Catalyst applications – Stemmaweb, my edition-in-progress of Matthew of Edessa, and pretty much anything else I will develop with Perl in the future. I also have other things (e.g. this blog) on the Web, and being somewhat stuck in my ways, I still prefer Apache as a webserver. So basically I need a way to run all these standalone web applications behind Apache, with a suitable URL prefix to distinguish them.
There is already a good guide to getting a single Catalyst application set up behind an Apache front end. The idea is that you start up the application as its own process, listening on a local network port, and then configure Apache to act as a proxy between the outside world and that application. My problem is that I want to run more than one application, and to reach each one via its own URL prefix (e.g. /stemmaweb, /ChronicleME, /ncritic, and so on). The difficulty with a reverse proxy in that situation is this:
- I send my request to http://my.public.server/stemmaweb/
- It gets proxied to http://localhost:5000/ and returned
- But then all my images, JavaScript, CSS, etc. are at the root of localhost:5000 (the backend server) and so look like they’re at the root of my.public.server, instead of neatly within the stemmaweb/ directory!
- And so I get a lot of nasty 404 errors and a broken application.
What I need here is an extra plugin: Plack::Middleware::ReverseProxyPath. I install it (in this case with the excellent ‘cpanm’ tool):
$ cpanm -S Plack::Middleware::ReverseProxyPath
And then I edit my application’s PSGI file to look like this:
use strict;
use warnings;
use lib '/var/www/catalyst/stemmaweb/lib';
use stemmaweb;
use Plack::Builder;
builder {
enable( "Plack::Middleware::ReverseProxyPath" );
my $app = stemmaweb->apply_default_middlewares(stemmaweb->psgi_app);
$app;
}
where /var/www/catalyst/stemmaweb is the directory that my application lives in.
In order to make it all work, my Apache configuration needs a couple of extra lines too:
# Configuration for Catalyst proxy apps. This should eventually move
# to its own named virtual host.
RewriteEngine on
<Location /stemmaweb>
RequestHeader set X-Forwarded-Script-Name /stemmaweb
RequestHeader set X-Traversal-Path /
ProxyPass http://localhost:5000/
ProxyPassReverse http://localhost:5000/
</Location>
RewriteRule ^/stemmaweb$ stemmaweb/ [R]
The RequestHeaders inform the backend (Catalyst) that what we are calling “/stemmaweb” is the thing that it is calling “/”, and that it should translate its URLs accordingly when it sends us back the response.
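One way to check that the middleware is picking up these headers is to query the backend directly, sending the same two headers Apache would add. The following is only a sketch: `probe_backend` is a hypothetical helper name, and the backend address and prefix are simply the values used in this post.

```shell
#!/bin/sh
# Hypothetical helper: fetch a page straight from the backend while sending
# the two headers that the Apache <Location> block would normally add.
# The CURL variable only exists so the command can be swapped out (e.g. for a dry run).
probe_backend() {
    backend=${1:-http://localhost:5000/}   # assumed backend address, as in the post
    prefix=${2:-/stemmaweb}                # assumed public URL prefix, as in the post
    "${CURL:-curl}" -s \
        -H "X-Forwarded-Script-Name: $prefix" \
        -H "X-Traversal-Path: /" \
        "$backend"
}
```

Running `probe_backend http://localhost:5000/ /stemmaweb` should return the application's front page. If the application builds its links with Catalyst's `uri_for`, the asset URLs in that HTML would be expected to start with /stemmaweb rather than with a bare /.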
The second thing I needed to address was how to start these things up automatically when the server turns on. The guide gives several useful configurations for starting a single service, but again, I want to make sure that all my Catalyst applications (and not just one of them) start up properly. I am running Ubuntu, which uses Upstart to handle its services; to start all my applications I use a pair of scripts and the ‘instance’ keyword.
description "Starman master upstart control"
author "Tara L Andrews (tla@mit.edu)"
# Control all Starman jobs via this script
start on filesystem or runlevel [2345]
stop on runlevel [!2345]
# No daemon of our own, but here's how we start them
pre-start script
port=5000
for dir in `ls /var/www/catalyst`; do
start starman-app APP=$dir PORT=$port || :
port=$((port+1))
done
end script
# and here's how we stop them
post-stop script
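# initctl reports each instance under its full name (APP followed by PORT),
# so pass that whole name as APP and leave PORT empty to match it.
for inst in `initctl list|grep "^starman-app "|awk '{print $2}'|tr -d ')'|tr -d '('`; do
stop starman-app APP=$inst PORT= || :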
done
end script
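The pipeline in the post-stop script can be tried on its own to see what it extracts. This is a minimal sketch, assuming that `initctl list` prints instance jobs as `jobname (instance) goal/state, process PID`; `starman_instances` is a hypothetical name and the sample lines in the usage note are made up.

```shell
#!/bin/sh
# Hypothetical helper: read `initctl list`-style lines on stdin and print
# the instance name of every starman-app job, one per line.
starman_instances() {
    # Keep only starman-app lines, take the "(instance)" field,
    # then strip the surrounding parentheses.
    grep '^starman-app ' | awk '{print $2}' | tr -d '()'
}
```

For example, feeding it the invented line `starman-app (stemmaweb5000) start/running, process 1234` yields `stemmaweb5000`, which is the name the stop command needs. On a live system the input would be `initctl list | starman_instances`.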
The application script, which gets called by the control script for each application in /var/www/catalyst:
description "Starman upstart application instance"
author "Tara L Andrews (tla@mit.edu)"
respawn limit 10 5
setuid www-data
umask 022
instance $APP$PORT
exec /usr/local/bin/starman --listen localhost:$PORT /var/www/catalyst/$APP/$APP.psgi
There is one thing about this solution that is not so elegant, which is that each application has to start on its own port and I need to specify the correct port in the Apache configuration file. As it stands the ports will be assigned in sequence (5000, 5001, 5002, …) according to the way the application directory names sort with the ‘ls’ command (which roughly means, alphabetically.) So whenever I add a new application I will have to remember to adjust the port numbers in the Apache configuration. I would welcome a more elegant solution if anyone has one!
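Until something better comes along, it helps at least to see which port each application will get before editing the Apache configuration. The sketch below repeats the control script's loop but only prints the mapping; `list_ports` is a hypothetical helper, and the default directory and base port are the ones used above.

```shell
#!/bin/sh
# Hypothetical helper: print "app port" pairs in the order the control
# script would assign them, without starting anything.
list_ports() {
    app_root=${1:-/var/www/catalyst}   # directory holding one subdirectory per app
    port=${2:-5000}                    # first port to hand out
    # Deliberately uses `ls`, like the control script, so the ordering matches.
    for dir in $(ls "$app_root"); do
        echo "$dir $port"
        port=$((port+1))
    done
}
```

Calling `list_ports /var/www/catalyst 5000` prints one line per application, and each port can then be copied into the matching ProxyPass line. Note that the order follows the locale's collation rules, just as it does for `ls` in the control script.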
I wasn’t keen on having services running for each catalyst app, and dealing with ensuring they stay running – so I let apache handle it all with fastcgi.
I gather this is a rather out-of-date way of doing it, but it’s always been reliable for me, and I don’t have heavy-enough traffic to have to worry much about efficiency.
I installed `mod_fcgid`, and my apache config looks like this for each app:
Adding a new catalyst app is just a matter of copy/pasting that section, and editing the “app-name” and “/path/to/app”.
I don’t need to edit anything in the app’s code or config to get it to work under this setup.