In more projects than I care to think about, I’ve seen a pattern that I dislike the more I see it. It appears quite innocent, but it brings about cost with no discernible benefit.
The pattern is this: including all of the application’s source code on each and every request to a page. It may be that one file is included that lists all files the app brings along, or it may be a list of require_once statements in every entry page. But what is the benefit? Even if the page loads only those files that it might eventually need, it probably includes more than the current task warrants.
One solution is to get into using an autoloader; let the interpreter figure out by itself whether it has already seen all that it needs to execute a given piece of code. The other option is to require external files only at the point that you’re certain that you need them. Does your code do input sanitizing and validation before it loads the classes that then work with the sanitized values? Probably not – it’s much more common to first load all the code, and only then start to work with what you are given.
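As a minimal sketch of the autoloader idea, assuming a layout of one class per file with the file named after the class: the Greeter class and the temporary directory below are made up purely for illustration, and a real project would point the loader at its own source tree.

```php
<?php
// Hypothetical setup: write a throwaway class file so the sketch is self-contained.
$classDir = sys_get_temp_dir() . '/lazy_demo_' . uniqid();
mkdir($classDir);
file_put_contents(
    $classDir . '/Greeter.php',
    '<?php class Greeter { public function greet($name) { return "Hello, $name"; } }'
);

// Keeps track of which classes the autoloader actually pulled in.
$loaded = [];

// The interpreter calls this closure the first time an unknown class is used.
// Assumption: un-namespaced classes, file name equals class name.
spl_autoload_register(function ($class) use ($classDir, &$loaded) {
    $file = $classDir . '/' . $class . '.php';
    if (is_file($file)) {
        require $file;
        $loaded[] = $class;
    }
});

// Passing false as the second argument checks without triggering the autoloader.
$before = class_exists('Greeter', false);

// First real use of the class: only now is Greeter.php read and compiled.
$greeting = (new Greeter())->greet('world');
```

A request that never touches Greeter never pays for loading its file, and there is no list of require_once statements to keep up to date.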
Lazy loading means that the interpreter only processes the code it needs, which contributes to better application performance (because we’re not spending time on code we don’t really need anyway). This also means using fewer resources on the server and, ultimately, having your code use less electric power. But there are even more benefits: code that has not been loaded cannot cause any kind of interference, so you can be certain that you don’t have to look into those files when you’re debugging. And code that has not been loaded also cannot be used for security exploits, so you have fewer side effects there as well.
This is not a particularly complex intellectual challenge; it’s more a matter of perspective, and maybe of writing some infrastructure code for your app. But to me, the benefits are worth the few extra minutes spent thinking about your code.
We’re currently in the process of transitioning our web servers from NetBSD to DragonflyBSD; along with that, we’re also switching our PHP platform to php-fpm. There are a few lessons we learned in the process.
First: at least on DragonflyBSD 2.10, apache2 does not perform well at all as apache-mpm-worker. Switching to apache-mpm-prefork changed the CPU load caused by apache from 98% to about 3 to 5%.
Also, php-fpm was delivering too high a rate of 500 errors; this was not acceptable to our customers. Investigating that led me to the blog article at http://alexcabal.com/installing-apache-mod_fastcgi-php-fpm-on-ubuntu-server-maverick/#comments, which then sent me link-chasing and ultimately to http://article.gmane.org/gmane.comp.web.fastcgi.devel/2514. We now have a set of local patches to our pkgsrc tree that incorporate the patch from this posting. At first sight, this seems to have improved the situation.
I’m currently writing a library of PHP stuff for our internal use. I’ve been able to make it do a few fun tricks.
To express a query with a subquery, I can now do this:
$sube = new Expression('subtable');
$e = new Expression('table');
I like that!
While writing the code to handle a small form in PHP, I just realized that I have a very bad habit, and many others do the same.
When I write a new file, I place all the includes at the very top, before anything else happens. But in my current script, there are many code paths that do not require the major part of all those includes. Only in one specific instance do we require the bulk of the code. Previously, any invocation of that script would have gotten all the code dragged in. Now, I’ve moved the include to just where I need the code (basically, going into a specific case of a larger switch statement) … And the load on the web server has just been reduced, without any change to the functionality.
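Roughly, the change looks like the sketch below. It is not my actual script: the action names, the handle() function and the build_report() helper are invented for illustration, and the "heavy" file is generated on the fly only so the example runs on its own.

```php
<?php
// Hypothetical stand-in for the bulk of the code that only one path needs.
$heavyFile = sys_get_temp_dir() . '/heavy_' . uniqid() . '.php';
file_put_contents($heavyFile, '<?php function build_report() { return "report"; }');

// Dispatches a request; the heavy file is pulled in only inside the case that uses it.
function handle($action, $heavyFile)
{
    switch ($action) {
        case 'ping':
            // Cheap path: no extra code is loaded at all.
            return 'pong';
        case 'report':
            // Expensive path: load the code right where it is needed.
            require_once $heavyFile;
            return build_report();
        default:
            return 'unknown action';
    }
}

$ping = handle('ping', $heavyFile);
$loadedAfterPing = function_exists('build_report');

$report = handle('report', $heavyFile);
$loadedAfterReport = function_exists('build_report');
```

After the cheap request the helper is still undefined; it only exists once the one case that needs it has run.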
So why do we all put the includes on top?