Apache Redirect testing

Folks who have worked with Apache HTTPD for any amount of time have probably noticed it has a number of different methods for implementing URL redirection. Each one has a different set of options and capabilities, but those extra features come at a performance cost.

I’ve built a small experiment to evaluate the performance of the four major options for URL redirects, measured against a no-redirect baseline:

  1. Baseline (no redirect, raw serving of a simple 8-byte file)
  2. Redirect
  3. RedirectMatch
  4. RewriteRule
  5. RewriteMap + RewriteRule

There are certainly a few other methods, and variations on the four above, but these are the most common approaches I am aware of. If you’ve spent any time in IRC asking for help on #httpd about rewrites, the most common response is to use RedirectMatch. This got me thinking: is RedirectMatch the most effective method, since there are so many to choose from? Hence, this small experiment was born.

It turns out RedirectMatch is a really well-performing option, provided the rules are implemented directly in the server config. Resist the temptation to implement them via .htaccess files using AllowOverride. That is a surefire way to stomp on your site performance, and has been covered by the fine folks at Apache HTTPD.

Our test setup is quite simple:

32-bit VM running on a 3.2 GHz Intel Core 2 Duo / E6750, acting as both the server and the client. I assume this is not a major issue: we are not testing raw performance, just the relative performance of each method on the same machine.

Server version: Apache/2.2.24 (Unix)
Server built: Apr 4 2013 20:32:29
Server’s Module Magic Number: 20051115:31
Server loaded: APR 1.4.6, APR-Util 1.4.1
Compiled using: APR 1.4.6, APR-Util 1.4.1
Architecture: 32-bit
Server MPM: Worker
threaded: yes (fixed thread count)
forked: yes (variable process count)

This is compiled freshly from source with the following configure flags:

$ ./configure --enable-rewrite --prefix=/home/www/apache22 --with-included-apr --with-mpm=worker

Our test strategy is as follows:

Client

ApacheBench running 20,000 requests over a single connection with KeepAlive, requesting a small URL and getting served a redirect. This reduces the overhead of socket startup/shutdown, and focuses purely on the rule matching. ab flags used in all my tests:

$ ab -k -c 1 -n 20000 "http://localhost:10000/test/10000"
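To repeat the run for each redirect method without copying numbers by hand, the ab invocation above could be wrapped in a small script. This is only a sketch: the helper names are hypothetical, and it assumes ab is on the PATH and reports its mean latency on a line of the form "Time per request: ... [ms] (mean)".

```python
import re
import subprocess

# Matches the first "Time per request" line that ab prints (the per-request
# mean), not the "across all concurrent requests" variant that follows it.
TIME_RE = re.compile(r"^Time per request:\s+([\d.]+) \[ms\] \(mean\)$", re.MULTILINE)


def parse_time_per_request(ab_output: str) -> float:
    """Extract the mean time per request (in ms) from ab's text output."""
    match = TIME_RE.search(ab_output)
    if match is None:
        raise ValueError("no 'Time per request ... (mean)' line found")
    return float(match.group(1))


def run_ab(url: str = "http://localhost:10000/test/10000", requests: int = 20000) -> float:
    """Run ab with the same flags as above and return the mean ms per request."""
    result = subprocess.run(
        ["ab", "-k", "-c", "1", "-n", str(requests), url],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_time_per_request(result.stdout)
```

Calling run_ab() once per configuration (restarting httpd in between) would give one number per method, comparable to the figures in the results below.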

Server

HTTPD Setup:

ServerRoot "/home/www/apache22"
ServerName netflows.no-life.com
Listen 10000
DocumentRoot "/home/www/apache22/htdocs"
<Directory />
Options FollowSymLinks
AllowOverride None
Order deny,allow
Deny from all
</Directory>
<Directory "/home/www/apache22/htdocs">
Options None
AllowOverride None
Order allow,deny
Allow from all
</Directory>

DefaultType text/plain

As you can see, no frills here. We have a test/10000 file out in our htdocs folder with the contents of “success”, though technically this is irrelevant: we are not testing the file serving. We are only testing the REDIRECT generation (which does not serve the file).
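The redirect rules themselves are not shown in this post. As a rough illustration only, the sketch below generates N entries for each of the four methods. The URL layout (/test/1 through /test/N), the example.com target, the map name "redirects" and the function names are all assumptions made for the example, not the rules actually used in the test.

```python
# Hypothetical layout: /test/1 .. /test/N each redirect to TARGET/N.
TARGET = "http://example.com/dest"  # placeholder destination, not from the post


def redirect_rules(n):
    """mod_alias Redirect: plain URL-path prefix match, no regex."""
    return [f"Redirect 301 /test/{i} {TARGET}/{i}" for i in range(1, n + 1)]


def redirectmatch_rules(n):
    """mod_alias RedirectMatch: one anchored regex per entry."""
    return [f"RedirectMatch 301 ^/test/{i}$ {TARGET}/{i}" for i in range(1, n + 1)]


def rewriterule_rules(n):
    """mod_rewrite: one RewriteRule per entry, evaluated in order."""
    rules = ["RewriteEngine On"]
    rules += [f"RewriteRule ^/test/{i}$ {TARGET}/{i} [R=301,L]" for i in range(1, n + 1)]
    return rules


def rewritemap_rules(map_path):
    """mod_rewrite: a single RewriteRule that looks the key up in a text map."""
    return [
        "RewriteEngine On",
        f"RewriteMap redirects txt:{map_path}",
        "RewriteRule ^/test/(.+)$ ${redirects:$1} [R=301,L]",
    ]


def rewritemap_entries(n):
    """Lines for the text map file: '<key> <target URL>' per line."""
    return [f"{i} {TARGET}/{i}" for i in range(1, n + 1)]
```

Writing "\n".join(redirect_rules(10000)) to a file and pulling it into the server config with an Include directive would give a 10000-entry setup of the kind benchmarked here.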

I reran these tests several times, and the results were consistent.

The results:

Test                           Time per request (ms)
Baseline (serve a file)        0.102
10000 Redirect entries         0.161
10000 RedirectMatch entries    0.067
10000 RewriteRule entries      1.359
10000 RewriteMap entries       0.088

Summary

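RedirectMatch is indeed the fastest performer (0.067 ms per request), followed closely by RewriteMap (0.088 ms). Plain Redirect takes 0.161 ms, and RewriteRule, at 1.359 ms, is roughly 20x slower than RedirectMatch.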

Avoid RewriteRule, unless complexity requires it!
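For anyone who wants to check the ratios, here is a small sketch that normalizes the measured times against the fastest method. The numbers are the ones from the results above; the function name is just for illustration.

```python
# Mean time per request in milliseconds, copied from the results table.
results_ms = {
    "Baseline": 0.102,
    "Redirect": 0.161,
    "RedirectMatch": 0.067,
    "RewriteRule": 1.359,
    "RewriteMap": 0.088,
}


def relative_cost(results):
    """Return each method's time as a multiple of the fastest method's time."""
    fastest = min(results.values())
    return {name: ms / fastest for name, ms in results.items()}


if __name__ == "__main__":
    ratios = relative_cost(results_ms)
    # Print fastest first, with the slowdown factor next to the raw time.
    for name, ratio in sorted(ratios.items(), key=lambda kv: kv[1]):
        print(f"{name:<14} {results_ms[name]:.3f} ms  {ratio:5.1f}x")
```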

Warning: Tangential topic below!

Confused about how some of these redirects are faster than the Baseline test? I was too!

Then it hit me: baseline is serving the file, which has to make several syscalls to the filesystem. Redirects only serve the HTTP 301 or 302 response (no file access).

Compare the following Baseline vs RedirectMatch straces:

Baseline (serving the file):

[pid 12966] futex(0x85723bc, FUTEX_WAKE_PRIVATE, 1) = 0
[pid 12966] getsockname(9, {sa_family=AF_INET6, sin6_port=htons(10000), inet_pton(AF_INET6, "::1", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, [28]) = 0
[pid 12966] fcntl64(9, F_GETFL) = 0x2 (flags O_RDWR)
[pid 12966] fcntl64(9, F_SETFL, O_RDWR|O_NONBLOCK) = 0
[pid 12966] read(9, "GET /test.html HTTP/1.1\r\nUser-Ag"..., 8000) = 177
[pid 12966] gettimeofday({1405781496, 804325}, NULL) = 0
[pid 12966] stat64("/home/www/apache22/htdocs/test.html", {st_mode=S_IFREG|0664, st_size=8, ...}) = 0
[pid 12966] lstat64("/home/www/apache22/htdocs/test.html", {st_mode=S_IFREG|0664, st_size=8, ...}) = 0
[pid 12966] open("/home/www/apache22/htdocs/test.html", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = 10
[pid 12966] open("/etc/localtime", O_RDONLY) = 11
[pid 12966] fstat64(11, {st_mode=S_IFREG|0644, st_size=3519, ...}) = 0
[pid 12966] fstat64(11, {st_mode=S_IFREG|0644, st_size=3519, ...}) = 0
[pid 12966] mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb77e5000
[pid 12966] read(11, "TZif2\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\4\0\0\0\4\0\0\0\0"..., 4096) = 3519
[pid 12966] _llseek(11, -24, [3495], SEEK_CUR) = 0
[pid 12966] read(11, "\nEST5EDT,M3.2.0,M11.1.0\n", 4096) = 24
[pid 12966] close(11) = 0
[pid 12966] munmap(0xb77e5000, 4096) = 0
[pid 12966] mmap2(NULL, 8, PROT_READ, MAP_SHARED, 10, 0) = 0xb77e5000
[pid 12966] read(9, 0xb6c05788, 8000) = -1 EAGAIN (Resource temporarily unavailable)
[pid 12966] writev(9, [{"HTTP/1.1 200 OK\r\nDate: Sat, 19 J"..., 230}, {"success\n", 8}], 2) = 238
[pid 12966] munmap(0xb77e5000, 8) = 0
[pid 12966] close(10) = 0
[pid 12966] poll([{fd=9, events=POLLIN}], 1, 5000) = 1 ([{fd=9, revents=POLLIN}])
[pid 12966] read(9, "", 8000) = 0
[pid 12966] gettimeofday({1405781496, 808164}, NULL) = 0
[pid 12966] shutdown(9, 1 /* send */) = 0
[pid 12966] poll([{fd=9, events=POLLIN}], 1, 2000) = 1 ([{fd=9, revents=POLLIN|POLLHUP}])
[pid 12966] read(9, "", 512) = 0
[pid 12966] close(9) = 0
[pid 12966] futex(0x85723f0, FUTEX_WAIT_PRIVATE, 27, NULL <unfinished ...>

RedirectMatch:

[pid 13069] futex(0x91b0414, FUTEX_WAKE_PRIVATE, 1) = 0
[pid 13069] getsockname(7, {sa_family=AF_INET6, sin6_port=htons(10000), inet_pton(AF_INET6, "::1", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, [28]) = 0
[pid 13069] fcntl64(7, F_GETFL) = 0x2 (flags O_RDWR)
[pid 13069] fcntl64(7, F_SETFL, O_RDWR|O_NONBLOCK) = 0
[pid 13069] read(7, "GET /test/1.html HTTP/1.1\r\nUser-"..., 8000) = 179
[pid 13069] gettimeofday({1405781627, 250348}, NULL) = 0
[pid 13069] gettimeofday({1405781627, 250886}, NULL) = 0
[pid 13069] open("/etc/localtime", O_RDONLY) = 10
[pid 13069] fstat64(10, {st_mode=S_IFREG|0644, st_size=3519, ...}) = 0
[pid 13069] fstat64(10, {st_mode=S_IFREG|0644, st_size=3519, ...}) = 0
[pid 13069] mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7892000
[pid 13069] read(10, "TZif2\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\4\0\0\0\4\0\0\0\0"..., 4096) = 3519
[pid 13069] _llseek(10, -24, [3495], SEEK_CUR) = 0
[pid 13069] read(10, "\nEST5EDT,M3.2.0,M11.1.0\n", 4096) = 24
[pid 13069] close(10) = 0
[pid 13069] munmap(0xb7892000, 4096) = 0
[pid 13069] write(2, "[Sat Jul 19 10:53:47 2014] [debu"..., 182) = 182
[pid 13069] read(7, 0xb6d05788, 8000) = -1 EAGAIN (Resource temporarily unavailable)
[pid 13069] writev(7, [{"HTTP/1.1 301 Moved Permanently\r\n"..., 211}, {"<!DOCTYPE HTML PUBLIC \"-//IETF//"..., 240}], 2) = 451
[pid 13069] poll([{fd=7, events=POLLIN}], 1, 5000) = 1 ([{fd=7, revents=POLLIN}])
[pid 13069] read(7, "", 8000) = 0
[pid 13069] gettimeofday({1405781627, 254729}, NULL) = 0
[pid 13069] shutdown(7, 1 /* send */) = 0
[pid 13069] poll([{fd=7, events=POLLIN}], 1, 2000) = 1 ([{fd=7, revents=POLLIN|POLLHUP}])
[pid 13069] read(7, "", 512) = 0
[pid 13069] close(7) = 0
[pid 13069] futex(0x91b0448, FUTEX_WAIT_PRIVATE, 27, NULL <unfinished ...>

If you look closely, serving the baseline file took six additional syscalls on the requested file: the five walked through below, plus the munmap that releases the file mapping just before the final close.

First, httpd performs a stat64 on the file to make sure it’s a “regular” file (essentially, that it exists):

[pid 12966] stat64("/home/www/apache22/htdocs/test.html", {st_mode=S_IFREG|0664, st_size=8, ...}) = 0

Next, httpd performs an lstat64 to see if the file is a symbolic link:

[pid 12966] lstat64("/home/www/apache22/htdocs/test.html", {st_mode=S_IFREG|0664, st_size=8, ...}) = 0

Once it has enough information, httpd opens a file descriptor (which gets assigned to FD 10):

[pid 12966] open("/home/www/apache22/htdocs/test.html", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = 10

Now httpd can mmap2 the file, mapping its contents into memory so it can read them directly:

[pid 12966] mmap2(NULL, 8, PROT_READ, MAP_SHARED, 10, 0) = 0xb77e5000

And finally, close the file descriptor.

[pid 12966] close(10) = 0

All of this takes significantly more time, since we’re performing filesystem operations (stat/open/mmap/close) on the requested file. Contrast with RedirectMatch, which can answer without touching any file under the document root.
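Rather than eyeballing the two traces, the comparison can be done mechanically. The following is a minimal sketch, assuming each trace has been saved as text with lines in the "[pid N] name(...)" form shown above; the function names are made up for this example.

```python
import re
from collections import Counter

# Captures the syscall name from lines such as "[pid 12966] stat64(...) = 0".
SYSCALL_RE = re.compile(r"^\[pid\s+\d+\]\s+(\w+)\(")


def syscall_counts(trace_text):
    """Count how many times each syscall name appears in an strace dump."""
    counts = Counter()
    for line in trace_text.splitlines():
        match = SYSCALL_RE.match(line.strip())
        if match:
            counts[match.group(1)] += 1
    return counts


def extra_syscalls(trace_a, trace_b):
    """Syscalls that trace_a makes more often than trace_b, with the surplus."""
    # Counter subtraction drops zero and negative results, so only the
    # calls that are more frequent in trace_a survive.
    return dict(syscall_counts(trace_a) - syscall_counts(trace_b))
```

Applied to the Baseline and RedirectMatch traces above, this should report one extra each of stat64, lstat64, open, mmap2, munmap and close for the baseline, which lines up with the walk-through.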

Posted on July 19, 2014 at 11:32 am by Andy · Permalink