Wednesday, 21 April 2010

Accessing Jetty on Ubuntu from another machine

OK, so following on from my previous blog post I thought I'd do some performance testing to see how Jetty compares with Apache. Installed VirtualBox on my Mac, downloaded and installed Ubuntu 9.04 server edition with bridged networking so I could access it from the host - all painless. Installed apache2 - painless. Installed JMeter on the Mac - painless. Ran some performance tests.

Looked to repeat the experiment against Jetty - aargh!! Just wasted 3 hours trying to work out why I couldn't connect to Jetty from the host. Turns out that the documentation, in Debian/Ubuntu at least, is wrong - when /etc/init.d/jetty claims that leaving JETTY_HOST blank will result in accepting connections from all hosts it's lying. You need to set it to 0.0.0.0 or it will only be accessible from localhost (127.0.0.1). You can tell this because if you do a netstat -an you will see this:


Proto Recv-Q Send-Q Local Address Foreign Address State
tcp6 0 0 127.0.0.1:8080 :::* LISTEN


That means the bind address is localhost, so the process is inaccessible from outside the server. It should look like this:


Proto Recv-Q Send-Q Local Address Foreign Address State
tcp6 0 0 :::8080 :::* LISTEN
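
The difference between the two netstat listings comes down to the address a listening socket is bound to. Here is a minimal sketch of that in plain Java; it is not Jetty's own code, and the class and method names are made up for illustration. It opens one socket on 127.0.0.1 and one on 0.0.0.0, using an OS-assigned port rather than 8080 so that it does not clash with anything already running.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;

// Hypothetical illustration only; Jetty configures its connectors differently.
class BindAddressSketch {

    /** Opens a listening socket on an OS-assigned port, bound to the given host. */
    static ServerSocket listenOn(String host) throws IOException {
        // Port 0 asks the OS for any free port; 50 is the connection backlog.
        return new ServerSocket(0, 50, InetAddress.getByName(host));
    }

    public static void main(String[] args) throws IOException {
        try (ServerSocket loopbackOnly = listenOn("127.0.0.1");
             ServerSocket allInterfaces = listenOn("0.0.0.0")) {
            // Bound to loopback: only clients on the same machine can connect.
            System.out.println("loopback only:  " + loopbackOnly.getInetAddress().isLoopbackAddress());
            // Bound to the wildcard address: reachable via any network interface.
            System.out.println("all interfaces: " + allInterfaces.getInetAddress().isAnyLocalAddress());
        }
    }
}
```

Setting JETTY_HOST to 0.0.0.0 corresponds to the second socket.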


Hope this helps some other poor confused person out there; it took me a long time to Google this. There's a Debian issue here:
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=554874

Monday, 19 April 2010

Web Application Architecture - do I really need Apache?

Following on from my earlier post on web application architecture, there's another aspect of the common pattern I'm curious about - do I really need Apache?


I had a frustrating time recently performance tuning an application. This was partly because we were shooting at a moving target (performance test servers on a shared virtual environment where resources may be claimed by another server at any moment are not a great idea...), and partly because I still need to improve my knowledge of Apache, Jetty, Linux and performance tuning methodologies. However, I did feel that having Apache in the mix added an extra layer of variables, measurement and configuration complexity: matching available sockets/open files on the operating system to numbers of Apache workers to numbers of Jetty connections, bearing in mind cache hits and misses at the Apache level, all got quite complicated.

It made me question why we need Apache. It serves three purposes in our architecture.
  1. Block external access to certain URLs
  2. Put in place redirects when we, or systems we depend on, have problems
  3. Act as our HTTP cache
1 & 2 we could easily do via filters at the application server layer - indeed we're moving that way anyway, as it's then trivial to implement admin pages to toggle these settings and it's also considerably easier to functionally test them.

Which leaves the HTTP cache. I've been reading up on my HTTP caching recently, particularly in an excellent book called "High Performance Web Sites". It got me thinking, and it seemed to me that it wouldn't be that hard to implement an internal HTTP cache in a servlet filter. A quick google revealed that unsurprisingly I wasn't the first to think of this, it being the subject of an O'Reilly article, and indeed that Ehcache have something along those lines.
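
To make the idea concrete, here is a rough sketch of the core such a filter would need: a short-lived, in-memory cache keyed by URL and shared by every thread in the JVM. It is plain JDK code, not Ehcache's API and not the approach from the O'Reilly article; the class name, the TTL parameter and the loader callback are all hypothetical, and the servlet Filter wiring (which needs the Servlet API jar) is deliberately left out.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.LongSupplier;

// Hypothetical sketch of a short-TTL response cache; not Ehcache's API.
class ShortLivedCache {

    private static final class Entry {
        final String body;
        final long expiresAtMillis;

        Entry(String body, long expiresAtMillis) {
            this.body = body;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    // One map for the whole JVM, so every request thread sees the same entries.
    private final ConcurrentHashMap<String, Entry> entries = new ConcurrentHashMap<>();
    private final long ttlMillis;
    private final LongSupplier clock; // injectable so expiry can be tested

    ShortLivedCache(long ttlMillis, LongSupplier clock) {
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    /** Returns the cached body for the URL, calling the loader only if the entry is missing or expired. */
    String get(String url, Function<String, String> loader) {
        long now = clock.getAsLong();
        // compute() is atomic per key, so concurrent refreshes of one URL load it once.
        Entry entry = entries.compute(url, (key, old) ->
                (old != null && old.expiresAtMillis > now)
                        ? old
                        : new Entry(loader.apply(key), now + ttlMillis));
        return entry.body;
    }
}
```

In a real filter the loader would run the rest of the filter chain and capture the response; something like `new ShortLivedCache(5000, System::currentTimeMillis)` would then shield the back end from a user hammering refresh for five seconds at a time.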

The advantages I see to using such a cache are fivefold:
  1. You get the reduced complexity of taking a layer out of your architecture. You no longer need to worry about how many connections & threads both Apache and the application server need - there's just the one to get right.
  2. You get to escape from Apache's rather arcane configuration (or is it just me who winces in distaste whenever delving into httpd.conf?)
  3. You can easily get a distributed cache. Astonishingly Apache doesn't seem to have an off-the-shelf distributed implementation of mod_cache, which has been an issue for us - we have requests for pretty rapidly changing data which we really need to cache for short periods of time to protect ourselves from a user perpetually refreshing.
  4. The cache will be shared across all threads in the application server, regardless of how you choose to store it. Did you know that mod_mem_cache is a per process cache if you run Apache on a process-per-request basis (as RedHat does out of the box)? So loading up the cache for a resource has to be done on every single process - and what's worse, by default those processes are configured to die when idle, or after a set period of time even if not idle, to avoid a memory leak. So that resource you thought was being cached eternally is actually getting requested rather a lot. Apache recommend using mod_disk_cache instead, where the cache is shared across all processes, and relying on the operating system to cache it in memory to speed things up, but my measurements saw a drastic drop off in performance when we tested this. I may of course simply have got something wrong.
  5. A personal one - as a Java dev I'm a lot happier digging into Ehcache's codebase to work out what it's actually up to than I would be in Apache's.
It would be quite reasonable to read that list and snort that perhaps I need to improve my Apache skills - and perhaps that is indeed the case. However there's a level at which that's my point; if I need to read up and gain painful experience in order to know how to bend Apache to my will and really know what it's up to, isn't it reasonable to look for an alternative that I will understand better and faster?

So what are the negatives? Well, I've seen it suggested that Apache, being native code, is faster at shoving binary data to a socket than a JVM is. On the same basis I guess that if you are using a disk based store for your cache Apache may be more efficient at reading data from the disk. I've no proof for either of those theories, though, and it may equally be that modern JVMs are pretty competitive at these things - you'd think they'd be the sort of thing Sun had been optimising like crazy all these years. More reading and/or testing needed.

On, I presume, the same basis, Apache has a reputation for serving static content much faster than an application server (this doesn't really apply to my current project as we use JAWR for much needed efficiency savings, and that means the application server has to serve our static content up the first time; hopefully thereafter it's being served from a cache anyway). I seem to remember being told, however, that modern servlet containers are actually quite competitive on this score now (a good wikipedian would add a nasty little [citation needed] or [where?] to that sentence!).

There may also, I guess, be security issues with running a servlet container or application server on port 80; Apache has had the benefit of being the market leader and hence the target of so many hackers that you'd hope most of the holes have been stopped up. Though there may be an element of security through obscurity in stepping away from Apache.

I'd certainly be interested in experimenting with dropping Apache, running the servlet container on port 80 and using Ehcache or similar to do caching & gzipping down at the Java layer. Perhaps if I do I will find out why no-one else does!

Web Application Architecture - why so many nodes?

I've been thinking recently about the standard architectures we have for Java based web applications. Obviously these do vary, but in my experience they have some pretty common patterns:
  1. Two identical legs (staging and production) that switch on each release
  2. A couple of load balancers on each leg (one for fail over)
  3. Multiple nodes behind these (more than 4 on a relatively high volume website)
  4. Each node having a web server running Apache httpd 2.x, responsible for HTTP caching and possibly serving up static content when it isn't already cached
  5. Each node also having an application server running a servlet container or J2EE application server
  6. A database, typically with at least a couple of nodes for failover reasons
There are other issues, of course, such as how you manage sessions, but they aren't what concerns me at the moment. I'm interested in questioning (from a position admittedly of some ignorance...) some aspects of this pattern.

The first is the rationale for having more than two nodes. On my current project we have four, about to become six. However, we are also deploying into a virtual environment - VMWare based, where you add resources like memory and CPUs via a console. In those circumstances I'm not entirely clear why the response to higher load should be more nodes. Why not simply increase the resources available to each virtual machine? Particularly in a 64bit, multiprocessor age, surely the operating system and JVM can scale up to use the extra resources as they are presented?

I'm partly standing here to be corrected - I'd love to be educated as to why this assumption of mine is naive and wrong! Do sockets and open files still represent a bottleneck in an up-to-date Linux server? Is there just a point of diminishing returns where throwing more CPUs and more memory at a single operating system & JVM combination doesn't scale anything like linearly?

I ask simply because otherwise having more than two nodes (obviously I understand you need two for failover) seems to add a lot of complexity if it's not actually giving benefits to match. The more servers there are the more maintenance there is, the longer deployments take, the more you are likely to play hunt-the-log-message from server to server to follow a user's behaviour... you know the deal.

Now in typing that I have to acknowledge another clearly good reason to have more than two nodes - lose one out of two and the traffic going to the other just leapt by 100% at a time when you may well be experiencing pretty high traffic if one of your nodes has gone down. One out of four goes down and the others only have to cope with a 33% jump in traffic. But I'd still be interested to know if that's the sole reason for it, or if there's a scalability issue I need to read up on. Any book suggestions on the topic?
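
The arithmetic behind those percentages is easy to check. Here is a small sketch, assuming traffic is spread evenly across nodes and that the total stays the same when a node fails; the class and method names are just for illustration.

```java
// Hypothetical helper; assumes evenly balanced traffic and a constant total load.
class FailoverLoad {

    /** Percentage increase in traffic on each surviving node when one of {@code nodes} fails. */
    static double percentIncreaseAfterOneFailure(int nodes) {
        if (nodes < 2) {
            throw new IllegalArgumentException("need at least two nodes");
        }
        // Each node carries 1/nodes of the traffic before and 1/(nodes - 1) after,
        // so the relative increase is nodes/(nodes - 1) - 1 = 1/(nodes - 1).
        return 100.0 / (nodes - 1);
    }

    public static void main(String[] args) {
        for (int nodes : new int[] {2, 4, 6}) {
            System.out.printf("%d nodes: +%.0f%%%n", nodes, percentIncreaseAfterOneFailure(nodes));
        }
        // prints:
        // 2 nodes: +100%
        // 4 nodes: +33%
        // 6 nodes: +20%
    }
}
```

So going from four nodes to six would cut the jump after a single failure from about a third to a fifth.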

System Out over SLF4J Code Complete

I've just completed a new SLF4J legacy bridging module - sysout-over-slf4j. I've donated it to SLF4J, so I'm hopeful it will soon be available as a binary there and hence in the Maven repositories. However, in the meantime anyone interested can pull it down from my GitHub fork; it's a simple Maven project, a quick mvn package in the root of sysout-over-slf4j should do the trick. If you're interested you could add comments here or on the bugzilla enhancement request.

The basic idea is to transform real old school System.out.println("log message") and exception.printStackTrace() into modern logging statements that can be managed by log4j or logback or any other logging system that implements SLF4J or has a conversion layer from it, using the now standard convention of logging to a logger whose name is the same as the fully qualified name of the class doing the logging. Here's the documentation:

System.out and err over SLF4J

The sysout-over-slf4j module allows a user to redirect all calls to System.out and System.err to an SLF4J defined logger with the name of the fully qualified class in which the System.out.println (or similar) call was made, at configurable levels.

What are the intended use cases?

The sysout-over-slf4j module is for cases where your legacy codebase, or a third party module you use, prints directly to the console and you would like to get the benefits of a proper logging framework, with automatic capture of information like timestamp and the ability to filter which messages you are interested in seeing and control where they are sent.

The sysout-over-slf4j module is explicitly not intended to encourage the use of System.out or System.err for logging purposes. There is a significant performance overhead attached to its use, and as such it should be considered a stop-gap for your own code until you can alter it to use SLF4J directly, or a work-around for poorly behaving third party modules.

What needs to be done to make it work?

sysout-over-slf4j.jar should be included on the classpath at the same level as slf4j-api.jar and your chosen slf4j implementation. A static method call sendSystemOutAndErrToSLF4J() should be made on the org.slf4j.sysoutslf4j.context.SysOutOverSLF4J class early in the life of the application to start redirecting calls to SLF4J, or the included org.slf4j.sysoutslf4j.context.SysOutOverSLF4JServletContextListener may be configured in a servlet application.

How does it work?

The System.out and System.err PrintStreams are replaced with new SLF4JPrintStreams. Each time a call to System.out.println (or similar) is made, the current thread's stacktrace is examined to determine which class made the call. An SLF4J Logger named after that class's fully qualified name is retrieved and the message logged at the configured level on that logger (by default info for System.out calls and error for System.err calls).
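
The mechanism described above can be sketched with nothing but the JDK. This is an illustration of the technique, not the module's actual SLF4JPrintStream: the class names are invented, only println(String) is intercepted, and a plain callback taking (logger name, message) stands in for the SLF4J Logger lookup.

```java
import java.io.PrintStream;
import java.util.function.BiConsumer;

// Hypothetical sketch; the real module's classes and behaviour will differ.
class SysOutSketch {

    /** A PrintStream that reports which class called println instead of printing. */
    static class CallerAwarePrintStream extends PrintStream {
        private final BiConsumer<String, String> sink; // (loggerName, message)

        CallerAwarePrintStream(PrintStream original, BiConsumer<String, String> sink) {
            super(original); // anything not intercepted still reaches the original stream
            this.sink = sink;
        }

        @Override
        public void println(String message) {
            sink.accept(callingClassName(), message);
        }

        /** Walks the current thread's stack to find the first frame outside this class. */
        private static String callingClassName() {
            for (StackTraceElement frame : Thread.currentThread().getStackTrace()) {
                String name = frame.getClassName();
                if (!name.equals(Thread.class.getName())
                        && !name.equals(CallerAwarePrintStream.class.getName())) {
                    return name;
                }
            }
            return "unknown";
        }
    }

    public static void main(String[] args) {
        PrintStream original = System.out;
        System.setOut(new CallerAwarePrintStream(original,
                (logger, message) -> original.println("[" + logger + "] " + message)));
        System.out.println("log message"); // prints "[SysOutSketch] log message"
        System.setOut(original);
    }
}
```

Generating and walking the stack trace on every call is where the cost of this approach comes from.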

Calls to Throwable.printStackTrace() are likewise logged at the configured level for each System output. By default there will be a message logged for every line of the stack trace; this is an unfortunate side effect of not being able to reliably retrieve the original exception that is being printed.

A servlet container may contain multiple web applications. If it has child first class loading and these applications package SLF4J in the web-app/lib directory then there will be multiple SLF4J instances running in the JVM. However, there is only one System.out and one System.err for the whole JVM. In order to ensure that the correct SLF4J instance is used for the correct web application, inside the new PrintStreams SLF4J instances are mapped against the context class loader to ensure that the same SLF4J instance used in "normal" logging is also used when calling System.out.println.
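
The lookup by context class loader might be sketched along these lines. Again this is an assumption-laden illustration rather than the module's code: the registry class, its factory callback and the forget method are hypothetical, and a generic type parameter stands in for whatever per-application SLF4J object is being tracked.

```java
import java.util.Collections;
import java.util.Map;
import java.util.WeakHashMap;
import java.util.function.Function;

// Hypothetical sketch of keeping one instance per web application class loader.
class PerContextRegistry<T> {

    // Weak keys mean an entry does not by itself pin a discarded class loader,
    // although a value that references its loader would still do so.
    private final Map<ClassLoader, T> byLoader =
            Collections.synchronizedMap(new WeakHashMap<>());
    private final Function<ClassLoader, T> factory;

    PerContextRegistry(Function<ClassLoader, T> factory) {
        this.factory = factory;
    }

    /** Returns the instance belonging to the calling thread's context class loader. */
    T current() {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        return byLoader.computeIfAbsent(loader, factory);
    }

    /** Drops the instance for a context that is about to be discarded or reloaded. */
    void forget(ClassLoader loader) {
        byLoader.remove(loader);
    }
}
```

Because the single JVM-wide PrintStream asks the registry on every call, two web applications printing to the same System.out end up at their own logging instances.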

In order to prevent classloader leaks when contexts are reloaded, the new PrintStreams are created by a special classloader so that they do not themselves maintain a reference to the context classloader. However, since the PrintStreams do maintain a reference to the context's SLF4J instance, the user must also call the stopSendingSystemOutAndErrToSLF4J method on the SysOutOverSLF4J class in a context before discarding or reloading it, to avoid a classloader leak. This happens automatically if using the provided SysOutOverSLF4JServletContextListener.

Don't most logging systems print to the console? Won't that mean infinite recursion?

Fortunately, Log4J, JULI & Logback all do so through the write methods on PrintStream, which are rarely used for direct logging. Consequently these methods on the new PrintStreams proxy directly to the old System.out/err PrintStreams, allowing these logging frameworks to work as before.

Other SLF4J implementations may not fit this useful pattern. They can be registered with sysout-over-slf4j via static methods on the SysOutOverSLF4J class to permit them to access the console.

What about the overhead?

The overhead for Log4J, JULI and Logback when printing to the console should be minimal, for the reasons outlined above. The overhead for any SLF4J implementation that needed to be registered will be greater; on every attempt by it to print to the console its fully qualified classname has to be matched against registered package names in order to determine whether it should be permitted direct access.

Finally, the overhead of actual System.out and System.err calls will be much greater, due to the expense of generating the thread's stacktrace and examining it to determine the origin of the call. As emphasised above, it would be much better if all logging were done via SLF4J directly and this module were not necessary.