Friday, 16 December 2011

REST and HATEOAS - Problems with Clients

At Level 3 of the REST maturity model, the client is meant to bind to two things - the entry URI for the system and the media type(s) of the entities sent and received.

The client is then meant to retrieve any other URIs from links embedded in the entities - links it can recognize by semantically meaningful rel attributes. In theory this means that the providers of the service can arbitrarily change the URI patterns at which its resources reside, because the client can instantly and programmatically react to the change.

I have a couple of concerns with this theory, which I'm going to explore in this blog post.

  1. Programming such a client is more arduous

    Imagine that as a client there's some specific resource you want to retrieve - perhaps a representation of a customer. If you are obeying the principles of HATEOAS you go to the root URI, and retrieve the following resource:

    {
      ...
      "link": {
        "rel":"customer",
        "href-pattern":"/customers/${customerName}"
      }
      ...
    }
    
    You then find the link with a rel of customer, take its href-pattern attribute, and expand that template to build the URI of the customer you want. You also need to make sure you have configured a sensible HTTP cache for the root resource, otherwise the extra request to fetch it becomes an expensive overhead on every request for a customer. (There's a sketch of this flow in Java after this list.)

    Or you behave badly, disobey HATEOAS and hard-code the URI pattern:

    /customers/${customerName}

  2. URIs cannot be hidden from the client

    Many programming languages provide a means to enforce which elements are public, and hence suitable for clients to bind to, and which are not - normally with visibility modifiers of some form. There are frequently ways for clients to get round these restrictions, but at a bare minimum they make it very clear to clients that they are not meant to be binding to those elements, and that upgrades are likely to break them if they do.

    There is no way to do this with a URI. If a badly behaved client decides to bind to some resource-specific URI deep in your API, there's nothing to stop them, and indeed nothing beyond documentation and a knowledge of HATEOAS (which in my experience is still pretty limited in the general population of software developers) to suggest to them that they are doing the wrong thing.

    In addition, unlike a programming library, which is upgraded at the client's whim so the client can test thoroughly that the upgrade has broken nothing, a remote service may change without the client's knowledge; in this sense binding to a private API in a library is much safer than binding to a private API in a remote service.

    In practice this means that if the publishers of the service do decide to change the URI patterns they must do so in the knowledge that they may be breaking badly behaved clients; clients who may well not consider themselves to be behaving badly.

    There are circumstances where this may be acceptable. Those offering a free and desirable service to the general public may well take the attitude that if a client breaks because the client has not obeyed the HATEOAS contract, that's the client's problem. This is a nice situation to be in, but there are other circumstances in which it is more difficult to be so ruthless.

    If, as is common, the service is an internal or external business-to-business service, then in my experience it's the people who changed something who are held responsible when things stop working. Asserting that it is the client's fault for being insufficiently robust rarely cuts it with the top brass; the onus is generally on those wishing to make a change to ensure it will not break their clients, rather than on clients to be robust.

    This is made particularly difficult by the fact that there is no way on the server side to tell whether a client is well behaved or not - both hit the same URIs - so only inspecting the client's code can tell you in advance whether a change will break it. Alternatively, if the service is offered to the general public but requires payment, the client is a paying customer, which makes their unhappiness a significantly bigger deal.
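
To make the cost of the obedient approach concrete, here's a minimal sketch of a HATEOAS client in Java. It assumes Java 11's HttpClient and the Jackson library, uses a hypothetical entry URI, and takes the simplified single-link JSON above at face value (a real root resource would more likely carry an array of links):

    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;

    import com.fasterxml.jackson.databind.JsonNode;
    import com.fasterxml.jackson.databind.ObjectMapper;

    public class HateoasCustomerClient {

        // Hypothetical entry URI - the only URI the client is supposed to know
        private static final URI ROOT = URI.create("https://api.example.com/");

        static URI customerUri(String customerName) throws Exception {
            // 1. Fetch the root resource (in real life, via a sensible HTTP cache)
            HttpResponse<String> response = HttpClient.newHttpClient().send(
                    HttpRequest.newBuilder(ROOT).header("Accept", "application/json").build(),
                    HttpResponse.BodyHandlers.ofString());

            // 2. Find the link with a rel of "customer"
            JsonNode link = new ObjectMapper().readTree(response.body()).path("link");
            if (!"customer".equals(link.path("rel").asText())) {
                throw new IllegalStateException("no customer link in root resource");
            }

            // 3. Expand the href-pattern template to build the customer URI
            String pattern = link.path("href-pattern").asText();
            return ROOT.resolve(pattern.replace("${customerName}", customerName));
        }
    }

The badly behaved alternative collapses all of that to ROOT.resolve("/customers/" + customerName) against the hard-coded pattern - which is exactly why it is so tempting.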

That combination - it being more arduous for your customers to write a HATEOAS client, and there being nothing to discourage them from binding to specific URI patterns - seems to me fairly toxic. In practice it seems to me that a real-world service is quite likely to have to treat its URIs as part of its public API.

Monday, 7 November 2011

Apache HTTPD - settings for use as a pure proxy

When using Apache HTTPD as a pure proxy to an application server, it may be useful to set AllowEncodedSlashes to "On" and set nocanon on the ProxyPass. This has the effect that the URIs are passed to the application server "as is" without Apache doing any security check on them or otherwise attempting to correct them. Naturally this puts the onus on the origin server to be secure.

It may also be useful to set retry=0. By default, after a failure to get a response from the origin server, HTTPD caches the fact that the origin server is unavailable for a minute. This is a pain when automating deployments. Setting retry=0 makes it genuinely proxy every call down to the origin server regardless.

    AllowEncodedSlashes On
    ProxyPreserveHost On
    ProxyPass / http://localhost:8080/ retry=0 nocanon
    ProxyPassReverse / http://localhost:8080/

Along the same lines, when using Tomcat to serve RESTful requests it may be useful to allow encoded slashes. This is turned off by default because if you have a servlet that serves up files it may allow an attacker to retrieve arbitrary files from your server using ../../ type paths. If you are mapping all URLs to servlet(s) that do not do this then you can re-enable them using the following command line arguments:

    -Dorg.apache.tomcat.util.buf.UDecoder.ALLOW_ENCODED_SLASH=true
    -Dorg.apache.catalina.connector.CoyoteAdapter.ALLOW_BACKSLASH=true

or by adding the following to $CATALINA_HOME/conf/catalina.properties:

    org.apache.tomcat.util.buf.UDecoder.ALLOW_ENCODED_SLASH=true
    org.apache.catalina.connector.CoyoteAdapter.ALLOW_BACKSLASH=true

Thursday, 21 April 2011

Setting the Default Java File Encoding to UTF-8 on a Mac

I recently had a lot of pain with tests containing non-ASCII characters; both Fitnesse and the GMaven* plugin for Maven were failing to read UTF-8 files correctly because they were picking up MacRoman, the default character set for Java on the Mac.

You can start Java up with the argument -Dfile.encoding=UTF-8, but that's a pain at best, and very difficult when using tools that start up JVMs themselves.

However, this answer on stackoverflow gave me a way to set it for all Java everywhere on the Mac:

    export JAVA_TOOL_OPTIONS=-Dfile.encoding=UTF-8
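
To check which encoding a JVM has actually picked up, a trivial class like this (my own check, not part of the fix) will do:

    import java.nio.charset.Charset;

    public class EncodingCheck {
        public static void main(String[] args) {
            // Both should print UTF-8 once JAVA_TOOL_OPTIONS is set
            System.out.println(System.getProperty("file.encoding"));
            System.out.println(Charset.defaultCharset());
        }
    }

Helpfully, the JVM also echoes "Picked up JAVA_TOOL_OPTIONS: ..." to stderr on startup, so you can see at a glance that the variable is being read.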

If you want this to be persistent, then add this property as explained in this other answer on stackoverflow.

* There's a bug in GMaven 1.3 - it doesn't obey the project.build.sourceEncoding property in Maven, for test files at any rate

Tuesday, 1 February 2011

Release of sysout-over-slf4j 1.0.2

I'm pleased to announce the release of sysout-over-slf4j version 1.0.2, a bridge between System.out/err and SLF4J.

Release 1.0.2 fixes a concurrency issue that occurred when multiple threads wrote to a System print stream at the same time; it maintains the contract of the existing System.out and System.err print streams by synchronizing on them when printing. It also fixes a problem where the lines of a single stack trace could be interleaved with other messages printed on the same print stream.
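
For illustration, the locking involved looks something like this - a sketch of the general technique, not sysout-over-slf4j's actual source:

    import java.io.PrintStream;

    // Forwards printed messages while holding the same lock the JDK's own
    // PrintStream methods use (the stream instance itself), so multi-line
    // output such as a stack trace cannot interleave with other threads.
    public class SynchronizedForwarder {

        private final PrintStream original; // the real System.out or System.err

        public SynchronizedForwarder(PrintStream original) {
            this.original = original;
        }

        public void println(String message) {
            synchronized (original) {
                // a real bridge would hand the message to SLF4J at this point
                original.println(message);
            }
        }
    }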

The project page is here:
http://projects.lidalia.org.uk/sysout-over-slf4j/

The artifacts can be downloaded here:
http://github.com/Mahoney/sysout-over-slf4j/downloads

Or via Maven:

    <dependency>
        <groupId>uk.org.lidalia</groupId>
        <artifactId>sysout-over-slf4j</artifactId>
        <version>1.0.2</version>
    </dependency>