If-Modified-Since and If-None-Match woes
I’m confused about the correct behaviour of If-Modified-Since
and
If-None-Match
in certain scenarios.
The scenarios I have in mind are:
-
The client claims to have a later representation of a resource than an intermediate cache, but the intermediate cache’s representation is still fresh.
Imagine a cache has a fresh representation of
/resource
as so:HTTP/1.1 200 OK ETag: some_etag Last-Modified: Fri, 26 Jul 2013 14:00:00 GMT
It receives a request as so:
GET /resource HTTP/1.1 If-None-Match: other_etag If-Modified-Since: Fri, 26 Jul 2013 14:00:01 GMT
The nature of ETags is that they have no notion of sequence - it’s either the same or it isn’t, so the cache cannot know whether the ETags differ because one represents a later or earlier version of the resource. However, from my reading of the draft update to the HTTP/1.1 spec HTTP Bis Section 5 the
If-None-Match
header takes precedence overIf-Modified-Since
, and so the correct behaviour here is for the cache to return a200
with its old but fresh representation of the resource. That means that any client needs to be aware that it may quite legitimately receive an earlier version of an entity than the one it already has when making a conditional request to a cache. Which… disturbs me; my whole reason for making a conditional request is to find out if there’s a fresher version, I don’t want an older one!I suppose the theory is that the local client cache and the intermediate cache are obeying the same max-age / expires headers so by the time the client is needing to send conditional requests because its representation is stale the representation in the upstream cache should also be stale. However, this is not guaranteed - a request with a Cache-Control header forcing revalidation that goes via a different intermediate cache (due to load balancing) could easily legitimately leave a downstream client with a later representation than an upstream cache it later consults.
-
The client claims to have a later representation of a resource than an origin server.
Imagine the entity is stored in a database.
Origin node 1 has temporarily cached an old version of the entity with an
ETag
ofsome_etag
and aLast-Modified
ofFri, 26 Jul 2013 14:00:00 GMT
.Origin node 2 meanwhile returns a later version of the entity to the client.
The client then makes a conditional request which happens to get mapped to node 1:
GET /resource HTTP/1.1 If-None-Match: other_etag If-Modified-Since: Fri, 26 Jul 2013 14:00:01 GMT
What should node 1 return? It could:
- Treat this as a trigger to ensure that it updates its own cache.
- Assume that since
If-Modified-Since
is after its notion ofLast-Modified
,304 Not Modified
is safe to return - Assume the client is talking nonsense and simply return a
200
with its older version of the entity
Again, HTTP Bis Section 5 suggests that option 3 is the most “correct”, though it feels very wrong to me. I favour option 1 followed by 3…
(One answer might be “don’t use ETags if you care about this possibility”. But the spec is quite clear that origin servers SHOULD return an ETag if they can…)