In an ideal world, what we see is current. In the distributed software world, what we see may be stale, and we can't tell. Wouldn't it be nice to specify a cache invalidation API so that the source of a change can notify everyone that it changed? That is what an OpenSocial 1.0 draft aims to do.
Containers MUST support the invalidation endpoint even if they do not perform any caching and MUST provide an entry for it in their XRDS. To invalidate content a developer’s backend notifies the container of the content it wishes to invalidate by making a 2-legged OAuth call to the Cache service with one or many keys to be invalidated. The consumer key in the 2-legged OAuth call is used by the container to identify the calling application.
This text is followed by an example.
POST /api/rest/cache/invalidate HTTP/1.1
Host: api.example.org
Authorization: hh5s93j4hdidpola
Content-Type: application/json

{
  "invalidationKeys" : [
    "http://www.myapp.com/gadgetspec.xml",
    "http://www.myapp.com/messagebundle.xml"
  ]
}
Excellent.
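For illustration, here is a minimal Python sketch of how a backend might assemble the request shown in the example above. It only builds the request and does not send it. The `http://` scheme, the helper name `build_invalidation_request`, and passing the Authorization value through unchanged are assumptions; a real client would generate a signed 2-legged OAuth header and look up the endpoint in the container's XRDS.

```python
import json
import urllib.request

# Host and path come from the example above; the http:// scheme is an assumption.
INVALIDATE_URL = "http://api.example.org/api/rest/cache/invalidate"


def build_invalidation_request(keys, authorization):
    """Build (but do not send) the POST that asks the container to invalidate keys."""
    body = json.dumps({"invalidationKeys": list(keys)}).encode("utf-8")
    return urllib.request.Request(
        INVALIDATE_URL,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Placeholder value: a real client would sign the request with
            # 2-legged OAuth instead of passing a fixed string.
            "Authorization": authorization,
        },
    )


req = build_invalidation_request(
    [
        "http://www.myapp.com/gadgetspec.xml",
        "http://www.myapp.com/messagebundle.xml",
    ],
    "hh5s93j4hdidpola",
)
```

Sending it would be a matter of passing `req` to `urllib.request.urlopen`, once the OAuth signing is in place.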

{ 9 comments }
Heh. This is about the tenth such proposal I’ve seen; e.g.,
http://tools.ietf.org/html/draft-danli-wrec-wcip-01
http://www.w3.org/TR/esi-invp
There has been a *lot* of research and proposals in this area. This sort of approach is simple and straightforward, but it requires the server to keep a lot of state, and doesn’t scale very well (at least not to Internet-scale, which is what we shoot for on the Web).
Generally what you find with invalidation is that you need to trade off immediacy vs. scalability vs. reliability; you can sometimes get two, but not all three.
For example, in this proposal, what happens when the server can’t get an invalidation to the client, due to a network partition, or the client’s IP changing?
The spec also ignores the case of forward and reverse proxy caches not in the control of the OpenSocial containers, and sets false expectations for callers. I opened a thread over the weekend explaining why this API is not workable in practice.
I implemented a similar pattern within my exyus framework a while back [1], allowing for both an immediate cache invalidation and a background cache invalidation option for clearing up local caches.
It worked well for maintaining local caches and was handy when I knew of remote target caches (reverse proxy caches). However, I never implemented a subscription model à la cache channels to make it possible for unknown third-party caches to receive the same information.
[1] http://exyus.googlecode.com/svn/trunk/Exyus.Samples/TaskList.cs
Mike reminds me of the other (and very obvious) way to do this: using HTTP PURGE.
Memories of writing Perl to do this from a database to 100+ caches around the world when I worked at Merrill Lynch back in the 90’s… when there was a Merrill Lynch :)
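As a rough sketch of the PURGE approach mentioned above: PURGE is not a standard HTTP method, but some caches (Squid and Varnish, for instance) can be configured to accept it. The snippet below only builds one unsent request per cache. The cache host names, the site host, and the helper name `build_purge_requests` are hypothetical.

```python
import urllib.request


def build_purge_requests(site_host, path, cache_hosts):
    """Return one unsent PURGE request per cache for the given site and path."""
    return [
        urllib.request.Request(
            f"http://{cache}{path}",
            # The request goes to the cache, but names the site whose entry to drop.
            headers={"Host": site_host},
            method="PURGE",
        )
        for cache in cache_hosts
    ]


# Hypothetical caches: the sender has to know every one of them in advance.
purges = build_purge_requests(
    "www.myapp.com",
    "/gadgetspec.xml",
    ["cache1.example.org", "cache2.example.org"],
)
```

Note that the list of caches is an input here, which is exactly the part that gets hard once caches outside your control are involved.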
Oh dear oh dear.
Cache invalidation is thinking of things the wrong way round. The HTTP cache header says “this may become this stale and the sky will not fall in”. If the sky will fall in, don’t add that header, and use etags and serve up 304s *quickly*.
Think of the cache expiry as a contract: you cannot revoke it once it has been signed. And invalidation really does not scale.
Oh, and if you do want to invalidate, just PURGE the URLs you want to purge as mentioned by Mark, rather than a POST to somewhere random like this API which is then going to have to find the right stuff on the proxies to purge.
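To make the etags-and-304s suggestion above concrete, here is a small sketch of the server-side decision for a conditional GET. It is a simplification: it compares a single `If-None-Match` value and ignores lists and `*`, and deriving the ETag from a hash of the body is just one possible choice. The function names are made up for the example.

```python
import hashlib
from typing import Optional


def make_etag(body: bytes) -> str:
    """Derive a strong ETag from the body (one possible scheme, assumed here)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(body: bytes, if_none_match: Optional[str]):
    """Return (status, headers, payload) for a GET with an optional If-None-Match."""
    etag = make_etag(body)
    if if_none_match == etag:
        # The client's copy is still good: answer with no body at all.
        return 304, {"ETag": etag}, b""
    return 200, {"ETag": etag}, body


# First fetch gets the full body; revalidation of the same content gets a 304.
status, headers, payload = respond(b"<Module>v1</Module>", None)
revalidated = respond(b"<Module>v1</Module>", headers["ETag"])
changed = respond(b"<Module>v2</Module>", headers["ETag"])
```

A server that wants its 304s to be quick would store the ETag alongside the content rather than hashing the body on every request as this sketch does.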
“invalidation really does not scale.”
Hi Justin – Please could you explain the reasoning here?
Not sure what Justin had in mind when he made that comment, but consider, for instance, maintaining the list of keys that are stale and the respective caches that those keys may be stored in.
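A toy sketch of the bookkeeping described above, purely to show where the state comes from. The class and method names are invented for the example, and it assumes the origin somehow learns about every fetch, which is itself a big assumption.

```python
from collections import defaultdict


class InvalidationRegistry:
    """Tracks which caches may hold which keys, so they can be told to drop them."""

    def __init__(self):
        self._holders = defaultdict(set)  # key -> ids of caches that fetched it

    def record_fetch(self, cache_id, key):
        self._holders[key].add(cache_id)

    def invalidate(self, key):
        """Return the caches that must be notified, and forget the key."""
        return sorted(self._holders.pop(key, set()))

    def state_size(self):
        """Number of (key, cache) pairs held; grows with keys times caches."""
        return sum(len(caches) for caches in self._holders.values())


registry = InvalidationRegistry()
registry.record_fetch("cache-a", "http://www.myapp.com/gadgetspec.xml")
registry.record_fetch("cache-b", "http://www.myapp.com/gadgetspec.xml")
registry.record_fetch("cache-a", "http://www.myapp.com/messagebundle.xml")
size_before = registry.state_size()
to_notify = registry.invalidate("http://www.myapp.com/gadgetspec.xml")
```

Every notification in `to_notify` also has to be delivered reliably, which this sketch does not attempt.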
I agree that there are challenges to implementation, but there are a couple of things that bother me about the sentiment that ‘it doesn’t scale’:
- ‘Peering’ of caches is really a separate issue to the cache invalidation mechanism itself, at least from an HTTP perspective.
- To say a problem is challenging doesn’t necessarily mean that it is infeasible and/or will not scale. I am interested to know if it has been proved that this is the case.
Justin – spot on.
Cache channels is often described as invalidation, but it’s really just a (potentially) perpetual extension of freshness — roughly, “if you’re still in touch with me, and I haven’t said that it’s stale, you can consider it fresh a bit longer.” That makes it compatible with the existing HTTP caching model, unlike invalidation, as you point out.
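The paraphrase above can be written down as a freshness check. This is a sketch of the idea as stated in the comment, not of the cache channels draft itself: all parameter names are hypothetical, times are in seconds, and it assumes a stale notice only cancels the extension, never the ordinary max-age contract.

```python
def is_fresh(age, max_age, channel_max_age, seconds_since_contact,
             contact_timeout, marked_stale):
    """Decide whether a cached response may still be served without revalidation."""
    if age <= max_age:
        # Ordinary HTTP freshness: the original contract still holds.
        return True
    # Past max-age, freshness is only extended while the cache is in touch
    # with the channel and the channel has not said the entry is stale.
    in_touch = seconds_since_contact <= contact_timeout
    return in_touch and not marked_stale and age <= channel_max_age


# Two minutes old, max-age of one minute, but the channel was polled 10 s ago.
extended = is_fresh(age=120, max_age=60, channel_max_age=3600,
                    seconds_since_contact=10, contact_timeout=30,
                    marked_stale=False)
# Same entry, but the cache lost touch with the channel.
lost_touch = is_fresh(age=120, max_age=60, channel_max_age=3600,
                      seconds_since_contact=100, contact_timeout=30,
                      marked_stale=False)
```

If the cache loses touch, it simply falls back to normal expiry and revalidation, so nothing is ever served fresher than the server agreed to.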