12:12 PM, Monday, December 10, 2007

Vary Header for RESTful Applications

As I was looking at some design issues around some REST APIs, I had a chance to investigate the HTTP/1.1 Vary header. As it turns out, there are interesting implications to setting this header, not setting it, or setting it incorrectly. Here are my notes on the rationale behind the Vary header, along with some design guidelines.

In HTTP, a resource can have multiple representations. The most common example is an HTML page localized in multiple languages, with the web server responding with a given localized version for a given HTTP request. When a resource has multiple representations, either the client (the User-Agent) or the server needs to decide which representation to serve. HTTP/1.1 discusses two approaches (naturally) for this. One is called agent-driven negotiation, where the user-agent asks for a specific representation. The other is called server-driven negotiation, where the server decides which representation to serve. In the case of server-driven negotiation, the server could use specific request headers or some other logic to determine the best representation to serve the client.

Some examples of server-driven negotiation include:

  • Serving a page in a given language, e.g., based on the Accept-Language request header
  • Serving a page in a given media type, e.g., based on the Accept request header
  • Or, serving a page based on the user's location (e.g. derived from the IP address)
  • Or, some other criteria such as user's personalization preferences

In all these cases, the same request URI could result in a different representation based on request headers, or some other logic built into the server. This poses an interesting problem for the user-agent as well as intermediate proxies. Let's say the user-agent supports caching. It cannot simply use the request URI as a cache key, because the same URI could return a different representation from the server, so the URI alone is not unique enough to serve as a key into the cache. The same is true for any caching proxy sitting between the user-agent and the server.

The Vary header in HTTP/1.1 helps address this problem. The server can use this response header to tell the client which request headers it uses to resolve a given URI to a representation. If the server does not use request headers, or uses some other criteria for this resolution, it can indicate that fact as well. For the examples above, here are the possible Vary header values.

  • Vary: Accept-Language
  • Vary: Accept
  • Vary: *
  • Vary: *

In the first two examples, the server is indicating that it varies representations based on particular request headers. A user-agent or a cache can therefore use this knowledge in computing cache keys.

In the latter two examples, the server is indicating that it varies representations for the given URI, but using criteria known only to itself. In these cases, the user-agent or an intermediate caching proxy cannot assume that a given cached copy of a representation is still valid without first contacting the server.
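As a rough illustration of how a cache might use this information, here is a minimal sketch in Python. The function name cache_key and its parameters are made up for this example, and a real HTTP cache does considerably more (header value normalization, freshness checks, and so on).

```python
from typing import Mapping, Optional, Tuple


def cache_key(uri: str,
              request_headers: Mapping[str, str],
              vary: Optional[str]) -> Optional[Tuple]:
    """Build a cache key from the URI plus the request headers named in Vary.

    Returns None when the response carried "Vary: *", meaning the cached copy
    cannot be reused without first contacting the server.
    (Hypothetical helper, for illustration only.)
    """
    if not vary:
        # No Vary header: the URI alone identifies the representation.
        return (uri,)

    names = [n.strip().lower() for n in vary.split(",") if n.strip()]
    if "*" in names:
        # The server varies on criteria known only to itself.
        return None

    # Header names are case-insensitive, so compare them in lower case.
    lowered = {k.lower(): v.strip() for k, v in request_headers.items()}
    return (uri,) + tuple((n, lowered.get(n, "")) for n in sorted(names))


xml_key = cache_key("/foo/bar", {"Accept": "application/xml"}, "Accept")
json_key = cache_key("/foo/bar", {"Accept": "application/json"}, "Accept")
print(xml_key != json_key)                 # the two requests get separate entries
print(cache_key("/foo/bar", {}, "*"))      # None: must go back to the server
```

With "Vary: Accept", the XML and JSON requests for the same URI end up under different keys, while "Vary: *" yields no usable key at all.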

So, for any server using some kind of server-driven negotiation, setting an appropriate Vary header is important. If not, the user-agent or caching proxies will not be able to correctly identify cached representations, or worse yet, could

In the context of RESTful applications, there are some scenarios that interest me.

Response format

The first one is selecting a response format for a given request. Most REST APIs these days support XML and JSON representations. Some REST servers require clients to send a request parameter indicating the format it chooses to receive, as in

GET /foo/bar?format=json

While this is convenient for testing in browsers or command-line tools like wget or curl, a better way is to rely on server-driven negotiation, with the client sending an Accept header, as in

GET /foo/bar
Accept: application/xml

However, if a server decides to follow this approach, it should also set a Vary header accordingly:

Vary: Accept

This gives caching proxies a chance to cache the responses correctly.
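On the server side, this could look something like the sketch below. It is framework-neutral Python; the respond function, the RENDERERS table and the "resource" root element are all invented for illustration, and the Accept parsing is deliberately simplified (it ignores q-values and wildcards and just picks the first supported type listed).

```python
import json
import xml.etree.ElementTree as ET
from typing import Dict, Tuple


def to_json(data: Dict[str, str]) -> str:
    return json.dumps(data)


def to_xml(data: Dict[str, str]) -> str:
    # "resource" is an arbitrary root element name chosen for this sketch.
    root = ET.Element("resource")
    for key, value in data.items():
        ET.SubElement(root, key).text = str(value)
    return ET.tostring(root, encoding="unicode")


# Hypothetical table of the representations this server can produce.
RENDERERS = {"application/json": to_json, "application/xml": to_xml}
DEFAULT_TYPE = "application/json"


def respond(data: Dict[str, str], accept_header: str = "") -> Tuple[Dict[str, str], str]:
    """Pick a representation from the Accept header and always set Vary."""
    media_type = DEFAULT_TYPE
    for part in accept_header.split(","):
        candidate = part.split(";")[0].strip().lower()  # drop any ;q=... parameter
        if candidate in RENDERERS:
            media_type = candidate
            break
    # Vary is sent on every response for this URI, not only the non-default one.
    headers = {"Content-Type": media_type, "Vary": "Accept"}
    return headers, RENDERERS[media_type](data)


headers, body = respond({"name": "bar"}, "application/xml")
print(headers)
print(body)
```

Note that the Vary header goes out regardless of which format was chosen, since a cache needs to know about the variation even when it happens to receive the default representation.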

Response compression

Another scenario is to support compression of responses, which is particularly useful for transporting large chunks of data. Let's say a client supports gzip-based compression, and would like to take advantage of it if the server supports it. To start with, the client sends an Accept-Encoding header, as in

GET /foo/bar
Accept-Encoding: gzip

If the server supports gzip compression (which would be another representation of the same resource), it should indicate so, by including a Vary header in the response as in

Vary: Accept-Encoding

and if the server also decides to apply gzip compression to the current response, it should also supply the Content-Encoding header, as in

Vary: Accept-Encoding
Content-Encoding: gzip
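Here is a small Python sketch of that server-side decision. The maybe_compress function is a made-up name, and the Accept-Encoding parsing is simplified: it only checks whether gzip is listed and ignores q-values such as "gzip;q=0".

```python
import gzip
from typing import Dict, Tuple


def maybe_compress(body: bytes, accept_encoding: str = "") -> Tuple[Dict[str, str], bytes]:
    """Gzip the body if the client listed gzip; always advertise the variation."""
    # Vary is set whether or not this particular response is compressed.
    headers = {"Vary": "Accept-Encoding"}

    codings = [c.split(";")[0].strip().lower() for c in accept_encoding.split(",")]
    if "gzip" in codings:
        body = gzip.compress(body)
        headers["Content-Encoding"] = "gzip"

    headers["Content-Length"] = str(len(body))
    return headers, body


payload = b"<bar>" + b"x" * 1000 + b"</bar>"
compressed_headers, compressed_body = maybe_compress(payload, "gzip, deflate")
plain_headers, plain_body = maybe_compress(payload)
print(compressed_headers)
print(plain_headers)
```

Both responses carry "Vary: Accept-Encoding", but only the first one carries "Content-Encoding: gzip".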

There are other potential use cases; see, for example, this thread on the ietf-http-wg mailing list.

What happens if your server does vary responses but does not indicate so via the Vary header? Most likely it will result in cache corruption. Let's say you have two pieces of JavaScript, one that requires an XML form of the response and another that requires a JSON form. With browser caching in place, whether both scripts get the correct form of the response depends on which response was fetched first and cached.
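To make the failure concrete, here is a toy simulation in Python. The origin function and the NaiveCache class are invented for this example; they stand in for a server that negotiates on Accept without sending Vary, and for a cache that therefore keys on the URI alone.

```python
from typing import Callable, Dict


def origin(uri: str, headers: Dict[str, str]) -> str:
    """Toy server: negotiates on Accept but sends no Vary header."""
    if "xml" in headers.get("Accept", ""):
        return "<bar/>"
    return "{}"


class NaiveCache:
    """Toy cache that, seeing no Vary header, keys entries on the URI only."""

    def __init__(self, fetch: Callable[[str, Dict[str, str]], str]) -> None:
        self.fetch = fetch
        self.store: Dict[str, str] = {}

    def get(self, uri: str, headers: Dict[str, str]) -> str:
        if uri not in self.store:
            self.store[uri] = self.fetch(uri, headers)
        return self.store[uri]


cache = NaiveCache(origin)
first = cache.get("/foo/bar", {"Accept": "application/xml"})
second = cache.get("/foo/bar", {"Accept": "application/json"})
print(first)   # the XML script gets XML
print(second)  # the JSON script also gets the cached XML
```

The second script asked for JSON but receives whatever was cached first, which is exactly the order-dependent behavior described above.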

But the Vary header is not without its issues. The most well-known is that most versions of Internet Explorer don't deal with this header correctly. See Internet Explorer and cacheing: beware of the Vary. A few months ago, Mark Nottingham reported some issues around this header with various caching proxy servers. Something to watch out for.
