10:51 AM, Friday, April 25, 2008

Content Negotiation is not Broken

Content negotiation in HTTP 1.1 is not broken. It is insufficient and incomplete, but as defined in RFC 2616 it works for what it was intended to do. What is broken includes HTML, user agents, and buggy servers. Web services developers should not use the claim that content negotiation is broken as an excuse to avoid supporting it altogether or to support it poorly. I don't know whether this was Twitter's rationale, but the Twitter APIs do not understand content negotiation.

For example, take the following GET request (edited for brevity).

> GET /statuses/friends.json HTTP/1.1
> Host: twitter.com
> Accept: application/xml
> 
< HTTP/1.1 200 OK
< Date: Fri, 25 Apr 2008 17:06:39 GMT
< Status: 200 OK
< Content-Type: application/json; charset=utf-8
< Content-Length: 2
< Connection: close
< 
* Closing connection #0
[...]		

The response I get back for this URI is always of type application/json, no matter what Accept header I send. What determines the response content type is the string that follows the dot in the last path segment of the request URI. That is, Twitter's representations are not negotiable in the HTTP 1.1 sense. The client needs to rely on Twitter's documentation to know that .xml at the end of the URI means the client is asking for application/xml and not, say, application/atom+xml or application/json. This style of URI-based negotiation will break automated client applications that understand HTTP 1.1. Another problem with this approach is that it forces the API to invent a new URI extension for each new media type, such as .text, .svg, and so on.
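To make the mismatch concrete, here is a minimal Python sketch of the check an automated client might run on a response. The helper names (build_request, media_type_of, honors_accept) are my own, the http:// scheme is an assumption, and the request is built but never sent. It only compares exact media types, so it ignores wildcards and q values.

```python
from urllib.request import Request


def build_request(url, wanted):
    """Build (but do not send) a GET request whose Accept header lists `wanted`."""
    return Request(url, headers={"Accept": ", ".join(wanted)})


def media_type_of(content_type):
    """Strip parameters such as charset from a Content-Type header value."""
    return content_type.split(";", 1)[0].strip().lower()


def honors_accept(wanted, content_type):
    """True if the response media type is one the client explicitly asked for."""
    return media_type_of(content_type) in {w.lower() for w in wanted}


wanted = ["application/xml"]
# Host and path are taken from the trace above; the scheme is assumed.
request = build_request("http://twitter.com/statuses/friends.json", wanted)
print(request.get_header("Accept"))                                # application/xml
print(honors_accept(wanted, "application/json; charset=utf-8"))    # False
```

A client that negotiates this way has no choice but to treat the Twitter response as a failure, even though the server reported 200 OK.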

To be clear, content negotiation in HTTP 1.1 is not complete. For instance, there is no way for a server to indicate which media types it can use to represent a given resource. That is, relying purely on HTTP 1.1, a client cannot discover the available media types. HTML and Atom fix this via the link tag.
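As a rough illustration of link-based discovery, the following Python sketch collects the alternate representations advertised by link tags in an HTML page. The class and function names and the sample page are hypothetical; only the standard library's html.parser is used.

```python
from html.parser import HTMLParser


class AlternateLinkFinder(HTMLParser):
    """Collect (media type, href) pairs from <link rel="alternate"> tags."""

    def __init__(self):
        super().__init__()
        self.alternates = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        # rel may hold several space-separated values, or be missing.
        rel_values = (attributes.get("rel") or "").lower().split()
        if "alternate" in rel_values and attributes.get("href"):
            self.alternates.append((attributes.get("type"), attributes["href"]))


def find_alternates(html):
    finder = AlternateLinkFinder()
    finder.feed(html)
    return finder.alternates


# Hypothetical page used only for this example.
page = """<html><head>
<link rel="alternate" type="application/atom+xml" href="/foo.atom">
<link rel="stylesheet" href="/style.css">
</head><body></body></html>"""

print(find_alternates(page))  # [('application/atom+xml', '/foo.atom')]
```

Note that the client learns about the other media types from the markup, not from HTTP itself.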

Most of the frustration with content negotiation is rooted in HTML. While you can indicate that alternative representations are available via the link tag for machine consumption (e.g. a blog reader discovering an Atom feed representation of my blog), you cannot encode this information into anchor (a) tags. That is, the following is not possible.

	<a href="URI to a resource" rel="alternate" types="application/atom+xml,application/json,text/html"/>

This forces multiple representations to be explicitly encoded into HTML each with a distinct URI.

	<a href="/foo.html">HTML</a>
	<a href="/foo.atom">Atom feed</a>
	<a href="/foo.json">JSON</a>

See On Linking Alternative Representations To Enable Discovery And Publishing for more relevant discussion. As Dare Obasanjo once argued, "content negotiation has failed to take hold on the web". But that does not justify the Twitter example above. APIs like that of Twitter are mainly for machine consumption, and should implement content negotiation. A post by Frank Sommers at Artima tries to justify separate URIs for each media type so as to keep the server-side code simple, but I would argue that simplicity in server-side code can still be achieved. It just takes a little more design and discipline.
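Here is one way that design might look, as a minimal Python sketch and not a definitive implementation. A single handler owns the resource, and a table of renderers keyed by media type decides the representation from the Accept header. All names here (RENDERERS, parse_accept, quality, respond) are hypothetical, the XML renderer is deliberately naive, and media type parameters other than q are ignored.

```python
import json
from xml.sax.saxutils import escape


def render_json(resource):
    return json.dumps(resource)


def render_xml(resource):
    # Naive flat rendering, enough for the illustration only.
    items = "".join("<{0}>{1}</{0}>".format(k, escape(str(v))) for k, v in resource.items())
    return "<resource>" + items + "</resource>"


# Insertion order doubles as the server's preference when q values tie.
RENDERERS = {"application/json": render_json, "application/xml": render_xml}


def parse_accept(header):
    """Split an Accept header into (media_range, q) pairs."""
    ranges = []
    for item in header.split(","):
        parts = [part.strip() for part in item.split(";")]
        media_range = parts[0].lower()
        if "/" not in media_range:
            continue
        q = 1.0
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        ranges.append((media_range, q))
    return ranges


def quality(ranges, media_type):
    """Return the q value of the most specific range matching media_type."""
    m_type, m_sub = media_type.split("/", 1)
    best_specificity, best_q = -1, 0.0
    for media_range, q in ranges:
        r_type, r_sub = media_range.split("/", 1)
        if r_type == m_type and r_sub == m_sub:
            specificity = 2
        elif r_type == m_type and r_sub == "*":
            specificity = 1
        elif r_type == "*" and r_sub == "*":
            specificity = 0
        else:
            continue
        if specificity > best_specificity:
            best_specificity, best_q = specificity, q
    return best_q


def respond(resource, accept_header):
    """Return (status, headers, body) for the best acceptable representation."""
    ranges = parse_accept(accept_header or "*/*")  # no Accept header: anything goes
    best_type, best_q = None, 0.0
    for media_type in RENDERERS:
        q = quality(ranges, media_type)
        if q > best_q:
            best_type, best_q = media_type, q
    if best_type is None:
        return 406, {"Vary": "Accept"}, ""
    return 200, {"Content-Type": best_type, "Vary": "Accept"}, RENDERERS[best_type](resource)


print(respond({"name": "foo"}, "application/xml"))
```

Supporting a new media type then means adding one entry to the table, and URIs such as /foo.json can still be served by passing the mapped media type to the same function.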
