Feeds and HTTP
I was curious to see how well various blog readers handle HTTP content-negotiation, so over the weekend I ran some experiments. Content-negotiation in HTTP 1.1 is a very useful feature that lets the web server tailor the response based on certain HTTP request headers. One popular example is serving localized content based on the Accept-Language header. I set out to try a similar style of negotiation based on what kind of content the client is capable of processing. In my case the clients are blog readers, and the results were disappointing.
My idea was to publish just one feed URL, and rely on content-negotiation to determine whether to return an Atom feed, an RSS feed, or just an HTML page. The HTTP request header “Accept” fits this purpose well. My algorithm was simple: if the client prefers “application/atom+xml”, return an Atom feed; if the client prefers “application/rss+xml”, return an RSS 2.0 feed; otherwise, return an HTML index page.
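The selection rule above can be sketched in a few lines of Python. This is only an illustration of the idea, not what Apache does internally (Apache also weighs the qs values from the type-map). The function names and the tie-break in favour of Atom are my own assumptions; the file names are the ones used in this post.

```python
def parse_accept(header):
    """Parse an Accept header into a list of (media_type, q) pairs."""
    entries = []
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0].lower()
        q = 1.0  # a media type without an explicit q counts as q=1
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q as "not acceptable"
        entries.append((media_type, q))
    return entries


def choose_variant(accept_header):
    """Pick atom.xml, index.xml, or index.html for a given Accept header.

    Assumption: a tie between Atom and RSS goes to Atom, and anything
    that names neither feed type (including */* or no header) gets HTML.
    """
    prefs = dict(parse_accept(accept_header or ""))
    atom_q = prefs.get("application/atom+xml", 0.0)
    rss_q = prefs.get("application/rss+xml", 0.0)
    if atom_q > 0 and atom_q >= rss_q:
        return "atom.xml"
    if rss_q > 0:
        return "index.xml"
    return "index.html"
```

A client that sends only */*, or no Accept header at all, falls through to index.html under this rule.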
My first task was to prepare Apache for content-negotiation, which turned out to be a very simple exercise. I added the line AddHandler type-map .var to my .htaccess file, and added a type-map file “feed.var” with the following contents:
    URI: index.html

    URI: atom.xml
    Content-type: application/atom+xml; qs=0.8

    URI: index.xml
    Content-type: application/rss+xml; qs=0.8

    URI: index.html
    Content-type: */*; qs=0.2
Some experiments with wget using different Accept headers confirmed that content-negotiation was working properly. The disappointing part started with various popular blog readers. I tried Google Reader, Google Home Page, Bloglines and My Yahoo. I pointed each of these readers at “feed.var”, and none of them was able to negotiate the content.
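The same kind of check can be scripted. Here is a rough Python equivalent of the wget experiment, using only the standard library. The URL is a placeholder, not my real feed address, and the helper names are made up for this sketch.

```python
import urllib.request

# Hypothetical address; substitute the real location of feed.var.
FEED_URL = "http://example.com/feed.var"


def build_request(accept, url=FEED_URL):
    """Build a GET request that carries the given Accept header."""
    return urllib.request.Request(url, headers={"Accept": accept})


def fetch_content_type(accept, url=FEED_URL):
    """Send the request and return the Content-Type the server chose."""
    with urllib.request.urlopen(build_request(accept, url)) as response:
        return response.headers.get("Content-Type")
```

Calling fetch_content_type with “application/atom+xml”, “application/rss+xml” and “text/html” in turn should show a different Content-Type each time if negotiation is working.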
After some debugging, I realized that none of these readers sends a feed-specific Accept header.
Of these readers, Yahoo sends an Accept: */* header, but it could not parse the default content (index.html above). After mapping Accept: */* to the Atom feed, I was able to get Yahoo to recognize the feed, but doing so defeats the purpose of content-negotiation. Sending */* is very misleading, and should be used only as a last resort.
The other readers did not send any Accept header. In fact, Google Home Page says that the feed URL is not found, implying a 404, instead of reporting that it could not understand the response.
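For anyone who wants to repeat this kind of debugging, one option is a throwaway server that records the Accept header of every request it receives. The sketch below is not the setup I used (I looked at this on the Apache side); the handler, the in-memory list and the make_server helper are all invented for illustration.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical in-memory log of (User-Agent, Accept) pairs.
seen_accept_headers = []


class AcceptLoggingHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # headers.get returns None when the client sent no Accept header.
        accept = self.headers.get("Accept")
        seen_accept_headers.append((self.headers.get("User-Agent"), accept))
        body = b"logged\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep stderr quiet; the list above is the log


def make_server(port=0):
    """Create the server on localhost; port 0 lets the OS pick a free port."""
    return HTTPServer(("127.0.0.1", port), AcceptLoggingHandler)
```

Run make_server(8000).serve_forever() on a machine the reader can reach, point the reader at it, and then inspect seen_accept_headers. A None in the second position means no Accept header was sent at all.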
It is disappointing to find that these popular feed readers/aggregators can’t deal with the basics of HTTP.


