
Atom as a General Purpose Format

A format is just what it is: a format. Dare Obasanjo recently commented that "the Atom syndication format has been as successful or perhaps even more successful than originally intended because [its] original scenarios are still fairly relevant on today’s Web". True. The format is relevant for what it was designed for. However, there has been a slow trend toward using Atom as a general purpose payload format for RESTful applications. One argument for doing so is that standardizing on a single format makes all services consistent and easy to use. This approach, IMO, is suboptimal, and its benefits are, in some cases, more academic than practical.

A case in point is Google’s Code Search Data API, which I looked at recently. This API uses the Atom Syndication Format. A GET request to a URI such as http://www.google.com/codesearch/feeds/search?q=atom returns an Atom feed document. Each entry in this feed is a code search result, which looks like the following.

<entry>
    <id>http://www.google.com/codesearch/p?hl=en#zhw7s4tDL7M/IssueDealerWeblog/atom.py</id>
    <updated>2009-04-29T19:34:24Z</updated>
    <author>
      <name>Code owned by external author.</name>
    </author>
    <title type="text">IssueDealerWeblog/atom.py</title>
    <link rel="alternate" type="text/html"
          href="http://www.google.com/codesearch/p?hl=en#zhw7s4tDL7M/IssueDealerWeblog/atom.py&amp;q=atom"/>
    <gcs:package
        name="http://freshmeat.net/redir/issuedealer/38032/url_tgz/IssueDealer-0.9.120.tar.gz"
        uri="http://freshmeat.net/redir/issuedealer/38032/url_tgz/IssueDealer-0.9.120.tar.gz"></gcs:package>
    <gcs:file name="IssueDealerWeblog/atom.py"></gcs:file>
    <content type="text/html">&lt;pre&gt; 61: class &lt;b&gt;atom&lt;/b&gt;(
      OFS.Folder.Folder,
      &lt;/pre&gt;</content>
    <gcs:match lineNumber="43" type="text/html">&lt;pre&gt;class &lt;b&gt;atom&lt;/b&gt;_entry:
      &lt;/pre&gt;</gcs:match>
    <gcs:match lineNumber="61" type="text/html">&lt;pre&gt;class &lt;b&gt;atom&lt;/b&gt;(
      &lt;/pre&gt;</gcs:match>
    <gcs:match lineNumber="119" type="text/html">&lt;pre&gt; arguments_ = REQUEST['PATH_INFO'][REQUEST['PATH_INFO'].rfind('/&lt;b&gt;atom&lt;/b&gt;')+6:].split('/')
      &lt;/pre&gt;</gcs:match>
    <gcs:match lineNumber="148" type="text/html">&lt;pre&gt; """&lt;?xml version="1.0"
      encoding="UTF-8"?&gt;&lt;feed xmlns="http://purl.org/&lt;b&gt;atom&lt;/b&gt;/ns#"&gt;"""
      + 
      &lt;/pre&gt;</gcs:match>
    <gcs:match lineNumber="149" type="text/html">&lt;pre&gt; """&lt;link rel="service.post"
      href="%s" type="application/x.&lt;b&gt;atom&lt;/b&gt;+xml" title="%s"
      /&gt;""" % (
      &lt;/pre&gt;</gcs:match>
    <gcs:match lineNumber="150" type="text/html">&lt;pre&gt; self.getParentNode().get_weblog_url() +
      '/&lt;b&gt;atom&lt;/b&gt;', escape(self.getParentNode().get_title())) + 
      &lt;/pre&gt;</gcs:match>
    <gcs:match lineNumber="151" type="text/html">&lt;pre&gt; """&lt;link rel="service.feed"
      href="%s" type="application/x.&lt;b&gt;atom&lt;/b&gt;+xml" title="%s"
      /&gt;""" % (
      &lt;/pre&gt;</gcs:match>
    <gcs:match lineNumber="152" type="text/html">&lt;pre&gt; self.getParentNode().get_weblog_url() +
      '/&lt;b&gt;atom&lt;/b&gt;.xml', escape(self.getParentNode().get_title()))
      &lt;/pre&gt;</gcs:match>
    <gcs:match lineNumber="153" type="text/html">&lt;pre&gt; self.REQUEST.RESPONSE.setHeader('content-type',
      'application/x.&lt;b&gt;atom&lt;/b&gt;+xml')
      &lt;/pre&gt;</gcs:match>
    <rights>GPL</rights>
  </entry>

In this entry, almost everything that the client cares about is an extension, i.e. an element in the "http://schemas.google.com/codesearch/2006" namespace. A general purpose feed reader or feed API won’t be able to make much sense of this.
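To make the split concrete, here is a minimal Python sketch that sorts the children of an abbreviated copy of the entry into Atom elements and extension elements. The xmlns declarations on the entry are an assumption on my part (the excerpt omits them; they would normally sit on the feed root), and split_by_namespace is a hypothetical helper name, not part of any Google or Atom API.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
GCS_NS = "http://schemas.google.com/codesearch/2006"

# Abbreviated copy of the entry above. The xmlns declarations are an
# assumption: the excerpt omits them, so they are added here to make
# the fragment parse on its own.
ENTRY_XML = f"""
<entry xmlns="{ATOM_NS}" xmlns:gcs="{GCS_NS}">
  <id>http://www.google.com/codesearch/p?hl=en#zhw7s4tDL7M/IssueDealerWeblog/atom.py</id>
  <updated>2009-04-29T19:34:24Z</updated>
  <author><name>Code owned by external author.</name></author>
  <title type="text">IssueDealerWeblog/atom.py</title>
  <gcs:file name="IssueDealerWeblog/atom.py"></gcs:file>
  <gcs:match lineNumber="43" type="text/html">&lt;pre&gt;class &lt;b&gt;atom&lt;/b&gt;_entry:&lt;/pre&gt;</gcs:match>
  <gcs:match lineNumber="61" type="text/html">&lt;pre&gt;class &lt;b&gt;atom&lt;/b&gt;(&lt;/pre&gt;</gcs:match>
  <rights>GPL</rights>
</entry>
"""


def split_by_namespace(entry_xml):
    """Group the entry's direct children into Atom and extension elements."""
    entry = ET.fromstring(entry_xml)
    groups = {"atom": [], "extension": []}
    for child in entry:
        # ElementTree spells qualified names as "{namespace}localname".
        if child.tag.startswith("{"):
            namespace, _, local_name = child.tag[1:].partition("}")
        else:
            namespace, local_name = "", child.tag
        key = "atom" if namespace == ATOM_NS else "extension"
        groups[key].append(local_name)
    return groups


groups = split_by_namespace(ENTRY_XML)
print("atom:", groups["atom"])
print("extension:", groups["extension"])
```

Everything in the extension group is invisible to a client that only knows the Atom namespace, and in the full entry that group holds the package, the file and every per-line match.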

The fact that this representation has to rely so heavily on extensions makes me wonder about the benefits of using the Atom Syndication Format for it. Of course, it is entirely possible to argue that any general purpose feed reader or Atom client can interpret the Atom-specific elements in this entry, and that "value-added" clients can interpret the extensions to provide more features. That is true. But the argument is not strong enough: what a generic client can interpret here is little more than a title, a link, a placeholder author and one snippet, while the per-line matches remain opaque to it.
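As a rough illustration of what a general purpose client gets out of such an entry, the Python sketch below reads only elements in the Atom namespace. The xmlns declarations are again my assumption (the excerpt omits them), the entry is abbreviated, and generic_reader_view is a hypothetical name for what a simple feed reader might do, not any real reader's API.

```python
import xml.etree.ElementTree as ET

# Atom namespace in ElementTree's "{namespace}" notation.
ATOM = "{http://www.w3.org/2005/Atom}"

# Abbreviated entry; the xmlns declarations are assumed, since the
# excerpt in this post does not show them.
ENTRY_XML = """
<entry xmlns="http://www.w3.org/2005/Atom"
       xmlns:gcs="http://schemas.google.com/codesearch/2006">
  <updated>2009-04-29T19:34:24Z</updated>
  <author><name>Code owned by external author.</name></author>
  <title type="text">IssueDealerWeblog/atom.py</title>
  <link rel="alternate" type="text/html"
        href="http://www.google.com/codesearch/p?hl=en#zhw7s4tDL7M/IssueDealerWeblog/atom.py&amp;q=atom"/>
  <gcs:match lineNumber="43" type="text/html">&lt;pre&gt;class &lt;b&gt;atom&lt;/b&gt;_entry:&lt;/pre&gt;</gcs:match>
</entry>
"""


def generic_reader_view(entry_xml):
    """Return only what a client unaware of the gcs extensions would read."""
    entry = ET.fromstring(entry_xml)
    link = entry.find(f"{ATOM}link[@rel='alternate']")
    return {
        "title": entry.findtext(f"{ATOM}title"),
        "updated": entry.findtext(f"{ATOM}updated"),
        "author": entry.findtext(f"{ATOM}author/{ATOM}name"),
        "link": link.get("href") if link is not None else None,
    }


view = generic_reader_view(ENTRY_XML)
for field, value in view.items():
    print(f"{field}: {value}")
```

The resulting view has a file name, a timestamp, a placeholder author and a link back to the HTML page. The line numbers and matched code, which are the actual search result, never show up.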

I am not dismissing Atom's built-in extensibility for arbitrary content (under atom:content), its emphasis on hyperlinks between representations and resources (via atom:link), or its abstraction of resource collections (as atom:feed). These are important characteristics for any representation format.

So, when would I consider the Atom Syndication Format? I would consider it when my resources are feeds or entries, or represent content. Even then, I would not blindly apply this format to all resources in my apps. I would reserve it only for those resources that share these characteristics. For others, I would use whatever format makes sense, including HTML, binary formats, CSV or even plain text.

Comment

  1. you’re right that feeds full of extensions usually are a good reason to look a bit closer. but comparing Atom with “HTML, binary formats, CSV or even plain text” kind of misses the point, i think. Atom to me is more like a meta-format that implements a portal into your collection. this may not be required in all cases, and even if it is, Atom may not always be the best choice for it. but it does provide a pretty handy model for the collection/member pattern.

    comparing Atom for the web services web with HTML for the document web is something i like to do. HTML is a really bad document format on many different levels. it is not great at anything. but it is great in allowing so many different things to be expressed in at least some way, and it is great because it allows misuse/specialization in a variety of ways. it is of course often painful to see how HTML is being (mis)used, but it is exactly this ability of HTML to be used in so many different ways that gave us the web and the tremendous value of so much content being crammed into this one container format. many bad things happen along the way, but many good things as well.

    personally, my prediction is that Atom will go a route similar to that of HTML: most people using it will not care so much about how it “should be used in a perfect world”, they will just use it and clients will have to deal with it. and if you’re good at dealing with it as a client, you will have access to a set of resources which otherwise simply would not be available.
