Resource Identity and Cool URIs – Take Two

In response to my post Resource Identity and Cool URIs, Stefan Tilkov wrote an interesting post with some counterpoints.

In this post, I would like to expand on the thoughts behind my suggestion to include identifiers in representations, and clear up a couple of non-issues.

The first non-issue is whether URIs are permanent or not, and whether my approach of using identifiers is meant to compensate for a potential lack of permanence of URIs. This is a non-issue because any well-behaved server would use proper redirect rules to make sure that clients can continue to reach a resource even after its URI has changed for any reason. If a server is not doing this, it is broken and needs fixing.

The second non-issue is whether there is a canonical resource or a canonical URI for every entity that is modeled as a resource. On the web, there are no canonical resources or canonical URIs. Every URI is a URI to a resource.

The key question is whether URI equivalence (or non-equivalence) implies resource equivalence (or non-equivalence) or not.

Given two URIs, it is possible for a client to infer that they both refer to the same resource if the URIs are equivalent (as per Section 6 of RFC 3986). But the converse is not true. That is, given two URIs that are not equivalent, a client cannot reliably say that they do not refer to the same resource.

RFC 3986 does recognize this question, and concludes that

URI comparison is not sufficient to determine whether two URIs identify different resources.
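
To make the asymmetry concrete, here is a minimal sketch of a syntax-based comparison in the spirit of Section 6 of RFC 3986. It is an illustration under stated assumptions, not a complete implementation: the function names `normalize` and `uris_equivalent` are made up for this example, and only a few normalizations are applied (scheme and host case, default ports, empty path). Percent-encoding and dot-segment normalization are left out.

```python
from urllib.parse import urlsplit, urlunsplit

# Default ports for the two schemes this sketch knows about (an assumption;
# a fuller version would cover more schemes).
DEFAULT_PORTS = {"http": 80, "https": 443}


def normalize(uri: str) -> str:
    """Apply a few syntax-based normalizations to a URI.

    Simplified on purpose: userinfo is dropped, and percent-encoding and
    dot segments are not normalized.
    """
    parts = urlsplit(uri)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    netloc = host
    # Keep the port only when it differs from the scheme's default.
    if parts.port is not None and parts.port != DEFAULT_PORTS.get(scheme):
        netloc = f"{host}:{parts.port}"
    # An empty path is equivalent to "/" when an authority is present.
    path = parts.path or "/"
    return urlunsplit((scheme, netloc, path, parts.query, parts.fragment))


def uris_equivalent(a: str, b: str) -> bool:
    """True means 'same resource'. False only means 'cannot tell'."""
    return normalize(a) == normalize(b)


# Equivalent after normalization, so both identify the same resource.
print(uris_equivalent("HTTP://WWW.Example.org:80/people/1234",
                      "http://www.example.org/people/1234"))

# Not equivalent, but this says nothing about whether the two URIs
# identify different resources.
print(uris_equivalent("http://m.example.org/people/1234",
                      "http://www.example.org/people/1234"))
```

A `True` result is a safe answer. A `False` result is not evidence of two different resources, which is exactly the gap an identifier in the representation is meant to fill.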

In the context of the "self" relation, the definition is vague enough to leave some flexibility for servers to use the same or different URIs for the same entity in different representations under different contexts.

But why would a server provide different URIs for the same entity?

In general, there may not be many reasons for the server to use different URIs for the same resource. But there may be special cases where the server decides to do so. For instance, the links I get for a resource on an iPhone may refer to specialized compact representations at a domain (e.g., m.example.org) that is different from the domain in the links that I get on a desktop (e.g., www.example.org). An app that is synchronizing data between these devices cannot rely on URI non-equivalence while comparing resources. Alternatively, as in the examples I used in my original post, the self links may be there to answer the question of where a client got a given representation from.

In conclusion, as described in RFC 3986, a client cannot rely on URI non-equivalence alone to say that two URIs do not refer to the same resource, unless it has full knowledge or control of the resources involved.

Where does this leave us? If I am a server developer, I would do the following.

  1. In the absence of any special requirements, make sure that the self-links for a given resource always use the same URI.
  2. To prevent clients from breaking if I have to break the above rule in the future, include identifiers in my representations.
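
As a rough sketch of the second point, a server could attach a URI-independent identifier to every representation it builds. Everything specific here is an assumption made for illustration: the function name `build_person_representation`, the `urn:example:person:` identifier scheme, the `/people/` path, and the flat `id`/`self` dictionary layout.

```python
def build_person_representation(person_id: str, name: str, base_uri: str) -> dict:
    """Build a representation carrying both an identifier and a self link.

    The identifier depends only on the entity, never on the host or path
    the representation happens to be served from.
    """
    return {
        # Hypothetical identifier scheme; any stable, unique string would do.
        "id": f"urn:example:person:{person_id}",
        "name": name,
        # The self link may vary by context (for example, a mobile domain).
        "self": f"{base_uri.rstrip('/')}/people/{person_id}",
    }


desktop = build_person_representation("1234", "Subbu", "http://www.example.org")
mobile = build_person_representation("1234", "Subbu", "http://m.example.org/")

# Different locations, same identifier.
print(desktop["self"], mobile["self"], desktop["id"] == mobile["id"])
```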

On the client side, I would do the following:

  1. If a given representation has both a self-link and an identifier, use the identifier to determine resource-equivalence, and the URI as the location.
  2. If that representation has only a self-link, use it both for resource-equivalence and the location.
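
The two client-side rules can be sketched as follows. This assumes representations have already been parsed into dictionaries with optional `id` and `self` keys; those key names and the functions `same_resource` and `location` are hypothetical. How to treat a pair where only one side carries an identifier is not covered by the rules above, so falling back to the self links in that case is my assumption.

```python
from typing import Optional


def same_resource(a: dict, b: dict) -> Optional[bool]:
    """Decide whether two parsed representations describe the same resource.

    Returns None when there is nothing to compare.
    """
    # Rule 1: when identifiers are available, they decide equivalence.
    if a.get("id") and b.get("id"):
        return a["id"] == b["id"]
    # Rule 2: otherwise fall back to the self links. A mismatch here is
    # weaker evidence, since different URIs may still name one resource.
    if a.get("self") and b.get("self"):
        return a["self"] == b["self"]
    return None


def location(rep: dict) -> Optional[str]:
    """The self link is always the place to fetch the representation from."""
    return rep.get("self")


desktop = {"id": "urn:example:person:1234",
           "self": "http://www.example.org/people/1234"}
mobile = {"id": "urn:example:person:1234",
          "self": "http://m.example.org/people/1234"}

print(same_resource(desktop, mobile))  # identifiers match despite different URIs
print(location(mobile))
```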

Comment

  1. The issue of identity is one that has plagued mankind for a LOT longer than our current fascination with it. In 2003, William Kent gave a keynote at the Extreme Markup Languages conference called The Unsolvable Identity Problem. For those of you who aren’t aware, William Kent is one of those scary smart folk you like to meet once or twice in your life. His database-jitsu is mighty.

    I first came across the issue in the context of Topic Maps, where it is very common to have multiple maps, each with topics that refer to the same subject. Topic Maps are designed to be merged together, so you need a way to say that two topics which come from different sources refer to the same thing. Topic Maps share many of the same issues, since a “topic” is to a “subject” what a “representation” is to a “resource”.

    In Topic Maps this is done with a Published Subject Indicator; PSIs are simply URIs. Any two topics with the same PSIs refer to the same thing. This has great value, for instance, if I maintain a map of topics about Italian Opera and you maintain a topic map of the CIA World Fact Book (believe it or not, these two exist and form the canonical examples for explaining topic maps). In my opera map I have a topic for Milan, which is the birthplace of Verdi, the setting for several of his plays, and the place where Verdi wrote some of them. The World Fact Book has information on the population and industry of Milan. If I combine these two maps, I now have a lot of information about Milan, which Susan could combine with her “tourist places in Europe” topic map to help her create an “Opera Lovers Tour of Italy” or something.

    I’ve digressed. The point is that Published Subject Indicators may or may not be network retrievable (viz. isbn:978-0140283334, which would be a URI for a specific edition of Lord of the Flies). Any network-retrievable resource is treated as a static, human-parsable resource; its only purpose is to uniquely identify something.

    In an article I wrote on RESTful frameworks, I argue that you should not use a REST API for your human-facing front end. This is exactly one of the reasons. If you are sending content only to other systems, then you don’t need to have multiple URIs for the same resource; all the logic is left up to the client.

    In fact, your examples, I think, are not very good examples for reusable APIs. This example is, I think, the clearest example of bad choices.


    GET /myapp/people?like=subbu
    Host: www.example.org

    200 OK
    Content-Type: ...

    Subbu
    Allamaraju

    Subbu
    Somebody

    My issue is with the self links for each of the people. How do you, as the application author, know that the client wants the mini view? That is very presumptuous, IMHO, unless you control both the client and your application. Since the reason we publish APIs is that we don’t want to control the clients, I would consider this to be a code smell. Further, the example is missing the even more important link that tells the client where to get a complete representation of the resource.

    I agree that you need some way to identify a resource independently of the URI by which you retrieved it. I do, however, think that it needs to be a URI, and that it should be a URL. If you request this URL, you should NOT get a representation of the resource but rather its metadata. Specifically, I would send URIs for all known representations, their media types and definitions (schemas), etc. Maybe even a simple human-readable description.

    I would also argue that a more “correct” version of the above representation might be:


    GET /myapp/people?like=subbu
    Host: www.example.org

    200 OK
    Content-Type: ...

    This is the problem with using “ideal” and easy examples: you miss all the details you discover from a pathological one.

    I’ve talked enough.

  2. Subbu:

    REST is an “instance-only” technology with world-wide scope; there is no notion of “classes”, they are only imaginary.

    In this context, you would think that someone could spend just a few minutes to define an “id” rel (different from self). Who could think that identity is not critical to the future of REST in the enterprise? I don’t know how many times I will have to repeat that Roy designed REST for the Web (with zero enterprise requirements), and who could believe that Roy’s REST would serendipitously match every enterprise requirement perfectly?

    Is it really that much more work to define a precise identity mechanism?

    JJ-

    • IMO, there is a bit of confusion about entities used in databases and other backend systems, and how they manifest as resources. The URI as an identity mechanism applies to resources. In cases where there is a strict and straightforward mapping between URIs and entities, URIs are sufficient for identity, and I think that is the case Stefan may be referring to. But I do think that there are a large number of cases where the mapping is not one-to-one, and hence those cases do require identifiers. Atom recognized this years ago. I bet this will become the norm as we gain more experience.