Hypermedia and REST

Of the four interface constraints underlying the REST architectural style, "hypermedia as the engine of application state" is the cause of some confusion. As Stu Charlton once noted, it is also the most often ignored constraint in practice. While this constraint does offer some benefits, I see practical reasons why it is ignored, particularly for non-user-facing applications, and there are benefits to ignoring it.

In his thesis, Roy Fielding describes four interface constraints for REST. These are (a) identification of resources through URIs, (b) manipulation of resources through their representations, (c) self-descriptive representations, and (d) hypermedia as the engine of application state.

What does this constraint mean? Referring back to Roy’s thesis,

Distributed hypermedia provides a uniform means of accessing services through the embedding of action controls within the presentation of information retrieved from remote sites.

The rationale behind this fourth constraint is that, by embedding action controls and URIs within each representation, the client remains independent of the server, and can therefore be universal. The representation that the client receives would include possible new state changes, the URIs and methods that could be used for those transitions, and any associated ordering semantics. As John Heintz recently noted, embedding action controls within each representation yields the following benefits.

  1. The client doesn’t need to have generated (and hard-coded) client stubs
  2. The server can change its rules and expand available states as needed
  3. The client does need to process a generically extensible content type

In essence, since the representation is self-contained, the fourth constraint takes away the need for a shared understanding of any ordering rules between interactions. The shared understanding is embedded within the representation, and the client does not have to know it a priori. All the benefits that John points out can be realized with a generic and universal client that understands a particular set of media types. For example, a web browser that understands HTML can work within the fourth constraint, since it is programmed to interpret the embedded action controls and the algorithms to construct URIs without any server-specific pre-programming. To quote Stu again,

All a consumer requires is a single URI to bootstrap the interaction process.

True. This is the reason why web browsers have become de facto application clients.
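To make the constraint concrete, here is a minimal sketch of what such a generic client does. The HTML representation below is hypothetical; the point is that the client knows only the media type (and a bootstrap URI), and discovers the available state transitions from the action controls embedded in the response itself, with no server-specific pre-programming.

```python
# A generic hypermedia client needs no hard-coded URIs or stubs: it
# extracts the action controls (links and forms) from the representation.
from html.parser import HTMLParser

# Hypothetical representation returned by some server.
REPRESENTATION = """
<html><body>
  <a rel="next" href="/orders?page=2">Next page</a>
  <form action="/orders" method="post">
    <input name="item"/>
  </form>
</body></html>
"""

class ControlExtractor(HTMLParser):
    """Collects the embedded action controls from an HTML page."""
    def __init__(self):
        super().__init__()
        self.controls = []

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "a" and "href" in a:
            self.controls.append(("GET", a["href"]))
        elif tag == "form":
            self.controls.append((a.get("method", "get").upper(), a["action"]))

parser = ControlExtractor()
parser.feed(REPRESENTATION)
# The client learned both possible transitions from the representation
# alone, exactly as the fourth constraint prescribes.
print(parser.controls)  # [('GET', '/orders?page=2'), ('POST', '/orders')]
```

The server remains free to change URIs or add new controls at any time; this client keeps working because it interprets the media type, not a fixed contract.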

Should this idea be extended to the rest of non-user-facing, resource-oriented applications? I don’t think so. Here is why.

The idea of hypermedia embedding all the action controls necessary to interact with the server works well for an arbitrary number of universal clients interacting with a given server. In this case, the server offering a set of resources specifies all the ordering/interaction rules within the representation. Most application clients, on the other hand, interact with more than one server, and the ordering constraints cannot be set by any single server. The clients know how to compose applications out of resources offered by various servers, and each client needs to be able to exercise control over that composition. To exercise such control, client applications cannot be universal, and the benefits that John lists above cannot be completely realized. And that is why, IMHO, the fourth constraint about hypermedia is often ignored.
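As a rough sketch of this point, consider a client that composes two hypothetical services (the names and functions below are illustrative stand-ins for HTTP calls, not real APIs). Neither server’s representations can express the rule that ties the two together; that ordering knowledge lives in the client.

```python
# Two independent, hypothetical services composed by one client.
def get_inventory(product):
    # stand-in for a GET against an inventory server's resource
    return {"widget": 3, "gadget": 0}[product]

def place_order(orders, product, quantity):
    # stand-in for a POST against a separate ordering server
    orders.append((product, quantity))

orders = []
# The rule "check inventory first, order only what is in stock" is an
# ordering constraint neither server can advertise in its hypermedia;
# it is application knowledge held by this (non-universal) client.
for product in ("widget", "gadget"):
    in_stock = get_inventory(product)
    if in_stock > 0:
        place_order(orders, product, min(in_stock, 2))

print(orders)  # [('widget', 2)]
```

Each server here can still be perfectly RESTful on its own; it is the cross-server composition rule that no single server’s representations can carry.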



  1. Subbu,

    I agree with everything you said until the last paragraph where I have some questions.

    There are several ideas in that paragraph:
    * “universal client”
    * single vs multiple servers
    * ordering constraints from a server

    I suggest we are talking about a universal, but opportunistic, client. If the client has built-in support for OpenSearch, MyShoppingMediaType, or something else then it is smarter wrt media types, but still conforming to RESTful constraints.

    As a set of example systems:
    1) My personal daily task server (allows create/close of tasks)
    2) Several public shopping sites (amazon,
    3) A client agent that talks application/taskserver and application/shopping

    I want my shopping agent to watch a few sites for when the price of automated espresso machines drops below $400. When that happens create a new task for me on my task manager to check out the site with a link.

    The client is responsible for one key decision: when to create a new “task”.

    It makes that decision after interpreting responses from the shopping servers (each with their own ordering/interaction paths for the searching).

    Are you making the point that because two servers are involved the ordering/interactions between the two is no longer RESTful? Or just that such a thing couldn’t be done with “only” a universal client?

    What I’m trying to get at is more knowledge of these two things:
    1) the form of my two example media types
    2) the coding function of my opportunistic client

    I think those are the RESTful (maintaining the benefits of REST) replacement for an API defined by some IDL.


  2. Hi John,

    Thanks for your comments.

    >> Are you making the point that because two servers are involved the ordering/interactions between the
    >> two is no longer RESTful? Or just that such a thing couldn’t be done with “only” a universal client?

    No – the two servers can still be RESTful. In your example, the two servers provide two distinct services – one to retrieve product data, and the other to manage tasks. The client agent knows the semantics of how to create a useful service out of these two. Neither server is providing the hypermedia about how to do this integration. That knowledge is built into your smart agent. That’s my point.