Idempotency Matters

by Subbu Allamaraju on January 16, 2012 · 12 comments

Here is an example of how idempotency can save end users a precious second or so.

Several weeks ago, one of the teams using ql.io reported that they were seeing random socket hang up errors in server logs.

error: { stack: [Getter/Setter],
 arguments: undefined,
 type: undefined,
 message: 'socket hang up',
 uri:
  { value: 'http://clientalerts.ebay.com/...'

The source of this log was the error handler of the http client:

request = http.request(...);
request.on('error', function(err) {
    // log the error
});

This error was so random that we could not find a reproducer on our dev or testing environments for a while. We then wrote a ql.io script that emulates the production use case, ran that script using siege to emulate the production usage, and captured the traffic using tcpdump. While we were able to capture a few instances of socket hang up errors with this setup, the tcpdump did not give definite answers as there was too much data to sift through. The client (siege) was making requests in parallel to a cluster of node instances with each instance maintaining its own socket pool. But there were two clues:

  • The origin was closing the connection after some usage by sending a FIN packet, but none of the accompanying HTTP responses included the Connection: close header. This is quite common as connection handling is often done at the reverse proxy tier which may not insert this header in the last response before closing the connection.
  • The number of new connections opened during the test was a few less than the number of connections closed. This indicates that some requests were being made on closed connections causing the socket hang up errors.

This led us to Issue 1135 – a timing-related issue in node.js where in-flight requests made by the client after the FIN packet was sent caused the socket hang up error. The resolution was to retry the request if possible.

HTTP GET, PUT and DELETE requests are retriable in case of such network errors, and the server implementations of these methods are supposed to support such retries.
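
To make the retry argument concrete, here is a minimal sketch of why replaying PUT or DELETE is harmless while replaying a POST-style create is not. The in-memory `store` and the `put`, `del`, and `post` functions are hypothetical names invented for this illustration; they are not part of ql.io or node.js.

```javascript
// Hypothetical in-memory resource store, used only for illustration.
const store = new Map();
let nextId = 1;

// PUT-style write: the client names the resource, so sending the same
// request twice leaves the store in the same state as sending it once.
function put(id, value) {
  store.set(id, value);
  return id;
}

// DELETE-style removal: deleting an already-absent resource changes nothing.
function del(id) {
  store.delete(id);
}

// POST-style create: the server picks a new id on every call, so a
// replayed request creates a duplicate. This is the non-idempotent case.
function post(value) {
  const id = 'item-' + nextId++;
  store.set(id, value);
  return id;
}
```

If a client cannot tell whether its first `put` or `del` reached the server, sending it again is safe. The same is not true of `post`, which is why a blind retry there can corrupt data.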

But here is the rub. There is a lot of bad HTTP server code deployed in the wild that might choke when a client retries in case of such network errors. Hence, node.js cannot retry automatically without some indication from the client application. In the end, we decided to retry the request when an idempotent HTTP request caused by a select statement encounters network failures such as the above. This helped us save an extra roundtrip from the client to ql.io.
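
The retry decision described above could be sketched roughly as follows. This is not the actual ql.io code; `requestWithRetry`, its parameters, and the `RETRIABLE_METHODS` set are assumptions made for illustration. The sketch retries on any network-level `error` event, and only for the methods the article names as retriable.

```javascript
const http = require('http');

// Methods this article treats as retriable (an assumption of this sketch).
const RETRIABLE_METHODS = new Set(['GET', 'PUT', 'DELETE']);

// Hypothetical helper: send a request and, if a network error such as
// 'socket hang up' occurs, resend it when the method is idempotent and
// retries remain. Non-idempotent requests fail on the first error.
function requestWithRetry(options, body, retriesLeft, callback) {
  const method = (options.method || 'GET').toUpperCase();
  let settled = false; // each attempt either retries or calls back, once

  function onError(err) {
    if (settled) return;
    settled = true;
    if (retriesLeft > 0 && RETRIABLE_METHODS.has(method)) {
      requestWithRetry(options, body, retriesLeft - 1, callback);
    } else {
      callback(err);
    }
  }

  const req = http.request(options, (res) => {
    const chunks = [];
    res.on('data', (chunk) => chunks.push(chunk));
    res.on('error', onError); // connection dropped mid-response
    res.on('end', () => {
      if (settled) return;
      settled = true;
      callback(null, res, Buffer.concat(chunks));
    });
  });
  req.on('error', onError); // e.g. request written to a closed socket
  if (body) req.write(body);
  req.end();
}
```

Because the retry happens inside the intermediary, the end user never sees the failed attempt, which is where the saved roundtrip comes from. A production version would likely also cap total time spent and log each retry.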

This is a reminder of why idempotency is a necessary promise to keep in HTTP server code.

Ruben Verborgh (@RubenVerborgh) January 16, 2012 at 3:01 am

@sallamar reminds us why idempotency is a desirable (I would say: necessary) promise to keep in HTTP servers – http://t.co/GMSH9XaF

Subbu Allamaraju January 16, 2012 at 9:18 am

+1 to “necessary”

Mike Amundsen (@mamund) January 16, 2012 at 5:40 am

RT @sallamar: An example of why idempotency matters – http://t.co/BHKoXEUU #yam

Assaf Arkin (@assaf) January 16, 2012 at 8:01 am

RT @sallamar: An example of why idempotency matters – http://t.co/BHKoXEUU #yam

rbergman (@rbergman) January 16, 2012 at 10:00 am

RT @sallamar: An example of why idempotency matters – http://t.co/BHKoXEUU #yam

Shashi Velur (@shashivelur) January 16, 2012 at 1:56 pm

RT @sallamar: An example of why idempotency matters – http://t.co/tLq7AazF #yam

Frank Denis (@jedisct1) January 16, 2012 at 2:13 pm

Idempotency Matters http://t.co/31AXG5Pv

Régis Gaidot (@rgaidot) January 17, 2012 at 2:45 pm

RT @sallamar An example of why idempotency matters – http://t.co/DoVt8L4L #yam

Arialdo Martini (@arialdomartini) January 18, 2012 at 10:52 pm

@sallamar In your article on idempotency http://t.co/4uCvkdZn there’s a little typo: radnom => random

Jorge Ferrer (@jorgeferrer) January 29, 2012 at 5:06 am

Nice article for developers designing web services: Idempotency Matters http://t.co/l4p7hoVP
