
HTTP Pipelining is a Hit and Miss

The idea behind pipelining in HTTP 1.1 is to allow clients to submit multiple idempotent HTTP requests over a single connection to a host, let the server process them in parallel, and have it respond in the order in which the requests were made.
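As a rough illustration (the host and paths here are hypothetical), this is what a pipelining client puts on the wire: both idempotent GETs are written back-to-back over one connection, and the server must answer them in request order.

```python
def build_pipelined_requests(host, paths):
    """Concatenate HTTP/1.1 GET requests so they can go out in one write.

    A non-pipelining client would wait for the first response before
    sending the second request; a pipelining client sends them all at once.
    """
    requests = []
    for path in paths:
        requests.append(
            f"GET {path} HTTP/1.1\r\n"
            f"Host: {host}\r\n"
            "Connection: keep-alive\r\n"
            "\r\n"
        )
    return "".join(requests).encode("ascii")

# Both requests leave in a single send() on one connection.
wire_bytes = build_pipelined_requests("example.com", ["/a.css", "/b.js"])
```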

In theory, pipelining has some benefits:

  • Increased parallelism on the server – this seems obvious, since the client is able to dispatch many requests over a connection.
  • Reduced time an HTTP request waits to be made in the browser – instead of queuing up requests behind one another, the browser can dispatch them as soon as it knows that a resource needs to be fetched.
  • Improved connection reuse – as a consequence of increased parallelism and reduced queuing time, it is arguable that pipelining would reduce the number of connections that the browser needs to open.

These may result in reduced latencies. However, data with consistent patterns proving the benefits of pipelining is hard to come by. The first set of charts below shows the total time requests were queued for each connection. The red circles show requests made without pipelining, and the green circles show requests made with pipelining enabled. The size of each circle is proportional to the number of resources requested over that connection. The results come from 100 runs made from a pool of private WebPagetest instances on Windows XP with Firefox over a DSL profile.

These charts show no pattern indicating that pipelining reduces queued times.

The second set of charts below shows the number of connections used by the browser for each run. The red lines show runs without pipelining, and the green lines show runs with pipelining.

Again, there is no consistent pattern in these charts to show that pipelining improves connection reuse.

The point here is that although the idea of pipelining is simple in theory, realizing tangible benefits from it may be hard.


17 Comments

  1. Subbu,

    What version of Firefox was used? Almost all of the really interesting stuff that Patrick has done isn’t yet in release, AIUI.

    Would be interesting to see this done with Chrome dev channel, which is also playing with pipelining…

    • Mark – these were done using Firefox 14.0.1.

      I tried Chrome, but AFAICT it enables pipelining adaptively after several runs, and I wanted to control when to enable and when to disable it.

  2. Correct me if I’m wrong.. but pipelining shouldn’t win us anything here over keep-alive. If the sites you tested are HTTP 1.1 friendly, then I wouldn’t expect pipelining to reduce the number of connections. Also, if we assume that most of the requests are for static resources, which are likely at most a disk seek away (if not in memory, or some file cache already), then the “win” of pipelining will be effectively bounded by the one way latency to the server (cost of firing request header on keep-alive). So, indeed, pipelining shouldn’t affect connection reuse over keep-alive..
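The bound described in this comment can be sketched with a back-of-envelope comparison (all numbers below are illustrative assumptions, not measurements):

```python
# Hypothetical timing comparison of plain keep-alive vs. pipelining for
# two requests on one connection. All values are assumed for illustration.
rtt = 0.050                 # round-trip time to the server, in seconds
t_r1, t_r2 = 0.005, 0.005   # server time to generate each response

# Keep-alive without pipelining: the second request waits for the first
# response, so each request pays a full round trip.
keep_alive = 2 * rtt + t_r1 + t_r2

# Pipelining: both requests go out back-to-back, and the responses stream
# back in order, so roughly one round trip is saved.
pipelined = rtt + t_r1 + t_r2

# The saving is bounded by one round trip. When responses are cheap
# static assets, that saving is small, matching the comment's point.
saving = keep_alive - pipelined
```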

      • I agree with Dan’s argument. As he pointed out: “let us assume that the time to generate the resources dwarfs the time to transfer the request and responses across the network. Let T(R1) and T(R2) represent the time required to generate R1 and R2, respectively.”

        That’s where pipelining shines. If your argument is that we can get faster response by simply opening two parallel connections.. then that may be true, in certain cases depending on what R1 and R2 are, the time it takes the server to generate the response, *if* we can actually open the extra connection (many sites are blocked on connection limits), and so on. It doesn’t surprise me that when you picked high-traffic sites, we don’t see any magical wins: they’re already optimized by having very low TTFB for resources, they’ve sharded their resources, etc.

        Pipelining is no magic pixie dust. But we can build applications which take advantage of it when it’s available. Just because we may not see an immediate win by enabling it today, doesn’t mean it’s not useful.

        • “If your argument is that we can get faster response by simply opening two parallel connections.. then that may be true,…”

          That is unlikely. One possibility is that reduced queue time may help reduce user perceived latency.

          “Pipelining is no magic pixie dust. But we can build applications which take advantage of it when it’s available. Just because we may not see an immediate win by enabling it today, doesn’t mean it’s not useful.”

          True. Hope to see more data and guidance, particularly since Chrome has this enabled for about a year. Actually, when I started with these tests, I had a better story in mind – look for metrics that show pipelining in better light, and then lead to SPDY as pipelining done right. But data with these sites did not help me :)

          I have no doubt about SPDY, but pipelining still seems iffy.

  3. Note that, if you test against very popular sites, you are going to get interference from the thousands of tweaks that such a site uses in practice to improve performance for non-pipelining clients. If you test against a site crafted to maximize performance from pipelining, you get very different results. Since neither is an accurate assessment of what pipelining would be like if deployed in practice, the best we can do is test both (take a normal site, test it, and then rearrange the site to be pipelining friendly and test again). The same will be true for testing a multiplexed protocol.

    • That is perhaps true, but I’m not aware of optimizations that are typically done with pipelining in mind. Although the charts don’t show it, the big circles are requests to CDNs for static assets, which are better candidates for pipelining since the resources are static and of similar sizes.

  4. @Roy – I doubt it is possible to carefully craft a site/test to show a better outcome from pipelining on the Web. It is certainly possible for hand-crafted HTTP client apps, but hard to optimize sites for pipelining.

  5. What was the latency from your test client to the servers?

    This matters a lot; when you cross an ocean, the latency + connection limits cause a waterfall of requests; we used to see that a lot at Y! when debugging perf.

    And, of course, you’re testing pipelining implementations, not the concept. You really, really need to talk to Patrick about *what* you’re testing.

    And, remember that testing implementation != testing pipelining itself; all of the implementations are evolving rapidly, as interest in pipelining is quite recent.

    • Good points, and I really wanted pipelining to shine. But I could not see it today. On a related note, I could see the effects of SPDY more easily than pipelining, though I don’t have publishable data on SPDY.

      @guypod mentioned today that he has some pipelining data that draws similar conclusions and is planning to publish it next week.

      Agree on the concept, but does it matter? :)