Persistent Connections, Pipelining and Chunking
The Content-Length header is an interesting HTTP response header. This header tells HTTP client applications the size of the response. However, in HTTP 1.1, this header is optional. HTTP applications need not set this header if the size is not known. This is in fact the case with most applications that generate dynamic content.
When the Content-Length header is absent, how will HTTP clients be able to know how much to read? One obvious answer is to read till the end of the stream is reached, as in
InputStream inputStream = connection.getInputStream();
int b;
// read() returns -1 once the end of the stream is reached
while (-1 != (b = inputStream.read())) {
    // Do something with b
}
But this will not work (without some additional work at a lower level) in this simple form when combined with two other features of HTTP 1.1, viz. persistent connections, and pipelining.
Persistent Connections
By default, HTTP 1.1 connections are persistent. HTTP servers do not close the connection immediately after serving the response, thus allowing the client to reuse the connection. Persistent connections are very useful when a client is fetching several resources from the same server over a short amount of time. Instead of creating a new connection for each resource, the client can reuse a single connection for all resources.
When persistent connections are in use, there are three ways a connection can be closed:
- The server can return a header Connection: close. This header indicates that the client should not assume that the connection is persistent after the current request is completed.
- The client can send a header Connection: close if it does not want to maintain the connection after the current request is completed.
- Timeout, or network failures.
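The rules above can be sketched as a small helper that decides whether a connection may be reused after a message. This is only an illustration: the class and method names are hypothetical, and it assumes the caller has already parsed the HTTP version and the value of the Connection header (null when the header is absent).

```java
import java.util.Locale;

final class ConnectionPolicy {
    private ConnectionPolicy() {}

    /**
     * Hypothetical helper: returns true if the connection may stay open.
     *
     * @param httpVersion      protocol version of the message, e.g. "HTTP/1.1"
     * @param connectionHeader value of the Connection header, or null if absent
     */
    static boolean isPersistent(String httpVersion, String connectionHeader) {
        boolean close = false;
        boolean keepAlive = false;
        if (connectionHeader != null) {
            // The header value is a comma-separated list of case-insensitive tokens.
            for (String token : connectionHeader.split(",")) {
                String t = token.trim().toLowerCase(Locale.ROOT);
                if (t.equals("close")) {
                    close = true;
                } else if (t.equals("keep-alive")) {
                    keepAlive = true;
                }
            }
        }
        if (close) {
            return false; // either side asked to close after this exchange
        }
        if ("HTTP/1.1".equals(httpVersion)) {
            return true; // HTTP 1.1 connections are persistent by default
        }
        return keepAlive; // older versions: only when explicitly requested
    }
}
```

Note that an HTTP 1.1 message without a Connection header leaves the connection persistent. Timeouts and network failures cannot be detected this way; they only show up as I/O errors on the socket.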
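If the client waited for each response before sending the next request over a persistent connection, it could ignore the Content-Length header and read till the end of the stream: the read would stall until the server closed or timed out the connection, but everything read would belong to a single response. But clients need not wait, because requests can be pipelined.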
Pipelining
With HTTP 1.1, the client need not wait for the current request to complete before sending the next request over a persistent connection. It can send several requests over the same connection without waiting for responses for previous requests. This is called pipelining.
Here is a sample with three requests pipelined over a persistent connection.
GET / HTTP/1.1
Host: www.subbu.org
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.5) Gecko/20041107 Firefox/1.0
Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5

GET http://www.subbu.org/counter.gif HTTP/1.1
Host: www.subbu.org
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.5) Gecko/20041107 Firefox/1.0
Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5

GET http://www.subbu.org/styles.css HTTP/1.1
Host: www.subbu.org
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.5) Gecko/20041107 Firefox/1.0
Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
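As a rough sketch of what pipelining looks like from the client side, the helper below writes several GET requests to an output stream in one batch, without reading any response in between. The class and method names are hypothetical, and the requests carry only the mandatory Host header to keep the example short.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.List;

final class PipelinedRequests {
    private PipelinedRequests() {}

    /** Writes one GET request per path, back to back, then flushes once. */
    static void writeGets(OutputStream out, String host, List<String> paths)
            throws IOException {
        StringBuilder batch = new StringBuilder();
        for (String path : paths) {
            batch.append("GET ").append(path).append(" HTTP/1.1\r\n")
                 .append("Host: ").append(host).append("\r\n")
                 .append("\r\n"); // an empty line ends each request
        }
        out.write(batch.toString().getBytes(StandardCharsets.US_ASCII));
        out.flush();
    }
}
```

With a real connection, out would typically be the stream returned by Socket.getOutputStream(), and the client would then read the responses from the socket's input stream in the same order it sent the requests.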
The client can send several requests in sequence, and the server is supposed to send a response to each request in the same order the request was received. For the above requests, here is a sample set of responses from the server:
HTTP/1.1 200 OK
Date: Mon, 29 Nov 2004 01:45:35 GMT
Server: Apache
Last-Modified: Thu, 25 Nov 2004 02:32:28 GMT
Content-Length: 16291
Content-Type: text/html

<html>
<!-- Actual content snipped -->
</html>
HTTP/1.1 200 OK
Date: Mon, 29 Nov 2004 01:45:36 GMT
Transfer-Encoding: chunked
Content-Type: image/gif

255
[some bytes here]
104
[some bytes here]
0

HTTP/1.1 200 OK
Date: Mon, 29 Nov 2004 01:45:36 GMT
Content-Length: 2341
Content-Type: text/css

body {
/* Actual content snipped */
}
The server included a Content-Length header for the first and third responses, and therefore the client knows the number of bytes to read for these responses. But the server did not include a Content-Length header for the second response as the resource was dynamically generated.
Since the server did not include a Content-Length header for the second response, the while loop to read all the available bytes fails for the second response. A client using such a while loop to read the stream till the end would incorrectly include the third response (and all subsequent responses) in the second response.
HTTP 1.1 solves this problem by chunking.
Chunking
When an HTTP server generates a response dynamically and does not include the Content-Length header, it includes a Transfer-Encoding: chunked header and sends the response in chunks of known size, followed by a chunk of zero size. Each chunk is preceded by a line giving its size in hexadecimal. In the above example, the second response had three chunks: two with data (sizes 255 and 104 in hexadecimal), and the zero-size chunk that marks the end of the response. With the size of each chunk and the terminating chunk, the client can safely determine where the response ends.
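The chunk format described above can be decoded with a fairly small loop. The following is a minimal sketch, not a complete HTTP client: the class name and the MAX_CHUNK_SIZE limit are assumptions made for illustration (HTTP itself does not cap the chunk size), chunk extensions are ignored, and trailer fields are skipped rather than parsed.

```java
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

final class ChunkedBodyReader {
    // Hypothetical upper bound on one chunk, to avoid unbounded allocation.
    private static final int MAX_CHUNK_SIZE = 1 << 20;

    private ChunkedBodyReader() {}

    /** Reads exactly one chunked body from the stream and returns its bytes. */
    static byte[] readChunkedBody(InputStream in) throws IOException {
        ByteArrayOutputStream body = new ByteArrayOutputStream();
        while (true) {
            String sizeLine = readLine(in);
            int semi = sizeLine.indexOf(';'); // drop optional chunk extensions
            String hex = (semi >= 0 ? sizeLine.substring(0, semi) : sizeLine).trim();
            int size;
            try {
                size = Integer.parseInt(hex, 16); // chunk sizes are hexadecimal
            } catch (NumberFormatException e) {
                throw new IOException("Bad chunk size: " + hex);
            }
            if (size < 0 || size > MAX_CHUNK_SIZE) {
                throw new IOException("Chunk size out of range: " + size);
            }
            if (size == 0) {
                break; // the zero-size chunk marks the end of the body
            }
            byte[] chunk = new byte[size];
            readFully(in, chunk);
            body.write(chunk, 0, size);
            if (!readLine(in).isEmpty()) { // each chunk's data ends with CRLF
                throw new IOException("Missing CRLF after chunk data");
            }
        }
        // Skip any trailer fields up to the empty line that ends the message.
        while (!readLine(in).isEmpty()) {
            // ignored in this sketch
        }
        return body.toByteArray();
    }

    // Reads bytes up to LF and returns the line without its CR/LF terminator.
    private static String readLine(InputStream in) throws IOException {
        StringBuilder line = new StringBuilder();
        int b;
        while ((b = in.read()) != -1) {
            if (b == '\n') {
                int len = line.length();
                if (len > 0 && line.charAt(len - 1) == '\r') {
                    line.setLength(len - 1);
                }
                return line.toString();
            }
            line.append((char) b);
        }
        throw new EOFException("Stream ended inside a chunked body");
    }

    // Fills buf completely, since a single read() may return fewer bytes.
    private static void readFully(InputStream in, byte[] buf) throws IOException {
        int off = 0;
        while (off < buf.length) {
            int n = in.read(buf, off, buf.length - off);
            if (n == -1) {
                throw new EOFException("Stream ended inside a chunk");
            }
            off += n;
        }
    }
}
```

Because the reader stops right after the terminating chunk, any bytes that follow (for instance the next pipelined response) remain unread in the stream.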
Most of the time, these details are taken care of automatically by HTTP APIs. Client applications can safely read till the end of the stream. For example, the above while loop would work even when the connection is persistent and the server is using chunked encoding: the underlying connection API (in this case, the Java networking APIs) merges the chunks and returns -1 when the last chunk (of size 0) is reached. But if you are writing lower-level HTTP applications, or if a particular API does not support chunking, you must account for chunking explicitly.
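For a lower-level client, the first decision is how the body of a given response is delimited. The sketch below is one way to express that choice; the class, enum and method names are hypothetical, the header map is assumed to be keyed by lower-cased header names, and responses that never carry a body (such as replies to HEAD requests) are left out for brevity.

```java
import java.util.Locale;
import java.util.Map;

final class ResponseFraming {
    enum Kind { CHUNKED, FIXED_LENGTH, UNTIL_CLOSE }

    private ResponseFraming() {}

    /** Decides how the end of the response body will be detected. */
    static Kind of(Map<String, String> headers) {
        String transferEncoding = headers.get("transfer-encoding");
        if (transferEncoding != null
                && transferEncoding.trim().toLowerCase(Locale.ROOT).endsWith("chunked")) {
            // Chunked encoding wins even if a Content-Length is also present.
            return Kind.CHUNKED;
        }
        if (headers.containsKey("content-length")) {
            return Kind.FIXED_LENGTH; // read exactly that many bytes
        }
        // Neither header: the body ends only when the server closes the connection.
        return Kind.UNTIL_CLOSE;
    }
}
```

Only the last case allows a plain read-until-end-of-stream loop on the raw socket, and in that case the connection cannot be reused for another request.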
Some HTTP applications incorrectly assume that the Content-Length header is required, and determine the length by buffering the response data in memory. But this is expensive and unnecessary. It is valid to skip this header and rely on chunking if the size is not known. But if the underlying network API does not support chunking, applications must make sure to send a Connection: close header so that the connection will not be persistent after the current response. Otherwise, clients supporting persistent connections may misbehave.
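On the sending side, chunking avoids the need to buffer the whole response just to compute its length. Here is a minimal sketch of a writer that emits each piece of data as one chunk; the class name is hypothetical, and it assumes the response headers, including Transfer-Encoding: chunked, have already been written to the same stream.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

final class ChunkedBodyWriter {
    private static final byte[] CRLF = {'\r', '\n'};
    private final OutputStream out;

    ChunkedBodyWriter(OutputStream out) {
        this.out = out;
    }

    /** Sends len bytes of data as a single chunk. */
    void writeChunk(byte[] data, int off, int len) throws IOException {
        if (len == 0) {
            return; // a zero-size chunk would prematurely end the body
        }
        // Chunk header: the size in hexadecimal, followed by CRLF.
        out.write(Integer.toHexString(len).getBytes(StandardCharsets.US_ASCII));
        out.write(CRLF);
        out.write(data, off, len);
        out.write(CRLF);
    }

    /** Sends the terminating zero-size chunk (with no trailer fields). */
    void finish() throws IOException {
        out.write("0\r\n\r\n".getBytes(StandardCharsets.US_ASCII));
        out.flush();
    }
}
```

An application would call writeChunk each time a piece of the dynamic content becomes available and finish once at the end, after which the connection can carry the next response.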



Subbu,
This is a good overview of how HTTP connection handling works. Having spent two years implementing an HTTP proxy, I can tell you this is not trivial. For instance, Cisco had a buffer overflow problem because they weren’t properly checking the max size of a chunk. The HTTP spec doesn’t limit the size of a chunk, but most real-world implementations must do this to prevent buffer overruns or other memory-related failures. These are the implementation flaws that hackers love to take advantage of.
Another interesting aspect of proxying is that there are basically two pipelines per client: one for the requests and one for the responses. If it is necessary to communicate between these two pipelines, a queue must be used. This can get interesting quickly for features that seem simple at first glance, such as request logging.
Hi Subbu,
Can you give some sample code for persistent connections on the server side?
I am trying to post multiple requests using the HttpURLConnection class. But after posting one request, I am not able to make a second one.
Thanks in advance.
Hi Subbu,
It’s a good overview of persistent connections. I have one query on this: what should the behavior of an HTTP/1.1 client be if the HTTP/1.1 server doesn’t include a Connection header in the response?
Thanks in advance.
hi,
I am Raghu. I am implementing an HTTP server in C. I have finished everything up to persistent connections, and pipelining needs to be implemented next, so can you please explain the logic for implementing pipelining in an HTTP server?
Does HTTP RFC comment on the “host” field in the persistent HTTP connections?
I am assuming that “host field will remain same for all GET requests on a persistent HTTP connection”
Am I right in my assumption?
thanks,