HTTP 1.1 Request Pipelining

Written by: NetworkError, on 15-05-2008 15:26
Last update: 15-05-2008 15:36
Published in: Public, Technical Wootness


When you're out browsing the web, you send an HTTP request for each file and the server returns an HTTP response for each request. In the simplest case (HTTP 1.0 style, or any request sent with Connection: Close), each request/response pair uses its own connection.

HTTP 1.1 supports a thing called pipelining. Essentially, you send multiple requests over one connection without waiting for each response, and the server returns the responses in the order the requests were sent. This allows you to recycle the same connection for consecutive requests.

Why would anyone want to do this? Well, if you're going to do a lot of HTTP requests, you'll quickly realize that connection overhead gets ugly. Every new connection costs a TCP handshake plus socket setup and teardown, and when you open thousands of sockets that overhead can easily outweigh the work of actually transferring the files. So let's say you're going to programmatically download 10000 files from a server over HTTP; it's best to do it with a handful of connections rather than 10000 connections.

Before we get started, please be aware that you shouldn't pipeline POSTs: the HTTP 1.1 spec (RFC 2616) says clients should not pipeline non-idempotent requests. I know. Lame, right? Anyway...

Normally an HTTP request looks something like this:

GET /index.php HTTP/1.1
Host: www.networkerror.org
User-Agent: Teh FireFox
Connection: Close

And the response looks something like this:

HTTP/1.1 200 OK
Date: Thu, 15 May 2008 22:14:24 GMT
Server: Apache/2.2.3 (Ubuntu) PHP/5.2.1
X-Powered-By: PHP/5.2.1
Connection: close
Transfer-Encoding: chunked
Content-Type: text/html; charset=UTF-8

3caa
<!--!DOCTYPE html
PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"-->

<html>
<head>
<title>Test Page</title>
<script src="/js/css_browser_selector.js" type="text/javascript"></script>
<link href="/global.css" rel="stylesheet" type="text/css" />
</head>
...


Let's break the request down line by line.

GET /index.php HTTP/1.1 - GET is the request method. (GET, POST, etc...) Next comes the file you're requesting. (The path must be absolute, starting with a slash, not relative.) Last comes the protocol version.
Host: www.networkerror.org - The host you're connecting to. (Required in HTTP 1.1.)
User-Agent: Teh FireFox - Your client's user agent. (Optional.)
Connection: Close - Once the response has been sent, the connection will be closed by the server.
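
If you'd rather build that request in code than type it by hand, here's a minimal Python sketch. The function name build_request and its parameters are made up for illustration; the only things taken from the breakdown above are the header lines themselves and the blank line that ends the request.

```python
def build_request(path, host, user_agent=None, keep_alive=False):
    """Return one HTTP/1.1 GET request as bytes, ending with a blank line."""
    lines = [f"GET {path} HTTP/1.1", f"Host: {host}"]
    if user_agent:
        # User-Agent is optional, so only add it when one was given.
        lines.append(f"User-Agent: {user_agent}")
    lines.append("Connection: " + ("Keep-Alive" if keep_alive else "Close"))
    # Every line ends with CRLF; an empty line terminates the request.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


request = build_request("/index.php", "www.networkerror.org", "Teh FireFox")
print(request.decode("ascii"))
```

Returning bytes rather than a string means the result can be handed straight to a socket.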

Let's look at a few key pieces of the response that you'll need to be aware of.

HTTP/1.1 200 OK - Indicates that the file is found and will be delivered.
Connection: close - Connection will not be kept alive.
Transfer-Encoding: chunked - The body will be sent in chunks.
Content-Type: text/html; charset=UTF-8 - Media type and character encoding of the body.
3caa - Hexadecimal count of bytes in the next chunk. In this case, 15530 bytes will be sent, then a carriage return + line feed, then another hexadecimal byte count. When you get a byte count of 0, you have reached the end of the body. You'll only see this if you see the Transfer-Encoding: chunked header. Otherwise, you'll see a Content-Length: [bytes] header.

A pipelined request looks the same, except the last line reads like this.

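Connection: Keep-Alive - This tells the server to keep the connection open after it sends the response, so you can send more requests over it until it times out. (HTTP 1.1 connections are persistent by default, but spelling it out doesn't hurt.)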

For each request you send down the pipeline (each one terminated by a blank line, i.e. two consecutive carriage return + line feed pairs), the server will send a full response (headers and body) back to you.

Going along with the bulk download scenario, you can pipeline multiple GET requests to download multiple images over one connection using requests like this:

GET /image1.jpg HTTP/1.1
Host: www.networkerror.org
User-Agent: Teh FireFox
Connection: Keep-Alive

GET /image2.jpg HTTP/1.1
Host: www.networkerror.org
User-Agent: Teh FireFox
Connection: Keep-Alive

GET /image3.jpg HTTP/1.1
Host: www.networkerror.org
User-Agent: Teh FireFox
Connection: Keep-Alive

GET /image4.jpg HTTP/1.1
Host: www.networkerror.org
User-Agent: Teh FireFox
Connection: Close
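
Here's one way you might generate that batch in Python. It's only a sketch: build_pipeline is a made-up helper, and it assumes you want Keep-Alive on every request except the last, exactly like the four requests above.

```python
def build_pipeline(host, paths, user_agent="Teh FireFox"):
    """Concatenate one GET request per path; only the last one asks to close."""
    requests = []
    for i, path in enumerate(paths):
        last = (i == len(paths) - 1)
        requests.append(
            f"GET {path} HTTP/1.1\r\n"
            f"Host: {host}\r\n"
            f"User-Agent: {user_agent}\r\n"
            f"Connection: {'Close' if last else 'Keep-Alive'}\r\n"
            "\r\n"  # blank line ends this request
        )
    return "".join(requests).encode("ascii")


payload = build_pipeline("www.networkerror.org",
                         [f"/image{n}.jpg" for n in range(1, 5)])
print(payload.decode("ascii"))
```

The whole payload can then be written to the socket in one go, since the server is expected to answer the requests in order.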

When you read from this socket, you'll see 4 HTTP responses back to back. For each response, you can parse and strip off the headers, then use Content-Length (or the chunk sizes) to read in exactly the right number of body bytes. The data received for the above four requests will look something like this:

HTTP/1.1 200 OK
Date: Thu, 15 May 2008 22:22:44 GMT
Server: Apache/2.2.3 (Ubuntu) PHP/5.2.1
Last-Modified: Fri, 15 Jun 2007 05:31:33 GMT
ETag: "1bd805e-13e8-28f9b740"
Accept-Ranges: bytes
Content-Length: 5096
Keep-Alive: timeout=15, max=100
Connection: Keep-Alive
Content-Type: image/jpeg

[binary image data]

HTTP/1.1 200 OK
Date: Thu, 15 May 2008 22:22:44 GMT
Server: Apache/2.2.3 (Ubuntu) PHP/5.2.1
Last-Modified: Fri, 15 Jun 2007 05:31:33 GMT
ETag: "1bd805e-13e8-28f9b740"
Accept-Ranges: bytes
Content-Length: 5096
Keep-Alive: timeout=15, max=100
Connection: Keep-Alive
Content-Type: image/jpeg

[binary image data]

HTTP/1.1 200 OK
Date: Thu, 15 May 2008 22:22:44 GMT
Server: Apache/2.2.3 (Ubuntu) PHP/5.2.1
Last-Modified: Fri, 15 Jun 2007 05:31:33 GMT
ETag: "1bd805e-13e8-28f9b740"
Accept-Ranges: bytes
Content-Length: 5096
Keep-Alive: timeout=15, max=100
Connection: Keep-Alive
Content-Type: image/jpeg

[binary image data]

HTTP/1.1 404 Not Found
Date: Thu, 15 May 2008 22:20:43 GMT
Server: Apache/2.2.3 (Ubuntu) PHP/5.2.1
Content-Length: 303
Keep-Alive: timeout=15, max=100
Connection: Keep-Alive
Content-Type: text/html; charset=iso-8859-1

<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>404 Not Found</title>
</head><body>
<h1>Not Found</h1>
<p>The requested URL /image4.jpg was not found on this server.</p>
<hr>
<address>Apache/2.2.3 (Ubuntu) PHP/5.2.1 Server at www.networkerror.org Port 80</address>
</body></html>
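
Pulling those responses apart might look something like the following Python sketch. It assumes every response carries a Content-Length header, as the four above do (chunked bodies and bodiless responses like 304 would need extra handling), and it assumes the full buffer has already been read. split_responses and the sample bytes are invented for the example.

```python
def split_responses(buffer):
    """Split back-to-back HTTP responses that all carry Content-Length."""
    responses = []
    pos = 0
    while pos < len(buffer):
        # Headers end at the first blank line after the current position.
        head_end = buffer.index(b"\r\n\r\n", pos)
        head = buffer[pos:head_end].decode("iso-8859-1")
        status_line, *header_lines = head.split("\r\n")
        headers = {}
        for line in header_lines:
            name, _, value = line.partition(":")
            headers[name.strip().lower()] = value.strip()
        # The body is exactly Content-Length bytes, whatever they contain.
        length = int(headers["content-length"])
        body_start = head_end + 4
        body = buffer[body_start:body_start + length]
        responses.append((status_line, headers, body))
        pos = body_start + length
    return responses


sample = (
    b"HTTP/1.1 200 OK\r\nContent-Length: 5\r\nContent-Type: image/jpeg\r\n\r\n"
    b"\xff\xd8\r\n\xd9"
    b"HTTP/1.1 404 Not Found\r\nContent-Length: 9\r\n\r\nNot Found"
)
for status, headers, body in split_responses(sample):
    print(status, len(body))
```

Counting bytes instead of searching for a separator is the important part: binary image data can contain any byte sequence, including ones that look like line breaks.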

When you specify the Connection: Keep-Alive header, the connection will stay alive until one of the following conditions occurs:
1. The connection times out. If you go too many seconds without requesting something (15 in the sample above, per Keep-Alive: timeout=15), the server will close the connection. To work around this, you can send a blank line once every second or so to keep the connection alive. Apache servers will handle this in stride. (I haven't tried it on other servers.)
2. You send a request with a Connection: Close header.
3. You hit the request limit. Most servers will only allow a limited number of requests over a single connection before forcing it to close. Apache defaults to 100 (the MaxKeepAliveRequests directive, which is the max=100 in the Keep-Alive header above). It's a good idea to keep the number of pipelined requests under that limit. (I usually limit mine to 50.)
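
To stay under the request limit, you could read the server's Keep-Alive header and split your download list into batches, one batch per connection. This is a Python sketch under assumptions: parse_keep_alive and batches are hypothetical helpers, the /files/ paths are invented, and 50 is just my usual cap from above.

```python
def parse_keep_alive(value):
    """Turn 'timeout=15, max=100' into {'timeout': 15, 'max': 100}."""
    params = {}
    for part in value.split(","):
        name, _, number = part.strip().partition("=")
        if number.isdigit():
            params[name.lower()] = int(number)
    return params


def batches(paths, limit=50):
    """Yield successive slices of at most `limit` paths, one per connection."""
    for start in range(0, len(paths), limit):
        yield paths[start:start + limit]


limits = parse_keep_alive("timeout=15, max=100")
per_connection = min(50, limits.get("max", 50))  # never exceed the server's max
paths = [f"/files/{n}.dat" for n in range(120)]  # hypothetical file list
groups = list(batches(paths, per_connection))
print(len(groups), [len(g) for g in groups])
```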

If you want to play around with this stuff, try telnetting to any web server (like Google) on port 80. Then paste an HTTP request (followed by 2 carriage returns, so the request ends with a blank line). You should get a full response back.
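
If you don't have telnet handy, roughly the same experiment can be done with a few lines of Python. This is a sketch, not a real HTTP client: raw_exchange and fetch are names I made up, the 10 second timeout is arbitrary, and it relies on Connection: Close so that the server hanging up marks the end of the response.

```python
import socket


def raw_exchange(sock, request):
    """Send raw request bytes and read until the peer closes the connection."""
    sock.sendall(request)
    chunks = []
    while True:
        data = sock.recv(4096)
        if not data:  # empty read means the other side closed
            break
        chunks.append(data)
    return b"".join(chunks)


def fetch(host, path="/", port=80):
    """Open a TCP connection, send one GET request, return the raw response."""
    request = (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Connection: Close\r\n"
        "\r\n"
    ).encode("ascii")
    with socket.create_connection((host, port), timeout=10) as sock:
        return raw_exchange(sock, request)
```

Calling fetch("www.networkerror.org", "/index.php") and printing the result should show the status line and headers followed by the body, much like the telnet session would.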

