HTTP Protocol

Posted by Tully on Tue 21 July 2009

HTTP Requests

All HTTP messages (requests and responses) consist of one or
more headers, each on a separate line, followed by a mandatory blank line, followed by an optional message body.

The first line of every HTTP request consists of three items, separated by spaces:

A verb which indicates the HTTP method. The most commonly used method is GET, whose function is to retrieve a resource from the web server. GET requests do
not have a message body, so there is no further data following the blank
line after the message headers.

The requested URL. The URL functions as a name for the resource being
requested, together with an optional query string containing parameters that the client is passing to that resource.

The HTTP version being used. The only HTTP versions in common use on the Internet are 1.0 and 1.1, and most browsers use version 1.1 by default.
There are a few differences between the specifications of these two versions; however, the only difference you are likely to encounter when attacking web applications is that in version 1.1 the Host request header is mandatory.
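
As a rough illustration of the layout described above, the following Python sketch splits a raw request into its first line, headers, and body. The sample request, host name, and the parse_request helper are made up for this example; real clients and servers handle many more edge cases.

```python
# Hypothetical sample request; the host, path, and User-Agent are made up.
RAW_REQUEST = (
    "GET /books/search.asp?q=wahh HTTP/1.1\r\n"
    "Host: example.com\r\n"
    "User-Agent: sketch/0.1\r\n"
    "\r\n"
)


def parse_request(raw):
    """Split a raw HTTP request into request line, headers, and body."""
    # The mandatory blank line separates the headers from the optional body.
    head, _, body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")
    # The first line holds three items separated by spaces.
    method, url, version = lines[0].split(" ")
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return method, url, version, headers, body


method, url, version, headers, body = parse_request(RAW_REQUEST)
print(method, url, version)
print(headers)
```

Because this is a GET request, nothing follows the blank line, so the body comes back as an empty string.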

HTTP Responses

The first line of every HTTP response consists of three
items, separated by spaces:

The HTTP version being used.

A numeric status code indicating the result of the request. 200 is the most common status code; it means that the request was successful and the requested resource is being returned.

A textual "reason phrase" further describing the status of the response. This can have any value and is not used for any purpose by current
browsers.
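
The three items on the first line of a response can be pulled apart in much the same way. This is a minimal sketch; parse_status_line is a hypothetical helper name, and it assumes a well-formed status line.

```python
def parse_status_line(line):
    """Return (version, status_code, reason_phrase) from a status line."""
    # Split on the first two spaces only: the reason phrase may contain spaces.
    parts = line.rstrip("\r\n").split(" ", 2)
    version, code = parts[0], int(parts[1])
    # The reason phrase can have any value, including none at all.
    reason = parts[2] if len(parts) > 2 else ""
    return version, code, reason


print(parse_status_line("HTTP/1.1 200 OK\r\n"))
print(parse_status_line("HTTP/1.1 404 Not Found"))
```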

HTTP Methods

The two main methods are GET and POST. There are others such
as the following:

HEAD

This functions in the same way as a GET request except that the server
should not return a message body in its response. The server should return
the same headers that it would have returned to the corresponding GET
request. Hence, this method can be used for checking whether a resource is
present before making a GET request for it.

TRACE

This method is designed for diagnostic purposes. The server should
return in the response body the exact contents of the request message that
it received. This can be used to detect the effect of any proxy servers
between the client and server that may manipulate the request. It can also
sometimes be used as part of an attack against other application users.

OPTIONS

This method asks the server to report the HTTP methods that are
available for a particular resource. The server will typically return a
response containing an Allow header that lists the available methods.

PUT

This method attempts to upload the specified resource to the server, using
the content contained in the body of the request. If this method is
enabled, then you may be able to leverage it to attack the application;
for example, by uploading an arbitrary script and executing this on the server.
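
To make the OPTIONS behaviour concrete, here is a small sketch that reads the Allow header from a response and flags methods that may deserve a closer look. The sample response, the allowed_methods helper, and the choice of PUT and TRACE as "interesting" are assumptions for illustration only.

```python
# Hypothetical response to an OPTIONS request.
SAMPLE_OPTIONS_RESPONSE = (
    "HTTP/1.1 200 OK\r\n"
    "Allow: OPTIONS, TRACE, GET, HEAD, PUT\r\n"
    "Content-Length: 0\r\n"
    "\r\n"
)

# Assumption: the methods we want to flag for further testing.
INTERESTING_METHODS = {"PUT", "TRACE"}


def allowed_methods(raw_response):
    """Return the methods listed in the Allow header, or an empty list."""
    for line in raw_response.split("\r\n")[1:]:
        if not line:
            break  # blank line: end of headers
        name, _, value = line.partition(":")
        if name.strip().lower() == "allow":
            return [m.strip().upper() for m in value.split(",") if m.strip()]
    return []


methods = allowed_methods(SAMPLE_OPTIONS_RESPONSE)
flagged = sorted(set(methods) & INTERESTING_METHODS)
print(methods)
print(flagged)
```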

NOTE: The correct technical term for a URL is actually URI (or
uniform resource identifier), but this term is really only used in formal specifications.

HTTP Headers

HTTP header fields provide required information about the request or response, or about the object sent in the message body.

General Headers


Connection

This is used to inform the other end of the communication whether it should close the TCP connection after the HTTP transaction has completed or keep it open for further messages.

Content-Encoding

This is used to specify what kind of encoding is being used for the content contained in the message body, such as gzip, which is used by some applications to compress responses for faster transmission.

Content-Length

This is used to specify the length of the message body, in bytes (except in the case of responses to HEAD requests, when it indicates the length of the body in the response to the corresponding GET request).

Content-Type

This is used to specify the type of content contained in the message body; for example, text/html for HTML documents.

Transfer-Encoding

This is used to specify any encoding that was performed on the message body to facilitate its transfer over HTTP. It is normally used to specify chunked encoding when this is employed.
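
Chunked encoding is easier to follow with an example. In a chunked body, each chunk is preceded by its size in hexadecimal on its own line, and a chunk of size zero ends the body. The sketch below decodes such a body; decode_chunked is a hypothetical helper that ignores trailers and assumes well-formed input.

```python
def decode_chunked(body: bytes) -> bytes:
    """Reassemble a message body sent with Transfer-Encoding: chunked."""
    decoded = b""
    pos = 0
    while True:
        line_end = body.index(b"\r\n", pos)
        # The chunk size is hexadecimal; drop any ";extension" after it.
        size = int(body[pos:line_end].split(b";")[0].strip(), 16)
        if size == 0:
            break  # a zero-length chunk marks the end of the body
        start = line_end + 2
        decoded += body[start:start + size]
        pos = start + size + 2  # skip the chunk data and its trailing CRLF
    return decoded


chunked = b"4\r\nWiki\r\n5\r\npedia\r\n0\r\n\r\n"
message = decode_chunked(chunked)
print(message, len(message))
```

The length printed at the end is the value a Content-Length header would have carried had the same body been sent without chunking.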

Request Headers


Accept

This is used to tell the server what kinds of content the client is willing to accept, such as image types, office document formats, and so on.

Accept-Encoding

This is used to tell the server what kinds of content encoding the client is willing to accept.

Authorization

This is used to submit credentials to the server for one of the built-in HTTP authentication types.

Cookie

This is used to submit cookies to the server that the server previously issued.

Host

This is used to specify the hostname that appeared in the full URL being requested.

If-Modified-Since

This is used to specify the time at which the browser last received the requested resource. If the resource has not changed since that time, the server may instruct the client to use its cached copy, using a response with status code 304.

If-None-Match

This is used to specify an entity tag, which is an identifier denoting
the contents of the message body. The browser submits the entity tag that
the server issued with the requested resource when it was last received.
The server can use the entity tag to determine whether the browser may use its cached copy of the resource.

Referer

This is used to specify the URL from which the current request
originated.

User-Agent

This is used to provide information about the browser or other client software that generated the request.
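
The sketch below shows how several of these request headers fit together in a single message, with the Host header taken from the full URL. The build_get_request helper, the URLs, and the header values are all hypothetical.

```python
from urllib.parse import urlsplit


def build_get_request(full_url, extra_headers=None):
    """Build the text of a GET request for full_url (illustrative only)."""
    parts = urlsplit(full_url)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    # Host carries the hostname that appeared in the full URL.
    headers = {
        "Host": parts.netloc,
        "User-Agent": "sketch/0.1",
        "Accept": "*/*",
    }
    headers.update(extra_headers or {})
    lines = [f"GET {target} HTTP/1.1"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    # A blank line ends the headers; a GET request has no body after it.
    return "\r\n".join(lines) + "\r\n\r\n"


request = build_get_request(
    "http://example.com/books/search.asp?q=wahh",
    {"Referer": "http://example.com/index.html", "Cookie": "sessid=abc123"},
)
print(request)
```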

Response Headers


Cache-Control

This is used to pass caching directives to the browser (for example, no-cache).

ETag

This is used to specify an entity tag. Clients can submit this identifier
in future requests for the same resource in the If-None-Match header to
notify the server which version of the resource the browser currently holds in its cache.

Expires

This is used to instruct the browser how long the contents of the message
body are valid for. The browser may use the cached copy of this resource
until this time.

Location

This is used in redirection responses (those with a status code starting
with a 3) to specify the target of the redirect.

Pragma

This is used to pass caching directives to the browser (for example, no-cache).

Server

This is used to provide information about the web server software being used.

Set-Cookie

This is used to issue cookies to the browser that it will submit back to the server in subsequent requests.

WWW-Authenticate

This is used in responses with a 401 status code to provide details of
the type(s) of authentication supported by the server.
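
The interplay between the ETag response header and the If-None-Match request header can be sketched from the server's side as follows. This is a simplified model: make_etag and respond are hypothetical helpers, and deriving the tag from a hash of the body is an assumption, since a server may generate entity tags however it likes.

```python
import hashlib


def make_etag(content: bytes) -> str:
    # Assumption: the tag is a hash of the body; servers may use any scheme.
    return '"' + hashlib.sha1(content).hexdigest() + '"'


def respond(content: bytes, request_headers: dict):
    """Return (status, headers, body) for a request for this content."""
    etag = make_etag(content)
    if request_headers.get("If-None-Match") == etag:
        # The client's cached copy is current, so no body is sent.
        return 304, {"ETag": etag}, b""
    return 200, {"ETag": etag, "Cache-Control": "no-cache"}, content


page = b"<html>hello</html>"

# First visit: nothing cached, so the full resource is returned.
first_status, first_headers, first_body = respond(page, {})

# Second visit: the client sends back the tag it was issued.
second_status, _, second_body = respond(
    page, {"If-None-Match": first_headers["ETag"]}
)
print(first_status, second_status)
```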

Cookies

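The cookie mechanism enables the server to send items of data to the client, which the client stores and resubmits to the server. In addition to the cookie's value, the Set-Cookie header can include the following optional attributes, which control how the browser handles the cookie: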

Expires

Used to set a date until which the cookie is valid. This will cause the browser to save the cookie to persistent storage, and it will be reused in subsequent browser sessions until the expiration date is reached. If this attribute is not set, the cookie is used only in the current browser session.

Domain

Used to specify the domain for which the cookie is valid. This must be the same as, or a parent of, the domain from which the cookie is received.

Path

Used to specify the URL path for which the cookie is valid.

Secure

If this attribute is set, then the cookie will only ever be submitted in HTTPS requests.

HttpOnly

If this attribute is set, then the cookie cannot be directly accessed
via client-side JavaScript, although not all browsers support this
restriction.
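
To see these attributes in a concrete header value, the following sketch parses one with Python's standard http.cookies module. The cookie name, value, domain, and date are made up for the example.

```python
from http.cookies import SimpleCookie

# Hypothetical Set-Cookie value; the name, value, and domain are made up.
SET_COOKIE = (
    "sessid=abc123; Expires=Wed, 21 Oct 2009 07:28:00 GMT; "
    "Domain=example.com; Path=/app; Secure; HttpOnly"
)

jar = SimpleCookie()
jar.load(SET_COOKIE)
cookie = jar["sessid"]

# The cookie's value and its attributes are available separately.
print(cookie.value)
print(cookie["expires"], cookie["domain"], cookie["path"])
print(bool(cookie["secure"]), bool(cookie["httponly"]))
```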

HTTP Status Codes

Each HTTP response message must contain a status code in its
first line, indicating the result of the request. The status codes fall into five groups, according to the first digit of the code.

1xx - Informational.

2xx - The request was successful.

3xx - The client is redirected to a different resource.

4xx - The request contains an error of some kind.

5xx - The server encountered an error fulfilling the
request.
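
Since the group is determined by the first digit alone, classifying a status code takes only an integer division. A minimal sketch, with hypothetical names and labels:

```python
# Labels paraphrase the five groups listed above.
STATUS_CLASSES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def status_class(code: int) -> str:
    """Return the group a three-digit status code belongs to."""
    return STATUS_CLASSES.get(code // 100, "unknown")


for code in (100, 200, 302, 404, 503):
    print(code, status_class(code))
```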

Below is a list of common HTTP status codes:

100 Continue

This response is sent in some circumstances
when a client submits a request containing a body. The response indicates that
the request headers were received and that the client should continue sending the body. The server will then return a second response when the request has been completed.

200 OK

This indicates that the request was successful and
the response body contains the result of the request.

201 Created

This is returned in response to a PUT request
to indicate that the request was successful.

301 Moved Permanently

This redirects the browser permanently to a different URL, which is specified in the Location header. The client should use the new URL in the future rather than the original.

302 Found

This redirects the browser temporarily to a
different URL, which is specified in the Location header. The client should revert to the original URL in subsequent requests.

304 Not Modified

This instructs the browser to use its
cached copy of the requested resource. The server uses the If-Modified-Since and If-None-Match request headers to determine whether the client has the latest version of the resource.

400 Bad Request

This indicates that the client submitted
an invalid HTTP request. You will probably encounter this when you have
modified a request in certain invalid ways, for example by placing a space character into the URL.

401 Unauthorized

The server requires HTTP authentication before the request will be granted. The WWW-Authenticate header contains details of the type(s) of authentication supported.

403 Forbidden

This indicates that no one is allowed to
access the requested resource, regardless of authentication.

404 Not Found

This indicates that the requested resource does not exist.

405 Method Not Allowed

This indicates that the method used in the request is not supported for the specified URL. For example, you may
receive this status code if you attempt to use the PUT method where it is not supported.

413 Request Entity Too Large

If you are probing for buffer overflow vulnerabilities in native code, and so submitting long strings of data, this indicates that the body of your request is too large for the server to handle.

414 Request-URI Too Long

Similar to the previous response, this indicates that the URL used in the request is too large for the server to handle.

500 Internal Server Error

This indicates that the server encountered an error fulfilling the request. This normally occurs when you have
submitted unexpected input that caused an unhandled error somewhere within the application's processing. You should review the full contents of the server's response closely for any details indicating the nature of the error.

503 Service Unavailable

This normally indicates that, although the web server itself is functioning and able to respond to requests,
the application accessed via the server is not responding. You should verify whether this is the result of any action that you have performed.
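
As one example of acting on a status code, the sketch below works out where a client should go next after a 301 or 302 response. The next_url helper and the URLs are hypothetical, and the sketch deliberately ignores the other 3xx codes.

```python
from urllib.parse import urljoin


def next_url(request_url, status, headers):
    """Return the redirect target, or None if this is not a 301/302."""
    if status in (301, 302) and "Location" in headers:
        # Location may be relative, so resolve it against the request URL.
        return urljoin(request_url, headers["Location"])
    return None


print(next_url("http://example.com/a/b.html", 302, {"Location": "/login"}))
print(next_url("http://example.com/a/b.html", 200, {}))
```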

NOTE: Strictly speaking, SSL has been superseded by Transport Layer
Security (TLS), but the latter is still often referred to by the older
name.

HTTP Authentication

The HTTP protocol includes its own mechanisms for
authenticating users, using various authentication schemes, including:

Basic

This is a very simple authentication mechanism that sends user credentials as a Base64-encoded string in a request header with each message.

NTLM

This is a challenge-response mechanism and uses a version of the Windows NTLM protocol.

Digest

This is a challenge-response mechanism and uses MD5 checksums of a nonce with the user's credentials.
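
The Basic and Digest schemes can be sketched in a few lines. The helper names below are hypothetical, and the Digest calculation shown is the simplest form of the scheme (no qop or client nonce), so treat it as an illustration of the idea rather than a complete implementation.

```python
import base64
import hashlib


def basic_authorization(username, password):
    """Build the value of the Authorization header for Basic authentication."""
    raw = f"{username}:{password}".encode("utf-8")
    # Base64 is an encoding, not encryption: anyone can reverse it.
    return "Basic " + base64.b64encode(raw).decode("ascii")


def md5_hex(text):
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def digest_response(username, realm, password, method, uri, nonce):
    """Compute the Digest response in its simplest form (no qop)."""
    ha1 = md5_hex(f"{username}:{realm}:{password}")
    ha2 = md5_hex(f"{method}:{uri}")
    # The server-supplied nonce is mixed in, so the password is never sent.
    return md5_hex(f"{ha1}:{nonce}:{ha2}")


print(basic_authorization("Aladdin", "open sesame"))
print(digest_response("Aladdin", "example", "open sesame", "GET", "/", "abc"))
```

Note that the Basic value decodes straight back to the username and password, which is why it offers no protection on an unencrypted connection.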