HTTP Requests
All HTTP messages (requests and responses) consist of one or
more headers, each on a separate line, followed by a mandatory blank line,
followed by an optional message body.
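The layout described above can be sketched in a few lines of Python. This is only an illustration: the function name split_message and the sample request are made up for this example, and the sketch assumes the message uses CRLF line endings, as it does on the wire.

```python
def split_message(raw: bytes):
    """Split a raw HTTP/1.x message into (first_line, headers, body)."""
    # The mandatory blank line shows up as two consecutive CRLF pairs.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    first_line, header_lines = lines[0], lines[1:]
    headers = []
    for line in header_lines:
        # Each header is "Name: value" on its own line.
        name, _, value = line.partition(":")
        headers.append((name.strip(), value.strip()))
    return first_line, headers, body


# Hypothetical sample request, used only to exercise the function.
raw_request = (
    b"GET /index.html HTTP/1.1\r\n"
    b"Host: example.com\r\n"
    b"Accept: text/html\r\n"
    b"\r\n"
)
first_line, headers, body = split_message(raw_request)
print(first_line)
print(headers)
print(body)  # empty for this request, since GET carries no message body
```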
The first line of every HTTP request consists of three
items, separated by spaces:
- A verb indicating the HTTP method. The most commonly used method is GET,
whose function is to retrieve a resource from the web server. GET requests
do not have a message body, so there is no further data following the blank
line after the message headers.
- The requested URL. The URL functions as a name for the resource being
requested, together with an optional query string containing parameters
that the client is passing to that resource.
- The HTTP version being used. The only HTTP versions in common use on the
Internet are 1.0 and 1.1, and most browsers use version 1.1 by default.
There are a few differences between the specifications of these two
versions; however, the only difference you are likely to encounter when
attacking web applications is that in version 1.1 the Host request header
is mandatory.
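As a rough sketch, the three items on the first line can be pulled apart as follows. The helper names parse_request_line and host_header_missing and the sample URL are assumptions made for this example; the query string handling relies on the standard urllib.parse module.

```python
from urllib.parse import parse_qs, urlsplit


def parse_request_line(line: str) -> dict:
    """Split 'METHOD URL VERSION' into its three space-separated items."""
    method, target, version = line.split(" ")
    parts = urlsplit(target)
    return {
        "method": method,
        "path": parts.path,
        # The optional query string carries the parameters for the resource.
        "params": parse_qs(parts.query),
        "version": version,
    }


def host_header_missing(version: str, header_names) -> bool:
    """True when an HTTP/1.1 request lacks the mandatory Host header."""
    names = {name.lower() for name in header_names}
    return version == "HTTP/1.1" and "host" not in names


# Hypothetical request line, used only for illustration.
info = parse_request_line("GET /search?q=http&page=2 HTTP/1.1")
print(info)
print(host_header_missing(info["version"], ["Accept", "User-Agent"]))
```

A real server is stricter than this sketch; for example, it would reject a request line that does not contain exactly three items rather than raise an exception.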
HTTP Responses
The first line of every HTTP response consists of three
items, separated by spaces:
- The HTTP version being used.
- A numeric status code indicating the result of the request. 200 is the
most common status code; it means that the request was successful and the
requested resource is being returned.
- A textual “reason phrase” further describing the status of the response.
This can have any value and is not used for any purpose by current
browsers.
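The first line of a response can be split in the same way. The sketch below is an assumption-laden illustration (parse_status_line is a name invented here); it splits on the first two spaces only, because the reason phrase may itself contain spaces.

```python
def parse_status_line(line: str):
    """Split 'VERSION CODE REASON' into (version, code, reason)."""
    parts = line.split(" ", 2)
    version, code = parts[0], int(parts[1])
    # The reason phrase is free text and may be empty or absent.
    reason = parts[2] if len(parts) > 2 else ""
    return version, code, reason


version, code, reason = parse_status_line("HTTP/1.1 404 Not Found")
print(version, code, reason)
```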
HTTP Methods
The two main methods are GET and POST. There are others such
as the following:
- HEAD - This functions in the same way as a GET request, except that the
server should not return a message body in its response. The server should
return the same headers that it would have returned to the corresponding
GET request. Hence, this method can be used for checking whether a resource
is present before making a GET request for it.
- TRACE - This method is designed for diagnostic purposes. The server should
return in the response body the exact contents of the request message that
it received. This can be used to detect the effect of any proxy servers
between the client and server that may manipulate the request. It can also
sometimes be used as part of an attack against other application users.
- OPTIONS - This method asks the server to report the HTTP methods that are
available for a particular resource. The server will typically return a
response containing an Allow header that lists the available methods.
- PUT - This method attempts to upload the specified resource to the server,
using the content contained in the body of the request. If this method is
enabled, then you may be able to leverage it to attack the application; for
example, by uploading an arbitrary script and executing it on the server.
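When reviewing the response to an OPTIONS request, it can help to pick out the methods discussed above from the Allow header. The following is a minimal sketch; the function name allowed_methods, the INTERESTING set, and the sample header value are assumptions made for this example, not output from any real server.

```python
def allowed_methods(allow_header: str) -> list:
    """Turn an Allow header value such as 'GET, HEAD' into a list."""
    return [m.strip().upper() for m in allow_header.split(",") if m.strip()]


# Methods that the notes above single out as potentially useful to an attacker.
INTERESTING = {"PUT", "TRACE"}

# Hypothetical Allow header value.
methods = allowed_methods("GET, HEAD, OPTIONS, PUT")
flagged = sorted(INTERESTING.intersection(methods))
print(methods)
print(flagged)
```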
[NOTE]
The correct technical term for a URL is actually URI (uniform resource
identifier), but this term is really only used in formal specifications.
HTTP Headers
General Headers
- Connection - This is used to inform the other end of the communication
whether it should close the TCP connection after the HTTP transaction has
completed or keep it open for further messages.
- Content-Encoding - This is used to specify what kind of encoding is being
used for the content contained in the message body, such as gzip, which is
used by some applications to compress responses for faster transmission.
- Content-Length - This is used to specify the length of the message body,
in bytes (except in the case of responses to HEAD requests, when it
indicates the length of the body in the response to the corresponding GET
request).
- Content-Type - This is used to specify the type of content contained in
the message body; for example, text/html for HTML documents.
- Transfer-Encoding - This is used to specify any encoding that was
performed on the message body to facilitate its transfer over HTTP. It is
normally used to specify chunked encoding when this is employed.
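To make chunked encoding more concrete: each chunk is sent as a hexadecimal size on its own line, followed by that many bytes of data, and a chunk of size zero ends the body. The decoder below is a simplified sketch under that assumption; decode_chunked is a name chosen for this example, and trailer headers after the final chunk are ignored.

```python
def decode_chunked(body: bytes) -> bytes:
    """Reassemble a message body sent with Transfer-Encoding: chunked."""
    decoded = b""
    pos = 0
    while True:
        line_end = body.index(b"\r\n", pos)
        # The size line may carry ';extension' data, which is dropped here.
        size = int(body[pos:line_end].split(b";")[0], 16)
        if size == 0:
            break  # a zero-length chunk marks the end of the body
        start = line_end + 2
        decoded += body[start:start + size]
        pos = start + size + 2  # skip the CRLF that follows the chunk data
    return decoded


# Hypothetical chunked body made of two chunks and the terminating chunk.
chunked = b"4\r\nWiki\r\n5\r\npedia\r\n0\r\n\r\n"
payload = decode_chunked(chunked)
print(payload)
print(len(payload))  # the value Content-Length would have carried, in bytes
```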
Request Headers
- Accept - This is used to tell the server what kinds of content the client
is willing to accept, such as image types, office document formats, and so
on.
- Accept-Encoding - This is used to tell the server what kinds of content
encoding the client is willing to accept.
- Authorization - This is used to submit credentials to the server for one
of the built-in HTTP authentication types.
- Cookie - This is used to submit cookies to the server which were
previously issued by it.
- Host - This is used to specify the hostname that appeared in the full URL
being requested.
- If-Modified-Since - This is used to specify the time at which the browser
last received the requested resource. If the resource has not changed since
that time, the server may instruct the client to use its cached copy, using
a response with status code 304.
- If-None-Match - This is used to specify an entity tag, which is an
identifier denoting the contents of the message body. The browser submits
the entity tag that the server issued with the requested resource when it
was last received. The server can use the entity tag to determine whether
the browser may use its cached copy of the resource.
- Referer - This is used to specify the URL from which the current request
originated.
- User-Agent - This is used to provide information about the browser or
other client software that generated the request.
Response Headers
- Cache-Control - This is used to pass caching directives to the browser
(for example, no-cache).
- ETag - This is used to specify an entity tag. Clients can submit this
identifier in future requests for the same resource in the If-None-Match
header to notify the server which version of the resource the browser
currently holds in its cache.
- Expires - This is used to instruct the browser how long the contents of
the message body are valid for. The browser may use the cached copy of this
resource until this time.
- Location - This is used in redirection responses (those with a status code
starting with a 3) to specify the target of the redirect.
- Pragma - This is used to pass caching directives to the browser (for
example, no-cache).
- Server - This is used to provide information about the web server software
being used.
- Set-Cookie - This is used to issue cookies to the browser that it will
submit back to the server in subsequent requests.
- WWW-Authenticate - This is used in responses with a 401 status code to
provide details of the type(s) of authentication supported by the server.
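The interplay between ETag, If-None-Match, and If-Modified-Since can be sketched as a server-side decision. This is a simplified illustration, not how any particular server implements it: conditional_status is a hypothetical helper, entity tags are compared as plain strings, and the resource's modification time is assumed to be a timezone-aware datetime.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def conditional_status(request_headers: dict, etag: str,
                       last_modified: datetime) -> int:
    """Return 304 if the client's cached copy is still current, else 200."""
    if_none_match = request_headers.get("If-None-Match")
    if if_none_match is not None:
        # The client may list several entity tags, separated by commas.
        client_tags = [tag.strip() for tag in if_none_match.split(",")]
        return 304 if etag in client_tags or "*" in client_tags else 200
    since = request_headers.get("If-Modified-Since")
    if since is not None and last_modified <= parsedate_to_datetime(since):
        return 304
    return 200


# Hypothetical resource state, used only for illustration.
current_etag = '"v2"'
modified = datetime(2015, 10, 21, 7, 28, 0, tzinfo=timezone.utc)

print(conditional_status({"If-None-Match": '"v2"'}, current_etag, modified))
print(conditional_status({"If-None-Match": '"v1"'}, current_etag, modified))
```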
Cookies
The cookie mechanism enables the server to send items of data to the client,
which the client stores and resubmits to the server. In addition to the
cookie's value, the Set-Cookie header can include the following optional
attributes:
- Expires - Used to set a date until which the cookie is valid. This will
cause the browser to save the cookie to persistent storage, and it will be
reused in subsequent browser sessions until the expiration date is reached.
If this attribute is not set, the cookie is used only in the current browser
session.
- Domain - Used to specify the domain for which the cookie is valid. This
must be the same as, or a parent of, the domain from which the cookie is
received.
- Path - Used to specify the URL path for which the cookie is valid.
- Secure - If this attribute is set, the cookie will only ever be submitted
in HTTPS requests.
- HttpOnly - If this attribute is set, then the cookie cannot be directly
accessed via client-side JavaScript, although not all browsers support this
restriction.
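The attributes above can be inspected with Python's standard http.cookies module. In the sketch below, the cookie name, value, and domain are invented for the example, and domain_matches is a hypothetical helper that approximates the "same or parent domain" rule; real browsers apply further checks that are not modelled here.

```python
from http.cookies import SimpleCookie

# Hypothetical Set-Cookie header value.
header_value = "sessionid=abc123; Domain=example.com; Path=/app; Secure; HttpOnly"

jar = SimpleCookie()
jar.load(header_value)
morsel = jar["sessionid"]

attrs = {
    "value": morsel.value,
    "domain": morsel["domain"],
    "path": morsel["path"],
    "secure": bool(morsel["secure"]),      # flag attributes carry no value
    "httponly": bool(morsel["httponly"]),
}
print(attrs)


def domain_matches(request_host: str, cookie_domain: str) -> bool:
    """True if cookie_domain equals request_host or is a parent of it."""
    cookie_domain = cookie_domain.lstrip(".").lower()
    request_host = request_host.lower()
    return (request_host == cookie_domain
            or request_host.endswith("." + cookie_domain))


print(domain_matches("www.example.com", attrs["domain"]))
print(domain_matches("badexample.com", attrs["domain"]))
```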
Status Codes
Each HTTP response message must contain a status code in its
first line, indicating the result of the request. The status codes fall into
five groups, according to the first digit of the code.
1xx – Informational.
2xx – The request was successful.
3xx – The client is redirected to a different resource.
4xx – The request contains an error of some kind.
5xx – The server encountered an error fulfilling the
request.
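Because the group is determined by the first digit alone, classifying a status code takes only an integer division. The mapping below paraphrases the five groups listed above; the function name status_class is an assumption made for this example.

```python
STATUS_CLASSES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def status_class(code: int) -> str:
    """Map a three-digit status code to its group via the first digit."""
    return STATUS_CLASSES.get(code // 100, "unknown")


for code in (100, 200, 302, 404, 503):
    print(code, status_class(code))
```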
Some of the common Status Codes you will encounter when
trying to hack a web application are as follows:
100 Continue – This response is sent in some circumstances
when a client submits a request containing a body. The response indicates that
the request headers were received and that the client should continue sending
the body. The server will then return a second response when the request has
been completed.
200 OK – This indicates that the request was successful and
the response body contains the result of the request.
201 Created – This is returned in response to a PUT request
to indicate that the request was successful.
301 Moved Permanently – This redirects the browser
permanently to a different URL, which is specified in the Location header. The
client should use the new URL in the future rather than the original.
302 Found – This redirects the browser temporarily to a
different URL, which is specified in the Location header. The client should
revert to the original URL in subsequent requests.
304 Not Modified – This instructs the browser to use its
cached copy of the requested resource. The server uses the If-Modified-Since
and If-None-Match request headers to determine whether the client has the
latest version of the resource.
400 Bad Request – This indicates that the client submitted
an invalid HTTP request. You will probably encounter this when you have
modified a request in certain invalid ways, for example by placing a space
character into the URL.
401 Unauthorized – The server requires HTTP authentication
before the request will be granted. The WWW-Authenticate header contains
details of the type(s) of authentication supported.
403 Forbidden – This indicates that no one is allowed to
access the requested resource, regardless of authentication.
404 Not Found – This indicates that the requested resource
does not exist.
405 Method Not Allowed – This indicates that the method used
in the request is not supported for the specified URL. For example, you may
receive this status code if you attempt to use the PUT method where it is not
supported.
413 Request Entity Too Large – If you are probing for buffer
overflow vulnerabilities in native code, and so submitting long strings of
data, this indicates that the body of your request is too large for the server
to handle.
414 Request URI Too Long – Similar to the previous response,
this indicates that the URL used in the request is too large for the server to
handle.
500 Internal Server Error – This indicates that the server
encountered an error fulfilling the request. This normally occurs when you have
submitted unexpected input that caused an unhandled error somewhere within the
application’s processing. You should review the full contents of the server’s
response closely for any details indicating the nature of the error.
503 Service Unavailable – This normally indicates that,
although the web server itself is functioning and able to respond to requests,
the application accessed via the server is not responding. You should verify
whether this is the result of any action that you have performed.
[NOTE]
Strictly speaking, SSL has been superseded by Transport Layer Security
(TLS), but the latter is still normally referred to using the older name.
HTTP Authentication
The HTTP protocol includes its own mechanisms for
authenticating users, using various authentication schemes, including:
Basic – This is a very simple authentication mechanism that sends user
credentials as a Base64-encoded string in a request header with each message.
NTLM – This is a challenge-response mechanism and uses a version of the
Windows NTLM protocol.
Digest – This is a challenge-response mechanism and uses MD5 checksums of a
nonce with the user’s credentials.
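The Basic and Digest schemes can be illustrated with the standard base64 and hashlib modules. This is a sketch under stated assumptions: the function names are invented here, the credentials are sample values, and the Digest calculation shows only the original form of the scheme (no qop, cnonce, or nonce count), so it should not be taken as a complete client implementation.

```python
import base64
import hashlib


def basic_authorization(username: str, password: str) -> str:
    """Build the Authorization header value for Basic authentication."""
    credentials = f"{username}:{password}".encode("utf-8")
    # Base64 is an encoding, not encryption: anyone can reverse it.
    return "Basic " + base64.b64encode(credentials).decode("ascii")


def md5_hex(text: str) -> str:
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def digest_response(username: str, realm: str, password: str,
                    method: str, uri: str, nonce: str) -> str:
    """Compute a Digest response in its simplest form (assumes no qop)."""
    ha1 = md5_hex(f"{username}:{realm}:{password}")
    ha2 = md5_hex(f"{method}:{uri}")
    # The server-supplied nonce is mixed in, so the password is never sent.
    return md5_hex(f"{ha1}:{nonce}:{ha2}")


# Sample credentials, used only for illustration.
header = basic_authorization("Aladdin", "open sesame")
print(header)
print(base64.b64decode(header.split(" ")[1]))  # recovers the credentials

print(digest_response("user", "example realm", "secret",
                      "GET", "/index.html", "abc123nonce"))
```

Decoding the Basic header back to the original credentials shows why this scheme offers no protection unless the connection itself is encrypted.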