I must have had some reason for wanting to do this, although I can’t think of why right now. curl is an excellent tool for ad hoc HTTP requests.
On a server running Apache 2.4.6, first I tried:
# nc 127.0.0.1 80
GET / HTTP/1.1
That returned an HTTP/1.1 400 Bad Request error.
Next I tried:
# printf "GET /index.html HTTP/1.1\r\n\r\n" | nc 127.0.0.1 80
That also returned an HTTP/1.1 400 Bad Request error.
I decided to take a look at what curl was sending, since that was working:
# curl -v http://127.0.0.1
* About to connect() to 127.0.0.1 port 80 (#0)
* Trying 127.0.0.1...
* Connected to 127.0.0.1 (127.0.0.1) port 80 (#0)
> GET / HTTP/1.1
> User-Agent: curl/7.29.0
> Host: 127.0.0.1
> Accept: */*
...
I put the same headers (with a modified User-Agent) into my printf statement:
# printf "GET /index.html HTTP/1.1\r\nUser-Agent: nc/0.0.1\r\nHost: 127.0.0.1\r\nAccept: */*\r\n\r\n" | nc 127.0.0.1 80
HTTP/1.1 200 OK
Date: Sun, 28 Jan 2018 23:11:04 GMT
Server: Apache/2.4.6 (CentOS) PHP/5.4.16
Last-Modified: Sun, 28 Jan 2018 20:10:37 GMT
ETag: "78-563dbb912bfe0"
Accept-Ranges: bytes
Content-Length: 120
Content-Type: text/html; charset=UTF-8
<!DOCTYPE html>
<html>
<head>
<title>well that worked</title>
</head>
<body>
<h1>apache is running</h1>
</body>
</html>
That worked!
I eliminated the User-Agent and Accept headers and it still worked, so the missing Host header was the cause of my problems. I swear I’ve done this before without a Host header, though.
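Here is a minimal sketch of the smallest request that worked, wrapped in a helper so the Host header does not get forgotten. The function name build_request and its two arguments are my own illustration, not part of nc or Apache.

```shell
# build_request PATH HOST: print a minimal HTTP/1.1 GET request on stdout.
# Hypothetical helper; the name and arguments are illustrative only.
build_request() {
    # The request line and each header end in CRLF;
    # an empty line (a bare CRLF) ends the header block.
    printf 'GET %s HTTP/1.1\r\nHost: %s\r\n\r\n' "$1" "$2"
}

# Usage (assumes a web server is listening on 127.0.0.1:80):
# build_request /index.html 127.0.0.1 | nc 127.0.0.1 80
```

Keeping the request in one printf format string makes the CRLF line endings explicit, which is easy to get wrong when typing into nc by hand.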
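I looked up the HTTP/1.1 specification, RFC 2616. Section 14.23 says a server must respond with 400 (Bad Request) to any HTTP/1.1 request that lacks a Host header, and section 5.2 describes how the server determines the requested host: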
1. If Request-URI is an absoluteURI, the host is part of the Request-URI. Any Host header field value in the request MUST be ignored.
2. If the Request-URI is not an absoluteURI, and the request includes a Host header field, the host is determined by the Host header field value.
3. If the host as determined by rule 1 or 2 is not a valid host on the server, the response MUST be a 400 (Bad Request) error message.
Recipients of an HTTP/1.0 request that lacks a Host header field MAY attempt to use heuristics (e.g., examination of the URI path for something unique to a particular host) in order to determine what exact resource is being requested.
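The first two rules quoted above can be sketched as a small shell function. This is only my reading of the quoted text, not how Apache implements it; the name effective_host is hypothetical, and the sketch only recognizes http:// absolute URIs.

```shell
# effective_host REQUEST_URI [HOST_HEADER]: print the host the quoted rules select.
# Hypothetical helper for illustration; it only handles http:// absolute URIs.
effective_host() {
    uri=$1
    host_header=${2-}
    case $uri in
        http://*)
            # Rule 1: the Request-URI is absolute, so the Host header is ignored.
            rest=${uri#http://}
            printf '%s\n' "${rest%%/*}"
            ;;
        *)
            # Rule 2: otherwise the host comes from the Host header, if there is one.
            printf '%s\n' "$host_header"
            ;;
    esac
}

# Usage:
# effective_host http://www.w3.org/pub/WWW/TheProject.html z   # prints www.w3.org
# effective_host /index.html z                                 # prints z
```

Rule 3 (answering 400 when the result is not a valid host on the server) is left out, since what counts as a valid host depends on the server's configuration.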
I could not get it to work with an absoluteURI, even using the example in the RFC. However, I did find that I could omit the Host header if I specified HTTP/1.0:
# printf "GET / HTTP/1.0\r\n\r\n" | nc 127.0.0.1 80
I also found that Apache didn’t care what the Host header was when using HTTP/1.1, just so long as something was there:
# printf "GET / HTTP/1.1\r\nHost: z\r\n\r\n" | nc 127.0.0.1 80
That’s a little odd. I did not specify a ServerName in my Apache config, but even after I specified ServerName 127.0.0.1:80 in /etc/httpd/conf/httpd.conf and restarted Apache, it still required the Host header and it still didn’t care what the content of the Host header was (so long as it was not empty).
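To compare the variants side by side, it helps to print only the status line of each response. This is a rough sketch; status_line is a name I made up, and the loop in the usage comment assumes the same server on 127.0.0.1:80.

```shell
# status_line: read an HTTP response on stdin and print its first line
# with the trailing carriage return removed.
# Hypothetical helper; the name is illustrative only.
status_line() {
    head -n 1 | tr -d '\r'
}

# Usage (assumes a web server is listening on 127.0.0.1:80):
# for req in 'GET / HTTP/1.1\r\n\r\n' \
#            'GET / HTTP/1.1\r\nHost: z\r\n\r\n' \
#            'GET / HTTP/1.0\r\n\r\n'; do
#     printf "$req" | nc 127.0.0.1 80 | status_line
# done
```

Each request string is passed to printf as the format so that the \r\n escapes are expanded; that is fine here only because none of the strings contains a percent sign.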