Using nc (netcat) to make an HTTP request

I must have had some reason for wanting to do this, although I can’t think of why right now. curl is an excellent tool for ad hoc HTTP requests.

On a server running Apache 2.4.6, first I tried:

# nc 127.0.0.1 80
GET / HTTP/1.1

Which returned a HTTP/1.1 400 Bad Request error.

Next I tried:

# printf "GET /index.html HTTP/1.1\r\n\r\n" | nc 127.0.0.1 80

Which also returned a HTTP/1.1 400 Bad Request error.

I decided to take a look at what curl was sending, since that was working:

# curl -v http://127.0.0.1
* About to connect() to 127.0.0.1 port 80 (#0)
* Trying 127.0.0.1...
* Connected to 127.0.0.1 (127.0.0.1) port 80 (#0)
> GET / HTTP/1.1
> User-Agent: curl/7.29.0
> Host: 127.0.0.1
> Accept: */*
...

I put the same headers (with a modified User-Agent) into my printf statement:

# printf "GET /index.html HTTP/1.1\r\nUser-Agent: nc/0.0.1\r\nHost: 127.0.0.1\r\nAccept: */*\r\n\r\n" | nc 127.0.0.1 80
HTTP/1.1 200 OK
Date: Sun, 28 Jan 2018 23:11:04 GMT
Server: Apache/2.4.6 (CentOS) PHP/5.4.16
Last-Modified: Sun, 28 Jan 2018 20:10:37 GMT
ETag: "78-563dbb912bfe0"
Accept-Ranges: bytes
Content-Length: 120
Content-Type: text/html; charset=UTF-8

<!DOCTYPE html>
<html>
<head>
<title>well that worked</title>
</head>
<body>
<h1>apache is running</h1>
</body>
</html>

That worked!

I eliminated the User-Agent the Accept headers and it still worked, so the missing Host header was the cause of my problems. I swear I’ve done this before without a Host header though.

I looked up the HTTP specification, and as described in section 5.2 of the RFC:

1. If Request-URI is an absoluteURI, the host is part of the Request-URI. Any Host header field value in the request MUST be ignored.

2. If the Request-URI is not an absoluteURI, and the request includes a Host header field, the host is determined by the Host header field value.

3. If the host as determined by rule 1 or 2 is not a valid host on the server, the response MUST be a 400 (Bad Request) error message.

Recipients of an HTTP/1.0 request that lacks a Host header field MAY attempt to use heuristics (e.g., examination of the URI path for something unique to a particular host) in order to determine what exact resource is being requested.

I could not get it to work with an absoluteURI, even using the example in the RFC. However I did find that I could ignore the Host header if I specified HTTP/1.0:

# printf "GET / HTTP/1.0\r\n\r\n" | nc 127.0.0.1 80

I also found that Apache didn’t care what the Host header was when using HTTP/1.1, just so long as something was there:

# printf "GET / HTTP/1.1\r\nHost: z\r\n\r\n" | nc 127.0.0.1 80

That’s a little odd. I did not specify a ServerName in my Apache config, but even after I specified ServerName 127.0.0.1:80 in /etc/httpd/conf/httpd.conf and restarted Apache, it still required the Host header and it still didn’t care what the content of the Host header was (so long as it was not empty).

Making a Bootable USB from Mac OSX

I’m running Mac OS Sierra and I needed to make a bootable CentOS 7 USB stick.

I downloaded the minimal ISO and proceeded to follow the instructions at How to Copy an ISO to a USB Drive from Mac OS X with dd, but it never worked. The server never recognized the USB stick as valid media.

At first, I thought it might have had to do with the formatting of the USB stick, which was FAT32. So I tried Mac OS Extended and Extended FAT, but that didn’t help either.

As mentioned by a couple of the comments on that page, I tried writing to disk2 instead of disk2s1 (keep in mind that the USB key on your system may be a different disk — use diskutil list to help identify it):

$ diskutil unmount disk2s1
disk2s1 was already unmounted
$ sudo dd if=~/Downloads/CentOS-7-x86_64-Minimal-1611.iso of=/dev/rdisk2 bs=1m

Once it finishes copying, it should look like this:
$ diskutil list

/dev/disk2 (external, physical):
#: TYPE NAME SIZE IDENTIFIER
0: FDisk_partition_scheme *16.1 GB disk2
1: 0xEF 6.4 MB disk2s2

Block an IP address via iptables

I was monitoring the mail logs on a Postfix server and noted repeated failed connection attempts from the same IP address. The source was likely up to no good, and it was making it more difficult to monitor the logs for legitimate connections, so I decided to block it:

iptables -A INPUT -s 123.456.789.101 -j DROP

(IP address changed to protect…the innocent?)

However, the IP address was still making connections:
Dec 2 17:19:05 mercutio postfix/smtpd[15230]: connect from unknown[123.456.789.101]
Dec 2 17:19:06 mercutio postfix/smtpd[15230]: lost connection after AUTH from unknown[123.456.789.101]
Dec 2 17:19:06 mercutio postfix/smtpd[15230]: disconnect from unknown[123.456.789.101]

How is that possible? First I checked iptables to check my sanity and confirm that the rule had been added:

# iptables -L
...
DROP all -- 123.456.789.101 anywhere
...

OK, it’s there. That’s good!

The problem in this case was a different rule that had been added previously. Rules in iptables are processed in order, and no further rules are processed after a matching rule is found. Well above my newly-added rule was this rule:
ACCEPT tcp -- anywhere anywhere state NEW tcp dpt:smtp

That rule makes sense for a mail server, but I needed my rule to be inserted before it. I determined which rule it was in the INPUT chain like this:
iptables --line-numbers -L INPUT

It was the 5th rule, so I was able to insert the new rule just above it like this:
iptables -I INPUT 4 -s 123.456.789.101 -j DROP

After that, the offending IP address stopped creating entries in the mail.log.

However, my new rule would disappear after a system restart. Since I am using iptables-persistent, I saved the rules to the config file:
iptables-save > /etc/iptables/rules.v4

To confirm everything worked, I attempted to restart iptables:
# service iptables-persistent restart
Failed to restart iptables-persistent.service: Unit iptables-persistent.service

Apparently the service name changed to netfilter-persistent in Debian 8. The config files are still in the same location, but the service name has changed.

I restarted iptables:
# service netfilter-persistent restart

I checked the rules again and my new rule was there, above the rule allowing connections from any IP on port 25. However, I also noticed the following rule above either of those:
ACCEPT all -- anywhere anywhere

I freaked out. That rule indicates that all traffic from any source on any port should be accepted. That’s the worst firewall rule I’ve ever seen. It basically negates the entire concept of a firewall. It clearly should not be there!

However, using the verbose switch on iptables:
iptables -vL INPUT

I discovered that the rule only applied to the lo interface (loopback). That’s a relief–that rule gets to stay.

iptables and deleting/replacing entries

Whenever I have to reboot my modem [sic] at home, I typically get a new IP address from my ISP.

When that happens, I need to update iptables to allow my new address to connect to the SSH port (port 22) of my jump box (which, fortunately, I have access to from another IP address):

iptables -A INPUT -p tcp -m state --state NEW -s [new IP address] --dport 22 -j ACCEPT

But I don’t want to leave the old entry. How to get rid of it?

The delete (-D) and replace (-R) options require a line number from the chain (e.g. the INPUT chain). To find the line numbers:

iptables -L INPUT --line-numbers

To delete the existing rule and add the new rule:

iptables -D INPUT [line number]
iptables -A INPUT -p tcp -m state --state NEW --dport 22 -s [new IP address] -j ACCEPT

To replace the existing entry:

iptables -R INPUT [line number] -p tcp -m state --state NEW --dport 22 -s [new IP address] -j ACCEPT

Save the updates so they are persistent:

iptables-save > /etc/iptables/rules.v4

(That’s the location for Debian and Ubuntu. This may be different for your distribution.)

3 ways to iterate over lines of a file in Linux

Frequently I need to run a process for each item in a list, stored in a text file one item per line: usernames, filenames, e-mail addresses, etc. Obviously there are more than 3 ways to do this, but here are 3 I have found useful:

Bash
sh prog1.sh list.txt

Source: prog1.sh

while read line
do
    echo $line
done < $1

4 lines. Not bad.

Perl
perl prog2.pl list.txt

Source: prog2.pl

while(<>) {
    print `echo $_`;
}

3 lines. Pretty good.

Perl -n
perl -n prog3.pl list.txt

Source: prog3.pl

print `echo $_`;

1 line! The -n switch basically wraps your Perl code in a loop that processes each line of the input file. I just discovered this while flipping through my 17-year-old copy of Programming Perl (link is to a newer edition).

I really like this method because you can write a script that processes a single input that could easily be reused by another script, but can also easily be used to process an entire list by adding just the -n switch. (There’s also a similar -p switch that does the same thing, but additionally prints out each line.)

I should note that in the examples above, I am using echo as a substitute for any command external to the script itself. In the Perl examples, there would be no need to call echo to merely print the contents of the line, but it’s a convenient stand-in for a generic command.

As suggested by a comment on a previous post, I have made these examples available in a git repository: iterate over lines.

Removing exceptions from a list using Bash (with sed and awk)

  • I have a CSV file, a list of 1000+ users and user properties.
  • I have a list of exceptions (users to be excluded from processing), one user per line, about 50 total.

How can I remove the exceptions from the list?

# make a copy of the original list
cp list-of-1000.csv list-of-1000-less-exceptions.csv
# loop through each line in exceptions.txt and remove matching lines from the copy
while read line; do sed -i "/${line}/d" list-of-1000-less-exceptions.csv; done < exceptions.txt

This is a little simplistic and could be a problem if any usernames are subsets of other usernames. (For example, if user ‘bob’ is on the list of exceptions, but the list of users also contains ‘bobb’, both would be deleted.)

In the particular instance I am dealing with, the username is conveniently the first field in the CSV file. This allows me to match the start of the line and the comma following the username:

while read line; do sed -i "/^${line},/d" list-of-1000-less-exceptions.csv; done < exceptions.txt

What if the username was the third field in the CSV instead of the first?

Use awk:
awk -F, -vOFS=, '{print $3,$0}' list-of-exceptions.csv > copy-of-list-of-exceptions.csv

  • -F, sets the field separator to a comma (defaults to whitespace)
  • -vOFS=, sets the Output Field Separator (OFS) to a comma (defaults to a space)
  • $3 prints the third field
  • $0 prints all the fields, with the specified field separator between them

while read line; do sed -i "/^${line},/d" copy-of-1000-less-exceptions.csv; done < exceptions.txt

Now there’s still an extra username in this file. Maybe that doesn’t matter, but maybe it does. There are several ways to remove it–here’s one:

awk -F, -vOFS=, '$1=""; print $0' copy-of-1000-less-exceptions.csv | sed 's/^,//' > list-of-1000-less-exceptions.csv

  • -F, sets the field separator to a comma (defaults to whitespace)
  • -vOFS=, sets the Output Field Separator (OFS) to a comma (defaults to a space)
  • $1="" sets the first field to an empty string
  • print $0 prints all the fields

The result of the awk command has an initial comma on each line. The first field is still there, it’s just set to an empty string. I used sed to remove it.

You could also use sed alone to remove the extra username field:
sed -i 's/^[^,]*,//' copy-of-1000-less-exceptions.csv

PowerShell Ellipsis (dot dot dot)

Sometimes when you retrieve an object via PowerShell, some of the properties are truncated, denoted by an ellipsis (“…”).

For example:
Get-Mailbox chris | Select AddressListMembership

AddressListMembership
---------------------
{\Staff Global Address List, \Staff, \IT Staff, \Exchange Admins...}

How do you see the full list? There are a couple ways:

Select -ExpandProperty
Get-Mailbox chris | Select -ExpandProperty AddressListMembership

$FormatEnumerationLimit =-1
This is a per-session variable in PowerShell. By default the value is 4, but if you change it to -1 it will enumerate all items. This will affect every property of every object, so it may be more than you need.

MySQL date_add and date_sub functions running against millions of rows

One of my servers runs a query once a week to remove all rows from a Syslog table (>20,000,000 rows) in a MySQL database that are older than 60 days. This was running terribly slowly and interfering with other tasks on the server.

Although the original query used a DELETE statement I’ve used SELECT statements in the examples below.

SELECT COUNT(*)
FROM SystemEvents
WHERE ReceivedAt < DATE_SUB(NOW(), INTERVAL 60 DAY);

That selects about 900,000 rows and takes about 45 seconds.

SELECT COUNT(*)
FROM SystemEvents
WHERE ReceivedAt < DATE_ADD(CURRENT_DATE, INTERVAL -60 DAY);

Likewise takes about 48 seconds.

Is MySQL running a function every time it makes a comparison? I decided to try using a hard-coded date to find out:

SELECT COUNT(*)
FROM SystemEvents
WHERE ReceivedAt < '2015-11-12 12:00:00';

6 seconds! Much faster.

I created a user-defined variable:
SET @sixty_days_ago = DATE_SUB(NOW(), INTERVAL 60 DAY);

Then ran the query:
SELECT COUNT(*)
FROM SystemEvents
WHERE ReceivedAt < @sixty_days_ago;

12 seconds. No 6 seconds, but still a fraction of the original time!

Holding messages in the Postfix mail queue

Earlier today, someone sent a large number of email messages each containing a 30 megabyte attachment to users on our servers. This put our Postfix servers under a heavy load and caused some messages to be delivered after a substantial delay. (This was in part due to additional processing done by our servers, I’m sure a plain-jane Postfix instance could have handled it without an issue.)

This was no good. The sender–let’s call it bigbulk.test.com–should be able to send such messages, but not at the expense of normal mail delivery. I needed to change the priority of those messages to let other messages take priority.

The first thing I did was to hold all the mail from bigbulk.test.com:

  • Retrieve the mail queue
  • Select only the lines containing bigbulk.test.com
  • Select only the queue ID, the first item listed in each result
  • Pass the queue IDs to the postsuper -h command

mailq | grep bigbulk.test.com | cut -d ' ' -f 1 | xargs -n1 postsuper -h

But what about delivering them? I sent them in small batches so as not to overload the server again.

  • Retrieve the mail queue
  • Select only the lines containing bigbulk.test.com
  • Select only the queue ID (stripping out the hold-indicator)
  • Select only the first 5 results
  • Pass the queue IDs to the postsuper -H command

mailq | grep bigbulk.test.com | cut -d '!' -f 1 | head -n5 | xargs -n1 postsuper -H