Converting a WordPress site to a static site using Wget

I recently made a YouTube tutorial on converting a WordPress site to a static HTML site. This blog post is a companion to the video.

First of all, why convert a WordPress site to a static HTML site? There are a number of reasons, but my primary concern is to reduce update fatigue. WordPress itself, along with its themes and plugins, receives frequent security updates. Many sites have stable content after an initial editing phase, and applying never-ending security updates to a site that doesn’t change doesn’t make sense.

The example site I used in the tutorial is www.stress2012.com, a site for an academic conference/workshop that was held in 2012. It’s 2024: the site content is not going to change.

To mirror the site, I used Wget with the following command:

wget -k -r http://www.stress2012.com
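
For readers less familiar with Wget's short options, here is the same command spelled with long options and wrapped in a function so it can be reused for other sites. This is only a sketch: the function name mirror_site is made up for illustration, and the two options are simply the long forms of -k and -r.

```shell
# mirror_site is a hypothetical helper name, not part of Wget.
# --convert-links is the long form of -k; --recursive is the long form of -r.
mirror_site() {
    wget --convert-links --recursive "$1"
}

# Usage (downloads into a directory named after the host):
#   mirror_site http://www.stress2012.com
```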

That created a www.stress2012.com directory in my current working directory. That directory contained all of the HTML files Wget found on the site, as well as all the non-HTML assets (images, stylesheets, JavaScript files, etc.).

The document root for the WordPress site was set up at /var/www/stress2012. I copied the site mirror to an adjacent location:

sudo cp -r www.stress2012.com /var/www/www.stress2012.com

The permissions were off after copying using sudo privileges, so I updated those:

sudo chown -R chris:www-data /var/www/www.stress2012.com

Then I copied the Apache web server config file for the WordPress site, /etc/apache2/sites-available/stress2012.conf, to an adjacent location:

sudo cp /etc/apache2/sites-available/stress2012.conf /etc/apache2/sites-available/stress2012-static.conf

The most critical change in that file was to update the document root to /var/www/www.stress2012.com.
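
I made that edit by hand, but it could also be scripted. Here is a minimal sketch, assuming GNU sed and a config file whose only DocumentRoot lines should all point at the new location. The function name set_document_root is hypothetical, and on a real server the call would need sudo.

```shell
# set_document_root FILE NEW_ROOT
# Hypothetical helper: rewrite every DocumentRoot directive in FILE to NEW_ROOT.
# Assumes GNU sed (for -i) and a NEW_ROOT containing no '|', '&' or backslash.
set_document_root() {
    sed -i -E "s|^([[:space:]]*DocumentRoot[[:space:]]+).*|\1$2|" "$1"
}

# Usage, where $conf holds the path of the copied config file:
#   set_document_root "$conf" /var/www/www.stress2012.com
```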

After that change was made, I disabled the WordPress site, enabled the static site, and reloaded the Apache web server:

sudo a2dissite stress2012.conf
sudo a2ensite stress2012-static.conf
sudo systemctl reload apache2

That’s it! The WordPress site has now been converted to a static site.

The WordPress site included a search form, which relied on PHP and MySQL. The form’s HTML is faithfully reproduced on the static site, but of course it is not functional. I examined the search form and found that its first line contained the presumably unique string “searchform” and that the rest of the form occupied the 4 lines after it.

I removed the search form from all the HTML pages on the static site using the following command:

cd /var/www/www.stress2012.com
find . -name '*.html' -exec sed -i '/searchform/,+4d' '{}' \;

That last command is not something you’d want to run on a production web site unless you’re feeling really, really confident. In this case, I knew I could revert to the WordPress site if anything went awry.
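
One way to build that confidence is to look before deleting. The sketch below, which assumes GNU grep and GNU sed, lists the HTML files that contain the marker string and prints the lines the deletion would remove, without modifying anything. The function name preview_searchform_removal is hypothetical.

```shell
# preview_searchform_removal DIR
# Hypothetical helper: for each .html file under DIR that contains "searchform",
# print the file name and the lines that '/searchform/,+4d' would delete.
# Read-only: it uses 'sed -n ... p' rather than 'sed -i ... d'.
preview_searchform_removal() {
    grep -rl --include='*.html' 'searchform' "$1" | while IFS= read -r file; do
        printf '== %s\n' "$file"
        sed -n '/searchform/,+4p' "$file"
    done
}

# Usage:
#   preview_searchform_removal /var/www/www.stress2012.com
```

If the preview shows anything other than the search form, the line count or the marker string needs adjusting before the real deletion is run.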

That’s one of the great things about this particular method of converting a WordPress site to a static site: it’s easy to re-enable the WordPress site and disable the static HTML site if needed.
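
Reverting is the earlier switch with the two site names swapped. Here is a minimal sketch using the same a2dissite, a2ensite, and systemctl commands as above, plus apache2ctl configtest as a syntax check before the reload. The function name switch_site is hypothetical.

```shell
# switch_site OLD NEW
# Hypothetical helper: disable site OLD, enable site NEW, then reload Apache
# only if the configuration passes a syntax check.
switch_site() {
    sudo a2dissite "$1" &&
    sudo a2ensite "$2" &&
    sudo apache2ctl configtest &&
    sudo systemctl reload apache2
}

# Usage: go back to the WordPress site
#   switch_site stress2012-static.conf stress2012.conf
```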

Are there other reasons you might want to use a static HTML site, aside from avoiding update fatigue?

3 thoughts on “Converting a WordPress site to a static site using Wget”

  1. It turns out, not everything was as seamless as it first appeared: some of the files were being loaded from my browser cache.

    Wget saved files like jquery.min.js?ver=3.7.1 exactly as they appeared in the URL, so the filename on disk is literally jquery.min.js?ver=3.7.1.

    That makes sense, but when the web server receives a request for jquery.min.js?ver=3.7.1, it treats ?ver=3.7.1 as a query string and looks for jquery.min.js on disk. That file doesn’t exist.

    I used the following to rename the files:

    find -name '*\?*' | \
    perl -lne '($old=$_) && s/\?.+$// && rename($old,$_)'
  2. In the original post (and the video) I had used the following wget command:

    wget -r http://www.stress2012.com

    I have revised the article to additionally use the -k option to convert links:

    wget -k -r http://www.stress2012.com

    This should rewrite links to files that Wget downloaded as relative links, dropping the protocol and fully-qualified domain name (FQDN) from the URL so that the pages reference the local copies.
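
The renaming step from the first comment above can also be done in plain shell. The sketch below assumes that file names contain no newlines, and it skips a rename rather than overwrite a file that already exists (for example, when two versions of the same script were downloaded). The function name strip_query_strings is hypothetical.

```shell
# strip_query_strings DIR
# Hypothetical helper: rename every file under DIR whose name contains '?'
# to the part of its name before the first '?'.
strip_query_strings() {
    find "$1" -type f -name '*\?*' | while IFS= read -r old; do
        dir=${old%/*}             # directory part of the path
        base=${old##*/}           # file name part of the path
        new=$dir/${base%%\?*}     # file name with the query string removed
        if [ -e "$new" ]; then
            printf 'skipping %s: %s already exists\n' "$old" "$new" >&2
        else
            mv -- "$old" "$new"
        fi
    done
}

# Usage:
#   strip_query_strings /var/www/www.stress2012.com
```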
