Download websites with wget
If you don't have it already, install wget with your package manager. On macOS I use `brew install wget`.
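On Linux the package is usually also just called `wget`; depending on your distribution, one of these should do it (adjust for your package manager):

```bash
sudo apt-get install wget   # Debian/Ubuntu
sudo dnf install wget       # Fedora
```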
Then, to download my wonderful website using `wget`, you can run the following command in your terminal or command prompt:
```bash
wget --recursive --no-clobber --page-requisites \
     --html-extension --convert-links \
     --restrict-file-names=windows --domains calmcode.co \
     --no-parent https://calmcode.co/
```
Let me break down the options used in this command:

- `--recursive`: Download the entire site.
- `--no-clobber`: Skip downloading files that already exist.
- `--page-requisites`: Download all the elements needed to properly display the page (images, stylesheets, etc.).
- `--html-extension`: Save HTML files with the `.html` extension.
- `--convert-links`: Convert links to make them suitable for offline viewing.
- `--restrict-file-names=windows`: Modify filenames to work on Windows.
- `--domains calmcode.co`: Limit downloads to the specified domain.
- `--no-parent`: Do not ascend to the parent directory.
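For what it's worth, `wget` also has a `--mirror` shorthand that bundles recursion with timestamping. A rough equivalent of the command above might look like the sketch below; verify the exact flag behaviour against `man wget` for your version.

```bash
# --mirror roughly expands to --recursive --timestamping --level=inf --no-remove-listing
# (timestamping takes the place of --no-clobber here, since the two flags conflict)
wget --mirror --page-requisites --html-extension --convert-links \
     --restrict-file-names=windows --domains calmcode.co \
     --no-parent https://calmcode.co/
```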
That takes about 5 seconds to download the entire site. Now you can open up the topmost `index.html` to read this site completely offline.
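By default the mirror lands in a directory named after the host, so opening it looks something like this (assuming you didn't change wget's directory options):

```bash
open calmcode.co/index.html      # macOS
xdg-open calmcode.co/index.html  # most Linux desktops
```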
The dependent images did download, but they didn't show up on the pages. I'm guessing it's something to do with the `<picture>` tag.
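If you want to check whether this is what's happening in your own mirror, one diagnostic sketch is to grep the downloaded pages for `srcset` attributes that `--convert-links` left pointing at the live site (the actual cause may differ):

```bash
# List srcset values in the downloaded pages; anything still referencing
# https://calmcode.co/ was not rewritten for offline use
grep -rho 'srcset="[^"]*"' --include='*.html' calmcode.co | sort -u | head
```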
While looking for a fix I found that there is a `wget2` in the making! I used Homebrew again (`brew install wget2`), deleted the downloaded files, and ran it with the same options:
```bash
wget2 --recursive --no-clobber --page-requisites \
      --html-extension --convert-links \
      --restrict-file-names=windows --domains calmcode.co \
      --no-parent https://calmcode.co/
```
Wow, `wget2` was fast. It took about 0.5 seconds and has nicer output. It also fixed the `<picture>` image `srcset` issue I had with `wget`.
Disclaimer: not all webmasters appreciate you crawling their sites for offline use. Please check a website's terms before going to town with this approach.
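One polite habit is to glance at the site's `robots.txt` first and to slow the crawl down. Something along these lines should work (a sketch; check `man wget` for the throttling flags):

```bash
# See what the site asks crawlers to avoid
curl https://calmcode.co/robots.txt

# Throttle the mirror: wait roughly one second (randomised) between requests
wget --recursive --wait=1 --random-wait --no-clobber --page-requisites \
     --html-extension --convert-links --restrict-file-names=windows \
     --domains calmcode.co --no-parent https://calmcode.co/
```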