What is the wget command?
wget is a popular Unix/Linux command-line utility for fetching content from the web. It is free to use and provides a non-interactive way to download files. wget supports the HTTPS, HTTP, and FTP protocols out of the box, and you can also use it with HTTP proxies.
How does wget help you troubleshoot?
There are many ways. As a sysadmin, you’ll spend most of your time on a terminal, and when troubleshooting web application issues, you may want to check just the connectivity rather than the entire page. Or you want to verify intranet websites. Or you want to download a certain page to verify its content.

wget is non-interactive, which means it can keep running even after you log off. There can be many instances where you have to disconnect from the system while retrieving files from the web; wget will carry on in the background and finish the assigned job.

It can also be used to fetch an entire website onto your local machine. It can follow links in XHTML and HTML pages to create a local version by downloading the pages recursively. This is very useful when you want to keep important pages or sites for offline viewing.

Let’s see it in action. The basic syntax of wget is as below.
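A minimal sketch of the general form, where [options] and [URL] are placeholders for the flags and addresses used throughout the rest of this article:

# general form: zero or more options followed by one or more URLs
wget [options] [URL]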
Download a webpage
Let’s try to download a page, for example, github.com. If connectivity is fine, it will download the homepage and show the output as below.
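A minimal example, assuming github.com is reachable from your machine; since no path is given, wget saves the page as index.html in the current directory:

# fetch the GitHub homepage; the result is written to index.html
wget github.com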
Download multiple files
Handy when you have to download multiple files at once. It can also give you an idea of how to automate file downloads through scripts. Let’s try to download the Python 3.8.1 and 3.5.1 files. So, as you can guess, the syntax is as below. You just have to separate the URLs with a space.
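A sketch assuming the standard python.org download paths for those two releases; swap in whichever URLs you need, separated by spaces:

# download two release tarballs in a single invocation
wget https://www.python.org/ftp/python/3.8.1/Python-3.8.1.tgz https://www.python.org/ftp/python/3.5.1/Python-3.5.1.tgz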
Limit download speed
This is useful when you want to check how long a file takes to download at different bandwidths. Using the --limit-rate option, you can cap the download speed. Here is the result of downloading a Node.js file. It took 0.05 seconds to download a 13.92 MB file. Now, let’s try to limit the speed to 500K. With the reduced bandwidth, the same download took 28 seconds. Imagine your users are complaining about slow downloads, and you know their network bandwidth is low. You can quickly use --limit-rate to simulate the issue.
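A sketch of both runs, assuming a Node.js release tarball URL (adjust the version and path to a real file on your side):

# unrestricted download
wget https://nodejs.org/dist/v12.14.1/node-v12.14.1-linux-x64.tar.xz

# same download capped at 500 KB/s
wget --limit-rate=500k https://nodejs.org/dist/v12.14.1/node-v12.14.1-linux-x64.tar.xz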
Download in the background
Downloading large files can take time, as can the above example where you set a rate limit. This is expected, but what if you don’t want to stare at your terminal? Well, you can use the -b argument to start wget in the background.
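A minimal sketch, reusing the hypothetical Node.js tarball URL from above; wget writes its progress to a wget-log file in the current directory:

# start the download in the background and return the prompt immediately
wget -b https://nodejs.org/dist/v12.14.1/node-v12.14.1-linux-x64.tar.xz

# check progress later
tail -f wget-log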
Ignore Certificate Error
This is handy when you need to check intranet web applications that don’t have a proper certificate. By default, wget will throw an error when a certificate is not valid. For example, when run against a URL whose certificate has expired, wget refuses to download the page and suggests using --no-check-certificate, which will ignore any cert validation. Cool, isn’t it?
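A sketch using expired.badssl.com, a public test endpoint with a deliberately expired certificate; any intranet URL with an invalid cert behaves the same way:

# fails with a certificate error and suggests --no-check-certificate
wget https://expired.badssl.com/

# retries while skipping certificate validation
wget --no-check-certificate https://expired.badssl.com/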
HTTP Response Header
You can see the HTTP response headers of a given site on the terminal. Using -S will print them, as you can see below for Coursera.
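A minimal sketch; note that -S prints the server response headers while still downloading the page itself:

# print the server response headers for Coursera
wget -S https://www.coursera.org

If you only care about the headers, you can add -O /dev/null to discard the downloaded body.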
Manipulate the User-Agent
There might be a situation where you want to connect to a site using a custom user-agent, or a specific browser’s user-agent. This is doable by specifying --user-agent. The below example uses MyCustomUserAgent as the user agent.
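A minimal sketch with example.com as a stand-in URL; replace the value with any user-agent string, such as a specific browser’s:

# send a custom User-Agent header with the request
wget https://example.com --user-agent="MyCustomUserAgent"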
Host Header
When an application is still in development, you may not have a proper URL to test it. Or you may want to test an individual HTTP instance using its IP, but you need to supply the Host header for the application to work properly. In this situation, --header would be useful. Let’s take an example of testing http://10.10.10.1 with the host header application.com. And not just Host, you can inject any header you like.
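A sketch using the IP and hostname from the example above; both are placeholders for your own environment:

# hit the instance by IP while presenting application.com as the Host header
wget --header="Host: application.com" http://10.10.10.1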
Connect using Proxy
If you are working in a DMZ environment, you may not have access to Internet sites, but you can take advantage of a proxy to connect. Don’t forget to update the $PROXYHOST:PORT variable with the actual values.
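A sketch using wget’s -e switch to set wgetrc-style proxy settings for a single run; $PROXYHOST:PORT and the target URL are placeholders:

# route the request through an HTTP/HTTPS proxy
wget -e use_proxy=yes -e http_proxy=$PROXYHOST:PORT -e https_proxy=$PROXYHOST:PORT https://example.com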
Connect using a specific TLS protocol
Usually, I would recommend using OpenSSL to test the TLS protocol. But you can use wget too.

wget --secure-protocol=TLSv1_2 https://example.com

The above will force wget to connect over TLS 1.2.
Conclusion
Knowing the necessary commands can help you at work. I hope the above gives you an idea of what you can do with wget.