Linux: the wget command

wget is open-source software, originally developed for Linux by Hrvoje Niksic and later ported to many platforms, including Windows. Its main features are:
(1) Resumable downloads. This was the biggest selling point of NetAnts and FlashGet back in the day; wget offers the same capability, so users with unreliable connections can download with confidence.
(2) Both FTP and HTTP download methods. Most software can be fetched over HTTP, but sometimes FTP is still required.
(3) Proxy support. On networks with strict security policies, systems are generally not exposed directly to the Internet, so proxy support is a must-have for a download tool.
(4) Simple, convenient setup. Users accustomed to graphical interfaces may find the command line unfamiliar at first, but it actually has advantages for configuration: far fewer mouse clicks, and no worry about clicking the wrong thing.
(5) Small and completely free. The program is tiny, and being completely free matters: much of the so-called free software on the Internet comes bundled with advertisements.

Although wget is powerful, it is quite simple to use. The basic syntax is: wget [parameter list] URL. The following concrete examples illustrate its usage.
1. Download the entire http or ftp site.
wget http://place.your.url/here
This command downloads the home page of http://place.your.url/here. The -x option forces wget to recreate the server's directory structure locally; the -nd option instead saves everything downloaded from the server into the current local directory.

The command wget -r http://place.your.url/here
recursively downloads all directories and files on the server, essentially fetching the entire website. Use this command with care: every link found during the download is followed, so a large site can take a very long time. (By default wget stays on the starting host; it only follows links to other hosts if -H is given.) The recursion depth can be limited with the -l number parameter; for example, -l 2 downloads only two levels.

If you want to make a mirror site, you can use the -m parameter, for example: wget -m http://place.your.url/here
then wget automatically chooses the appropriate options to build a mirror. In this mode, wget reads the server's robots.txt and obeys its rules.

2. Resume an interrupted download.
When a file is very large or the network is very slow, the connection is often cut off before the download finishes. wget resumes automatically; just add the -c parameter, for example:
wget -c http://the.url.of/incomplete/file
Resuming requires server support. The -t parameter sets the number of retries: to retry 100 times, write -t 100; -t 0 means retry indefinitely until the connection succeeds. The -T parameter sets the timeout: -T 120 means a connection attempt is abandoned after waiting 120 seconds.
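Putting these options together, a resumable fetch with bounded retries looks like the sketch below. The URL is the placeholder from above; since actually running it needs network access, the sketch only composes and prints the command:

```shell
# Resume a partial download (-c), retry up to 100 times (-t 100),
# and give up on a stalled connection after 120 seconds (-T 120).
# Hypothetical URL; the command is printed rather than executed here.
cmd='wget -c -t 100 -T 120 http://the.url.of/incomplete/file'
echo "$cmd"
```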

3. Batch download.
If there are several files to download, put their URLs in a file, one per line, say download.txt, and run: wget -i download.txt
wget then downloads each URL listed in download.txt in turn. (If a line points to a file, that file is downloaded; if it points to a website, the home page is downloaded.)
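As a sketch with made-up URLs, the list file can be generated like this; the final wget call is shown commented out, since it needs network access:

```shell
# Write one URL per line into download.txt (hypothetical URLs).
printf '%s\n' \
  'http://example.com/pkg/file1.tar.gz' \
  'http://example.com/pkg/file2.tar.gz' > download.txt

cat download.txt          # inspect the list
# wget -i download.txt    # fetch every URL in the list (needs network)
```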

4. Selective download.
You can tell wget to download only certain file types, or to skip certain file types. For example:
wget -m --reject=gif http://target.web.site/subdirectory
downloads http://target.web.site/subdirectory but skips gif files. --accept=LIST gives the accepted file types, --reject=LIST the rejected ones.

5. Password and authentication.
wget can handle websites that restrict access with a username and password, using two parameters:
--http-user=USER to set the HTTP user
--http-passwd=PASS to set the HTTP password
For websites that require certificate-based authentication, you will need another download tool, such as curl.
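A sketch of the syntax, with a hypothetical host and credentials; it only prints the command, since running it needs network access and a real account. (Newer wget releases spell the second option --http-password; --http-passwd is the older spelling.)

```shell
# Fetch from a password-protected site (hypothetical host and credentials).
cmd='wget --http-user=alice --http-passwd=secret http://protected.example.com/file.zip'
echo "$cmd"
```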

6. Use a proxy server to download.
If your network requires a proxy server, you can let wget download through it. To do so, create a .wgetrc file in the current user's home directory and set the proxy server in it:
http-proxy = 111.111.111.111:8080
ftp-proxy = 111.111.111.111:8080
These lines set the HTTP proxy and the FTP proxy respectively. If the proxy server requires a password, use:
--proxy-user=USER to set the proxy user
--proxy-passwd=PASS to set the proxy password
The parameter --proxy=on/off turns proxy use on or off (newer wget versions use --no-proxy to disable it).
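A minimal ~/.wgetrc combining the settings above might look like this. The proxy address is the placeholder from the text; in wgetrc files, '-' and '_' are interchangeable in setting names:

```
# Sample ~/.wgetrc (hypothetical proxy address)
use_proxy = on
http_proxy = http://111.111.111.111:8080/
ftp_proxy = http://111.111.111.111:8080/
```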
wget has many more useful features waiting to be explored.

Appendix:

Command format:
wget [parameter list] [target software, website URL]

-V, --version display the version number and exit
-h, --help display help information
-e, --execute=COMMAND execute a command as if it were part of .wgetrc

-o, --output-file=FILE write log messages to FILE
-a, --append-output=FILE append log messages to FILE
-d, --debug print debug output
-q, --quiet quiet mode (no output)
-i, --input-file=FILE read URLs from FILE

-t, --tries=NUMBER number of retries (0 means unlimited)
-O, --output-document=FILE save the download under a different file name
-nc, --no-clobber do not overwrite existing files
-N, --timestamping download only files newer than the local copy
-T, --timeout=SECONDS set the timeout
-Y, --proxy=on/off turn the proxy on or off

-nd, --no-directories do not create directories
-x, --force-directories force directory creation

--http-user=USER set HTTP user
--http-passwd=PASS set HTTP password
--proxy-user=USER set proxy user
--proxy-passwd=PASS set proxy password

-r, --recursive download entire website, directory (use with care)
-l, --level=NUMBER download level

-A, --accept=LIST acceptable file types
-R, --reject=LIST rejected file types
-D, --domains=LIST accepted domains
--exclude-domains=LIST rejected domains
-L, --relative follow relative links only
--follow-ftp follow FTP links found in HTML documents
-H, --span-hosts allow downloading from other hosts
-I, --include-directories=LIST allowed directories
-X, --exclude-directories=LIST excluded directories

Chinese file names are normally saved URL-encoded, but come out correctly when --cut-dirs is used:
wget -r -np -nH --cut-dirs=3 ftp://host/test/ saves the file as test.txt
wget -r -np -nH -nd ftp://host/test/ saves the file as %B4%FA%B8%D5.txt
wget "ftp://host/test/*" saves the file as %B4%FA%B8%D5.txt

For unknown reasons, probably to avoid problematic special file names, wget automatically runs part of each fetched file name through encode_string. The patch therefore takes the encoded string (something like "%3A"), restores it with decode_string back to its original form (":"), and applies that to the directory and file name parts; decode_string is a built-in wget function.

A combined example:
wget -t0 -c -nH -x -np -b -m -P /home/sunny/NOD32view/ http://downloads1.kaspersky-labs.com/bases/ -o wget.log
Here -t0 retries indefinitely, -c resumes partial files, -nH and -x control the local directory layout, -np never ascends above the starting directory, -b runs in the background, -m mirrors the site, -P sets the download prefix directory, and -o writes the log to wget.log.
