[Linux] The difference between wget and curl

Preface

For those new to Linux, the wget and curl commands often come up in different scenarios, so it is natural to want to understand the difference between the two. I came across an article online that summarizes it comprehensively, and I am reprinting it here. (Reprinted content; it will be removed upon request in case of infringement.)

wget command

  The wget command is used to download files from the specified URL. wget is very stable, and it has strong adaptability in situations with very narrow bandwidth and unstable networks. If the download fails due to network reasons, wget will continue to try until the entire file is downloaded. If the server interrupts the download process, it will contact the server again and continue downloading from where it stopped. This is useful for downloading large files from servers that have limited connection times.

Command format: wget [OPTIONS] URL

1. Download a single file

  • wget http://www.example.com/testfile.zip

The downloaded file is saved in the current directory. During the download, a progress bar is displayed, showing the completion percentage, bytes downloaded, current download speed, and estimated time remaining.

2. Download and save with a different file name

  • wget -O myfile.zip http://www.example.com/testfile.zip

    • -O FILENAME: Save the downloaded file under the specified name

If you do not specify the "-O" option, wget by default uses everything after the last "/" in the URL path as the file name. For example, with wget http://www.example.com/testfile?id=123 the downloaded file will be named testfile?id=123
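This default naming rule (everything after the last "/") can be reproduced with plain shell parameter expansion; a small illustration using the made-up URL above:

```shell
url="http://www.example.com/testfile?id=123"
# wget's default output name is the substring after the last "/" in the URL
name="${url##*/}"
echo "$name"
```

Note that query strings stay in the name, which is why explicitly renaming with -O is usually preferable for such URLs.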

3. Resume download from breakpoint

  • wget -c http://www.example.com/testfile.zip

    • -c: Resume the download that was interrupted last time

When the file is particularly large, or the connection drops partway through for network reasons, you can use the -c option to resume from where the last download stopped once the connection is restored, instead of starting the download over.

By default, wget retries the connection up to 20 times. If the network is unreliable, the download may still fail; you can raise the retry count with --tries. For example, to allow up to 40 retries: wget --tries=40 http://www.example.com/testfile.zip

4. Background download

  • wget -b http://www.example.com/testfile.zip

    • -b: Download in background mode

When a very large download cannot finish promptly, it can be run in the background. During a background download, a "wget-log" file is created in the current directory to record the download log; you can check the progress with the command tail -f wget-log.

5. Bandwidth control and download quotas

  • wget --limit-rate=SPEED http://www.example.com/testfile.zip

    • --limit-rate=SPEED: Cap the download speed at the given rate. For example: --limit-rate=300k (300 KB/s)

By default, wget uses all the bandwidth it can get. When you are downloading a large file and also need bandwidth for other transfers, limiting the speed becomes necessary.

If you also need to limit the total download volume, use the "-Q QUOTA" option; once the downloaded data exceeds the quota, wget stops. Note that this option has no effect on a single-file download; it only applies to multi-file or recursive downloads. For example, wget -Q 10m -i download.txt downloads the URLs listed in download.txt; with the quota set to 10m, wget stops starting new URLs once 10 MB have been downloaded. (If a file is in progress when the quota is exceeded, that file is finished first rather than being cut off immediately.)

6. Multiple file downloads

  • wget -i URL-FILE

    • -i URL-FILE: Read the URLs to download from the specified file

If you have multiple URL resources to download, first create a file with one download URL per line, then use the "-i" option to point wget at the file for a batch download.
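A quick sketch of that workflow, using placeholder URLs (the wget line itself needs a live network, so it is left commented):

```shell
# Write one URL per line into a list file (these URLs are placeholders)
cat > download.txt <<'EOF'
http://www.example.com/file1.zip
http://www.example.com/file2.zip
http://www.example.com/file3.zip
EOF

# Then download them all in one batch:
# wget -i download.txt
```

Combined with -Q from the previous section, the same file also serves as the input for quota-limited batch downloads.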

7. Password authentication download

  • wget --http-user=USER --http-password=PASS http://www.example.com/testfile.zip

    • --http-user=USER: Set the HTTP user name to USER
    • --http-password=PASS: Set the HTTP password to PASS
    • --ftp-user=USER: Set the FTP user name to USER
    • --ftp-password=PASS: Set the FTP password to PASS

For sites that require certificate-based authentication, a tool such as curl is generally the more convenient choice.

8. Recursive download

  • wget -r http://www.example.com/path1/path2/

    • -r: Recursively download the resources of the entire site (www.example.com)
    • -nd: Do not recreate the directory hierarchy when downloading recursively; save all files into the current directory. Without this option, wget creates directories mirroring the layout of the site.
    • -np: Do not ascend to the parent directory when downloading recursively; only download under the current path (path2). Without this option, the whole site is traversed.
    • -A SUFFIXES: Only download files with the given suffixes; separate multiple suffixes with commas.
    • -R SUFFIXES: Exclude files with the given suffixes; separate multiple suffixes with commas.
    • -L: Follow relative links only during recursion, which helps keep the download from wandering off-site. Without it, links to external sites could make the download grow without bound.

For example, to download only the pdf and png files under path2, saving them all in the current directory without creating subdirectories:
wget -r -nd -np -A pdf,png http://www.example.com/path1/path2/

curl command

The curl command is a command-line tool for transferring files using URL syntax. It supports both upload and download, making it a comprehensive transfer tool, though by convention it is usually called a download tool. curl is powerful: it supports many protocols, including HTTP, HTTPS, and FTP, along with features such as POST requests, cookies, authentication, resuming partial downloads from a specified offset, custom user-agent strings, speed limits, file size limits, and progress bars. For automating web-page processing and data retrieval, curl is a good fit.

Download functionality similar to wget

1. Single file download

  • curl [-o FILENAME | -O] http://www.example.com/index.html

    • -o FILENAME: Write the server's response to the specified file
    • -O: Like -o, except that the part of the URL path after the last "/" is used as the file name

If neither option is written, curl will output the server response content to the terminal by default.

2. Resume download from breakpoint

  • curl -O -C OFFSET http://www.example.com/testfile.zip

    • -C OFFSET: Resume the download from the given offset, in bytes

If you want curl to work out the correct resume position itself, use "-" in place of the offset, for example:
curl -O -C - http://www.example.com/testfile.zip

3. Bandwidth control and download quotas

  • curl -O --limit-rate SPEED http://www.example.com/testfile.zip

    • --limit-rate SPEED: Cap the download speed at the given rate. Example: --limit-rate 500k
    • --max-filesize SIZE: Set the maximum size of a file to download; larger files are not downloaded

Handle complex web requests

1. Automatic jump

  • curl -L http://www.example.com

    • -L: Follow redirects automatically (the Location response header)

Some links redirect when accessed (the response status code is 3xx); the -L option makes the HTTP request follow the server's redirect. For example, if accessing "http://a.com" redirects to "http://b.com", using "-L" returns the response content of "http://b.com"

2. Display response header information

  • curl -i http://www.example.com

    • -i: Include the response headers in the output
    • -I: Output only the response headers, without the response body

3. Display the communication process

  • curl -v http://www.example.com

    • -v: Display the entire HTTP exchange, including the port connection and the HTTP request headers

If you need more detail about the exchange, you can use the option "--trace FILE" or "--trace-ascii FILE", for example: curl --trace-ascii output.txt http://www.example.com, then open the file "output.txt" to view the result.

4. Specify http request method

  • curl -X METHOD http://www.example.com/test

    • -X METHOD: Specify the HTTP request method (GET, POST, DELETE, PUT, etc.). The default is GET

5. Add http request header

  • curl -H 'key:value' http://www.example.com/test

    • -H 'key:value': Add an HTTP request header. Example: -H 'Content-Type:application/json'

To add multiple request headers, repeat the -H option. For example:
curl -H 'Accept-Language: en-US' -H 'Secret-Message: xyzzy' http://www.example.com/test
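When many -H options pile up, collecting them once and reusing them keeps commands readable. A sketch using the positional parameters (POSIX sh); echo prints the assembled command instead of executing it, so no network is needed:

```shell
# Collect the repeated -H options once via the positional parameters
set -- -H 'Accept-Language: en-US' -H 'Secret-Message: xyzzy'

# Print the command that would run (drop the leading echo to execute it)
echo curl "$@" http://www.example.com/test
```

In bash, an array (`headers=(-H '...' -H '...')` then `"${headers[@]}"`) serves the same purpose.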

6. Pass request parameters

  • curl -X POST -d 'DATA' http://www.example.com/test

    • -d 'DATA': Specify the POST request body. The data can take the form "k1=v1&k2=v2", or be a JSON string
    • --data-urlencode 'DATA': Same as -d, except that the data is URL-encoded automatically before sending

Using -d automatically adds the request header "Content-Type:application/x-www-form-urlencoded" and switches the request to the POST method, so "-X POST" can be omitted. If the body to send is a JSON string, you need to specify "Content-Type: application/json" instead, for example:
curl -d '{"user":"zhangsan", "password":"123456"}' -H 'Content-Type:application/json' http://www.example.com/login

When there are many parameters, you can save them to a local file and have curl read the request body from it. For example:
curl -d '@requestData.txt' -H 'Content-Type:application/json' http://www.example.com/login
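A sketch of preparing such a body file (the curl line needs a reachable endpoint, so it is left commented):

```shell
# Save the JSON request body to a local file
cat > requestData.txt <<'EOF'
{"user":"zhangsan", "password":"123456"}
EOF

# The @ prefix tells -d to read the body from the file:
# curl -d '@requestData.txt' -H 'Content-Type:application/json' http://www.example.com/login
```

This keeps long JSON payloads out of the shell command line and avoids quoting headaches.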

If you want to send form data in a GET request, append the parameters directly to the URL. Note that the URL should be quoted, since an unquoted "&" would be interpreted by the shell as a command separator. For example:
curl 'http://www.example.com/login?user=zhangsan&password=123456'

7. File upload

  • curl -F 'file=@FILE' https://www.example.com/test

    • -F 'file=@FILE': Simulate an HTTP form upload of the file to the server. Additional form fields are sent by repeating the option, e.g. -F 'name1=value1' -F 'name2=value2'

When uploading files with the -F option, curl sends the request with the header Content-Type: multipart/form-data by default. The default MIME type for the uploaded file is application/octet-stream

Specifying the upload file MIME type. The following example sets the MIME type to "image/png":
curl -F '[email protected];type=image/png' https://google.com/profile

Specifying the upload file name. In the example below, the local file is "photo.png", but the file name the server receives is "me.png":
curl -F '[email protected];filename=me.png' https://google.com/profile

8. Set source URL

  • curl -e 'REFERRER-URL' https://www.example.com

    • -e 'REFERRER-URL' or --referer 'REFERRER-URL': Set the referring URL, i.e. the Referer field of the HTTP request header. Equivalent to setting the "Referer" header directly with the -H option

9. Set client user agent

  • curl -A 'AGENT-STRING' https://www.example.com

    • -A 'AGENT-STRING' or --user-agent 'AGENT-STRING': Set the client user agent, i.e. the User-Agent field of the HTTP request header. Equivalent to setting the "User-Agent" header directly with the -H option

To set the "User-Agent" to that of the Chrome browser, for example:
curl -A 'Mozilla/5.0 (Linux; Android 6.0; Nexus 5 Build/MRA58N) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.149 Mobile Safari/537.36' https://www.example.com

Remove the "User-Agent" request header, example:
curl -A '' https://www.example.com

10. Set cookies

  • curl -b 'DATA' https://www.example.com

    • -b 'DATA' or --cookie 'DATA': Set the cookies for the request. The argument can take the form key1=value1;key2=value2..., or be a file name
    • -c FILE: Write the cookies from the server's response to the file

The actual cookie values come from the "Set-Cookie" field of the HTTP response headers. You can save the cookies returned by the server to a file, then send that file as the cookies on the next request, as follows:
curl -c cookies.txt http://example.com
curl -b cookies.txt http://example.com

11. Set the username and password for server authentication

  • curl -u 'user[:password]' https://www.example.com

    • -u 'user[:password]': Set the username and password for server authentication. If only a username is given, curl prompts for the password when run.

Comparison between wget and curl

wget is a standalone download program with no extra library dependencies, and it can fetch anything from web pages or FTP directories; it is simple, direct, and fast.
curl is a multifunctional tool built on the libcurl library. It can download web content, but it can also do much more.

In terms of usage, wget leans toward downloading files over the network, while curl leans toward debugging network interfaces; curl is, in effect, a Postman without a graphical interface.
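Since availability also differs between systems (curl ships with macOS by default, wget with many Linux distributions), a portable way to check which of the two is installed:

```shell
# Report which of the two tools is on PATH (command -v is POSIX)
found=""
for tool in wget curl; do
  if command -v "$tool" >/dev/null 2>&1; then
    found="$found $tool=yes"
  else
    found="$found $tool=no"
  fi
done
echo "found:$found"
```

This is handy at the top of a script that wants to fall back from one tool to the other.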


Reference article:
http://www.ruanyifeng.com/blog/2011/09/curl.html

Originally published at: blog.csdn.net/ZHOU_YONG915/article/details/133752196