google-images-download: a solution to the batch download limit

Using google-images-download to download images in batches

google-images-download is a Python script. A single command performs a Google image search and downloads the results in bulk. The tool is also cross-platform: it runs on Linux, Windows and macOS. It is a real blessing for lazy people.

First, we go to the directory where the images should be saved. I chose the "Downloads" folder:

cd ~/Downloads

Then, execute it in the terminal:

googleimagesdownload -k "谭卓" -l 20

In this line of code:

  • googleimagesdownload is the name of the command, telling the system which program we want to run.
  • -k stands for "keyword", so it is followed by the search keyword, in this case "谭卓" (Tan Zhuo). Note that the keyword should be enclosed in straight double quotation marks.
  • -l stands for "limit", which specifies the number of pictures to download. In this example, we download 20 photos.
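If the keyword and limit come from variables, the same command can be assembled from a Python script. Here is a minimal sketch, assuming the googleimagesdownload command is on your PATH. The helper names build_command and run_download are made up for illustration and are not part of the tool:

```python
import subprocess


def build_command(keyword, limit):
    """Return the argument list for one googleimagesdownload run."""
    # A list of arguments (rather than one string) means the keyword
    # needs no extra shell quoting, even if it contains spaces.
    return ["googleimagesdownload", "-k", keyword, "-l", str(limit)]


def run_download(keyword, limit):
    """Run the command and return its exit code (hypothetical helper)."""
    return subprocess.run(build_command(keyword, limit)).returncode
```

Calling run_download("谭卓", 20) should then behave like the command typed in the terminal above.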

The last line of the output, Error: 1, indicates that one error occurred during the download process. The program still finishes the rest of the download normally.

The downloaded pictures are stored under ~/Downloads/downloads/, in a subdirectory named after the keyword. google-images-download is considerate enough to create the subdirectories for us.
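To check how many pictures actually arrived, you can count the files in that subdirectory. This is a small sketch that assumes the directory layout described above (a downloads folder containing one subdirectory per keyword). The function name count_downloaded is my own:

```python
from pathlib import Path


def count_downloaded(keyword, base_dir="~/Downloads/downloads"):
    """Count the files saved in the keyword's subdirectory."""
    folder = Path(base_dir).expanduser() / keyword
    if not folder.is_dir():
        # Nothing has been downloaded for this keyword yet.
        return 0
    return sum(1 for entry in folder.iterdir() if entry.is_file())
```

Comparing this count with the limit you asked for shows how many downloads failed.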

Under normal circumstances, this one command is enough to cover our need to download pictures in batches.

However, in some cases we need far more than 20 pictures. For example, I looked at the photos for a long time and still couldn't reliably tell Hao Lei and Tan Zhuo apart. So, in order to learn to distinguish the two actresses properly, I planned to download another 200 photos, this time of Hao Lei.

Following the same pattern as before, execute:

googleimagesdownload -k "郝蕾" -l 200

Then, you will find an error:

Don't panic when you run into problems. Read the error message carefully. Note that it contains a keyword: chromedriver. What is that?

We go back to the GitHub page of google-images-download and search for the keyword chromedriver. You will immediately find the following:

It turns out that when we download more than 100 pictures, the program must call Selenium and chromedriver. It doesn't matter what these two are; we just need to install them.

Selenium was already installed together with google-images-download. Now we only need to download chromedriver and unzip it.

Then we can download more than 100 pictures in batches. Execute the following command:

googleimagesdownload -k "郝蕾" -l 200 --chromedriver="./chromedriver"

Notice the one extra parameter, --chromedriver. It tells google-images-download where the unzipped chromedriver is located. This time the machine got to work and downloaded Hao Lei's photos for us.
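The rule "more than 100 pictures needs chromedriver" is easy to encode if you generate the command from a script. The sketch below is only an illustration. The threshold of 100 comes from the explanation above, and the names SELENIUM_THRESHOLD and download_args are invented here:

```python
SELENIUM_THRESHOLD = 100  # above this limit the tool needs chromedriver


def download_args(keyword, limit, chromedriver=None):
    """Build the argument list, adding --chromedriver only when needed."""
    args = ["googleimagesdownload", "-k", keyword, "-l", str(limit)]
    if limit > SELENIUM_THRESHOLD:
        if chromedriver is None:
            # Fail early instead of letting the download stop with an error.
            raise ValueError("a limit above 100 requires a chromedriver path")
        args.append("--chromedriver=" + str(chromedriver))
    return args
```

For example, download_args("郝蕾", 200, "./chromedriver") yields the same arguments as the command above, while download_args("郝蕾", 200) raises an error before anything is run.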

 

After the download completed, some errors were reported and a few pictures were not downloaded correctly, but this had little impact on the overall result. To be safe, request somewhat more pictures than you actually need, so that you leave yourself a margin.
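How large should the margin be? One rough way to estimate it is to divide the number you want by the share of downloads you expect to succeed. In the sketch below, the 10% failure rate is purely my assumption, not a figure measured from the tool, and padded_limit is a made-up name:

```python
import math


def padded_limit(wanted, expected_failure_rate=0.1):
    """Return a download limit large enough to absorb expected failures."""
    if not 0 <= expected_failure_rate < 1:
        raise ValueError("expected_failure_rate must be in [0, 1)")
    # If a fraction f of downloads fails, ask for wanted / (1 - f) pictures.
    return math.ceil(wanted / (1 - expected_failure_rate))
```

With these assumptions, padded_limit(200) gives 223, so you would pass -l 223 to end up with roughly 200 usable photos.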

 

Operating parameters

I counted 39 parameters in total. Due to space limitations, I will not list them all here. But I do want to point out a few distinctive ones, because you are likely to find them useful in your actual work.

  • --format: selects the image format, such as jpg, png, gif or svg;
  • --usage_rights: selects the image license, such as labeled-for-nocommercial-reuse. If you want to build a picture library for your own content, this option helps you avoid copyright pitfalls and exorbitant payment demands;
  • --size: selects the picture size. If you have requirements for resolution, you can use >10MP to download only pictures with more than 10 megapixels;
  • --type: selects the picture type. For example, use photo if you only want photographs, or animated if you only want animated images;
  • --time: selects the time range of the pictures. If you want pictures from the past week, use past-7-days;
  • --specific_site: restricts the search to a given website, limiting the results to a certain domain name.

The last parameter is --safe_search, which enables safe search so that no inappropriate content shows up in the search results.
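These options can be combined in a single run. As a hedged sketch, the helper below turns keyword arguments into command-line flags. It assumes that value-taking options are written as "--name value" and that --safe_search is a switch with no value. The function name build_search_command is my own invention:

```python
def build_search_command(keyword, limit, **options):
    """Build a googleimagesdownload argument list with optional filters."""
    args = ["googleimagesdownload", "-k", keyword, "-l", str(limit)]
    for name, value in options.items():
        flag = "--" + name
        if value is True:
            # Switch without a value, e.g. --safe_search.
            args.append(flag)
        elif value is not None and value is not False:
            # Option that takes a value, e.g. --format png.
            args.extend([flag, str(value)])
    return args


command = build_search_command(
    "郝蕾", 50, format="png", size=">10MP", time="past-7-days", safe_search=True
)
```

The resulting list can be handed to subprocess.run. Passing a list also avoids shell quoting problems with values such as >10MP.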

Origin blog.csdn.net/wi162yyxq/article/details/103567252