I used Node.js to crawl more than 10,000 pretty-girl wallpapers


Foreword

Hello everyone, I am Xiaoma. Why would I need to download so many pictures? A few days ago I deployed a wallpaper mini program for free using uni-app + uniCloud, and I needed some resources to fill it with content.

Crawl pictures

First, initialize the project and install axios and cheerio:

npm init -y && npm i axios cheerio

axios is used to fetch the web page content, and cheerio provides a jQuery-like API on the server side; we use it to extract the image addresses from the DOM.

const axios = require('axios')
const cheerio = require('cheerio')

// Fetch a page and collect the src of the first <img> inside each matching container
async function getImageUrl(target_url, containerElement) {
  const res = await axios.get(target_url)
  const html = res.data
  const $ = cheerio.load(html)
  const result_list = []
  // cheerio's each() passes (index, element), like jQuery
  $(containerElement).each((index, element) => {
    const src = $(element).find('img').attr('src')
    if (src) result_list.push(src)
  })
  return result_list
}

In this way, you can get the image URLs on the page. Next, you need to download each image from its URL.
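
The src values scraped this way are not always ready to download: pages often use relative paths or protocol-relative URLs, or repeat the same image. Below is a minimal sketch of a cleanup step, assuming the built-in URL class is enough for the target site. normalizeImageUrls is a hypothetical helper name, not part of axios or cheerio, and the example.com addresses are made up for illustration.

```javascript
// Hypothetical helper: turn scraped src values into unique absolute URLs.
// pageUrl is the address of the page the src values were scraped from.
function normalizeImageUrls(pageUrl, srcList) {
  const seen = new Set()
  for (const src of srcList) {
    if (!src) continue // skip <img> tags without a src attribute
    try {
      // new URL(relative, base) resolves relative and protocol-relative paths
      seen.add(new URL(src, pageUrl).href)
    } catch (err) {
      // Ignore values that cannot be parsed as a URL
    }
  }
  return [...seen]
}

// Made-up sample input for illustration
const urls = normalizeImageUrls('https://example.com/gallery/index.html', [
  '/img/a.jpg',
  'b.jpg',
  '//cdn.example.com/c.jpg',
  '/img/a.jpg', // duplicate, dropped
  undefined,
])
console.log(urls)
```

The result can be passed to any of the download methods described in the next section.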

How to download files using Node.js

Downloading files with Node.js can be done using built-in modules or third-party libraries.

Method 1: Use the built-in modules 'https' and 'fs'

The https.get() method is used to request the file to download. fs.createWriteStream() creates a writable stream; here it takes only one argument, the location where the file is saved. pipe() reads data from a readable stream and writes it to a writable stream.

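const fs = require('fs')
const https = require('https')

// URL of the image (placeholder; replace it with a full https:// URL)
const url = 'GFG.jpeg'

https.get(url, (res) => {
  // Image will be stored at this path (the files directory must already exist)
  const path = `${__dirname}/files/img.jpeg`
  const filePath = fs.createWriteStream(path)
  // Stream the response body into the file
  res.pipe(filePath)
  filePath.on('finish', () => {
    filePath.close()
    console.log('Download Completed')
  })
}).on('error', (err) => {
  // Network errors (DNS failure, refused connection, ...) are reported here
  console.error('Download failed:', err.message)
})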
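
The pipe() step can be tried without any network access. The sketch below assumes Node.js 15 or later (for stream/promises) and replaces the HTTPS response with an in-memory readable stream. saveStream and the temporary file name are hypothetical, chosen only for illustration. pipeline() does the same job as pipe() but also forwards errors and cleans up the streams.

```javascript
const fs = require('fs')
const os = require('os')
const path = require('path')
const { Readable } = require('stream')
const { pipeline } = require('stream/promises')

// Hypothetical helper: write any readable stream to destPath.
async function saveStream(readable, destPath) {
  // Create the target directory first; createWriteStream does not do it
  fs.mkdirSync(path.dirname(destPath), { recursive: true })
  await pipeline(readable, fs.createWriteStream(destPath))
  return destPath
}

// Stand-in for the HTTPS response: two chunks of in-memory data
const fakeResponse = Readable.from([Buffer.from('hello '), Buffer.from('world')])
const target = path.join(os.tmpdir(), 'pipe-demo', 'img.txt')

// Resolves to the text that ended up in the file
const saved = saveStream(fakeResponse, target).then((file) => {
  console.log('Saved to', file)
  return fs.readFileSync(file, 'utf8')
})
```

In a real download, the res object passed to the https.get() callback would take the place of fakeResponse.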

Method 2: DownloaderHelper

npm install node-downloader-helper

Below is the code to download an image from a website. An object dl is created from the DownloaderHelper class, which receives two parameters:

  1. The image to be downloaded.
  2. The path where the image must be saved after downloading.

The file variable contains the URL of the image to download, and the filePath variable contains the directory where the file will be saved.

const { DownloaderHelper } = require('node-downloader-helper')

// URL of the image (placeholder; replace it with the real image URL)
const file = 'GFG.jpeg'
// Path at which image will be downloaded
const filePath = `${__dirname}/files`

const dl = new DownloaderHelper(file, filePath)

dl.on('end', () => console.log('Download Completed'))
dl.start()

Method 3: Use download

Written by the prolific npm author sindresorhus, it is very easy to use.

npm install download

Below is the code to download an image from a website. The download function receives the URL of the file and the directory in which to save it.

const download = require('download')

// URL of the image (placeholder; replace it with the real image URL)
const file = 'GFG.jpeg'
// Path at which image will get downloaded
const filePath = `${__dirname}/files`

download(file, filePath).then(() => {
  console.log('Download Completed')
})

Final code

I originally wanted to crawl Baidu wallpapers, but the resolution was not high enough and the images had watermarks. Later, a friend in the group found an API, apparently from a mobile HD wallpaper app, that returns the download URLs directly, so I used it.

Below is the complete code

const download = require('download')
const axios = require('axios')

let headers = {
  'User-Agent':
    'Mozilla/5.0 (Macintosh; Intel Mac OS X 11_1_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.88 Safari/537.36',
}

// Pause for the given number of milliseconds
function sleep(time) {
  return new Promise((resolve) => setTimeout(resolve, time))
}

async function load(skip = 0) {
  const data = await axios
    .get(
      'http://service.picasso.adesk.com/v1/vertical/category/4e4d610cdf714d2966000000/vertical',
      {
        headers,
        params: {
          limit: 30, // each page always returns 30 items
          skip: skip,
          first: 0,
          order: 'hot',
        },
      }
    )
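    .then((res) => {
      return res.data.res.vertical
    })
    .catch((err) => {
      console.log(err)
      // Fall back to an empty page so downloadFile() does not crash on undefined
      return []
    })
  await downloadFile(data)
  await sleep(3000)
  if (skip < 1000) {
    await load(skip + 30)
  } else {
    console.log('Download finished')
  }
}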

async function downloadFile(data) {
  for (let index = 0; index < data.length; index++) {
    const item = data[index]

    // Path at which image will get downloaded
    const filePath = `${__dirname}/美女`

    await download(item.wp, filePath, {
      filename: item.id + '.jpeg',
      headers,
    })
    console.log(`Download ${item.id} Completed`)
  }
}

load()
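
The recursive load() above can also be written as a plain loop. The following is only a sketch: pageOffsets and crawl are hypothetical names, and fetchPage and handlePage stand in for the axios request and downloadFile() from the code above, so the paging logic can be checked without touching the network.

```javascript
// Yield the skip values that load() walks through: 0, 30, 60, ...
// Like the code above, it stops after the first value that is >= stopAt.
function* pageOffsets(limit, stopAt) {
  for (let skip = 0; ; skip += limit) {
    yield skip
    if (skip >= stopAt) return
  }
}

function sleep(time) {
  return new Promise((resolve) => setTimeout(resolve, time))
}

// fetchPage(skip) should resolve to an array of items; handlePage(items)
// should process them (for example, download every item.wp).
async function crawl({ fetchPage, handlePage, limit = 30, stopAt = 1000, delay = 3000 }) {
  let total = 0
  for (const skip of pageOffsets(limit, stopAt)) {
    const items = await fetchPage(skip)
    await handlePage(items)
    total += items.length
    await sleep(delay) // pause between pages to go easy on the server
  }
  return total
}

const offsets = [...pageOffsets(30, 1000)]
console.log(offsets.length, offsets[offsets.length - 1])
```

With the values used in this article (limit 30, stop at 1000) the loop visits 35 pages, the last one at skip 1020, which is the same sequence the recursive version produces.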

In the code above, the User-Agent header is set, which helps prevent the server from blocking the crawler and returning 403 directly.

Run node index.js and the pictures will be downloaded automatically.

[Screenshot: the crawler running]

Try it out

Search for "Watermelon Gallery" in WeChat mini programs to try it out.

Finally

The group mentioned above is the "Ape Creation Camp" led by @大報大傑大廳. Many developers in the group help each other answer questions and exchange technical knowledge, and the group leader also shares tips about outsourcing, side businesses, and so on. If you are interested, leave a message saying "join the group".

That is all for this article. I hope it is helpful to you. You can also read my previous articles or share your thoughts and experiences in the comment area. You are welcome to explore front-end development together.

This article was first published on Juejin; source: Xiaoma's blog.


Origin juejin.im/post/7078206989402112037