The cheerio third-party module: cheerio is a fast, flexible and lean implementation of core jQuery functionality, intended mainly for cases where the server needs to manipulate the DOM.
Using the http module together with cheerio to crawl a news page:
const http = require("http");
const fs = require("fs");
const cheerio = require("cheerio");

http.get("http://news.baidu.com", (res) => {
  let str = "";
  res.setEncoding("utf-8"); // decode chunks as UTF-8 to prevent garbled text
  res.on("data", (chunk) => {
    str += chunk; // accumulate the response body
  });
  res.on("end", () => {
    // Load the crawled document into a jQuery-like object, so jQuery methods can be used on it
    const $ = cheerio.load(str);
    fs.writeFileSync("./news.txt", "", "utf-8"); // create or empty the output file
    $(".focuslistnews a").each((index, item) => {
      // append the text of each matched link on its own line
      fs.appendFileSync("./news.txt", $(item).text() + "\r\n", "utf-8");
    });
  });
});
Additional question: what is the difference between path.join and path.resolve?
The path.join() method joins multiple path segments into a single path string using the platform's separator, then normalizes the result. For example, console.log(path.join(__dirname, 'a', 'b')); if the current file is in E:/node/1, the joined path is E:\node\1\a\b (on Windows, path.join normalizes the separators to backslashes).
The path.resolve() method resolves a sequence of paths into a normalized absolute path. It processes its arguments from right to left and stops as soon as an absolute path has been built; if none of the arguments is absolute, it takes the program's current working directory as the starting point.
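The difference between the two is easiest to see when one of the segments is itself absolute. The following is a minimal sketch, not taken from the original notes: it uses path.posix so the results are the same on every platform, and the directory "/node/1" is a hypothetical stand-in for __dirname.

```javascript
const path = require("path");

// Hypothetical directory standing in for __dirname (an assumption for illustration).
const base = "/node/1";

// join: concatenates all segments, then normalizes the result.
const joined = path.posix.join(base, "a", "b");      // "/node/1/a/b"
// An absolute-looking segment is treated as just another segment.
const joinedAbs = path.posix.join(base, "/a", "b");  // "/node/1/a/b"

// resolve: works right to left and stops at the first absolute segment.
const resolved = path.posix.resolve(base, "a", "b");      // "/node/1/a/b"
const resolvedAbs = path.posix.resolve(base, "/a", "b");  // "/a/b"

// With only relative segments, resolve starts from the current working directory.
const fromCwd = path.resolve("a", "b");
const expectedFromCwd = path.join(process.cwd(), "a", "b");

console.log(joined, joinedAbs, resolved, resolvedAbs);
console.log(fromCwd === expectedFromCwd);
```

In short, join only glues and tidies, while resolve behaves like a series of cd commands and always returns an absolute path.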
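path.extname: returns the extension of the path passed to it (for example, path.extname(__filename) gives the extension of the current file)
__dirname: the absolute path of the directory that contains the current file
__filename: the absolute path of the current file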