Popular Python web crawler libraries

Request Libraries:

1. requests: the most commonly used HTTP library for crawlers.

2. Selenium: an automated testing tool that can drive a browser to perform specific actions such as clicking and scrolling. For pages rendered with JavaScript, this is a very effective way to scrape.

3. ChromeDriver: install this driver to drive the Chrome browser and carry out the corresponding actions.

4. GeckoDriver: a proxy that lets W3C WebDriver-compatible clients interact with Gecko-based browsers such as Firefox.

5. PhantomJS: a headless (no user interface), scriptable WebKit browser engine with native support for a variety of web standards: DOM manipulation, CSS selectors, JSON, Canvas, and SVG.

6. aiohttp: the requests library introduced above is a blocking HTTP library. After we send a request, the program waits for the server, and only once the server has responded does it continue processing. This waiting is time-consuming. If the program could do other things while waiting, such as scheduling further requests or processing responses, the crawler would be far more efficient. aiohttp is a library that provides asynchronous web services in this way, and it is quite easy to use.
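The gain from not blocking can be sketched with the standard library's asyncio alone. In the sketch below, `fake_fetch` is a hypothetical stand-in that simulates a slow server with `asyncio.sleep`; in a real crawler, an aiohttp request would take its place. The URLs and the 0.2-second delay are invented for illustration.

```python
import asyncio
import time


async def fake_fetch(url: str, delay: float = 0.2) -> str:
    """Hypothetical stand-in for a network request: waits, then returns a body."""
    await asyncio.sleep(delay)  # while this waits, the event loop runs other tasks
    return f"response from {url}"


async def crawl(urls: list[str]) -> list[str]:
    # Schedule all requests at once; gather returns results in input order.
    return await asyncio.gather(*(fake_fetch(u) for u in urls))


def timed_crawl(urls: list[str]) -> tuple[list[str], float]:
    """Run the crawl and report how long it took in seconds."""
    start = time.perf_counter()
    results = asyncio.run(crawl(urls))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    pages, elapsed = timed_crawl([f"https://example.com/page/{i}" for i in range(5)])
    print(len(pages), "pages in", round(elapsed, 2), "seconds")
```

Fetched one after another, five such requests would take about one second in total. Scheduled together, they should finish in roughly the time of the slowest single request.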

Parsing libraries:

1. lxml: a Python parsing library that handles both HTML and XML, supports XPath, and parses very efficiently; it is popular with programmers.

2. Beautiful Soup: a Python library for parsing HTML or XML. It makes it easy to extract data from web pages, with a powerful API and a variety of parsing methods.

3. pyquery: also a powerful web page parsing tool; it offers a jQuery-like syntax for parsing HTML documents.
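As a rough illustration of XPath-style extraction, the sketch below uses the standard library's `xml.etree.ElementTree`, which supports only a limited XPath subset; lxml exposes a similar `etree` interface with full XPath support. The markup and class names are invented for the example, and the snippet is well-formed XML, which real-world HTML often is not.

```python
import xml.etree.ElementTree as ET

# Invented, well-formed sample markup (real pages usually need an HTML-tolerant parser).
PAGE = """
<html>
  <body>
    <ul>
      <li class="item"><a href="/post/1">First post</a></li>
      <li class="item"><a href="/post/2">Second post</a></li>
      <li class="ad"><a href="/promo">Sponsored</a></li>
    </ul>
  </body>
</html>
"""


def extract_links(markup: str) -> list[tuple[str, str]]:
    """Return (href, text) for every link inside an <li class="item">."""
    root = ET.fromstring(markup)
    # This path expression stays within ElementTree's limited XPath subset.
    return [(a.get("href"), a.text) for a in root.findall(".//li[@class='item']/a")]


if __name__ == "__main__":
    for href, text in extract_links(PAGE):
        print(href, text)
```

The same idea carries over to Beautiful Soup and pyquery, which select nodes with their own methods or with CSS selectors rather than XPath.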

 

Databases:

1. MySQL: a relational database.

2. MongoDB: a non-relational database written in C++. It is an open-source database system based on distributed file storage. Content is stored as JSON-like objects, and a field value can contain other documents, arrays, and arrays of documents, which makes it very flexible.

3. Redis: an efficient, in-memory non-relational database.
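The document shape described for MongoDB can be pictured with a plain Python dict serialized as JSON. This is only a sketch of the data model, not MongoDB's API; the field names and values are invented.

```python
import json

# Invented example of one crawled record shaped like a MongoDB-style document.
page_doc = {
    "url": "https://example.com/post/1",
    "title": "First post",
    "author": {"name": "alice", "followers": 42},  # embedded document
    "tags": ["python", "crawler"],                 # array
    "comments": [                                  # array of documents
        {"user": "bob", "text": "Nice write-up"},
        {"user": "carol", "text": "Thanks"},
    ],
}


def comment_users(doc: dict) -> list[str]:
    """Collect user names from the nested comments array."""
    return [c["user"] for c in doc.get("comments", [])]


if __name__ == "__main__":
    print(json.dumps(page_doc, indent=2))
    print(comment_users(page_doc))
```

A relational table would typically need separate tables for authors and comments, while here everything that belongs to one page travels in a single record.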

 

Storage libraries:

1. PyMySQL

2. PyMongo

3. redis-py

4. RedisDump

 

Web libraries:

1. Flask: a lightweight web framework that is simple, easy to use, and flexible.

2. Tornado: a web framework that supports asynchronous operation. By using non-blocking I/O, it can handle thousands of open connections.

App crawling libraries:

1. Charles: a network packet capture tool. Compared with Fiddler, it is more powerful and has better cross-platform support.

2. mitmproxy: a packet capture tool that supports HTTP and HTTPS. It is similar in function to Fiddler and Charles, but it is operated from the console.

3.Android

Origin www.cnblogs.com/qingdeng123/p/11329646.html