Let's learn how to use Spynner for data scraping
As the name suggests, single-process scraping means collecting data within a single process. It is usually used when there is not much data to collect.
First, let's introduce Spynner.
While Spynner is crawling, it can display a browser window, so you can watch the crawl as it happens. Spynner can also load dynamic content generated by JavaScript and then scrape that content.
Apart from these two features, which are unique to Spynner, it works in much the same way as the requests module.
Spynner's official open-source repository: https://github.com/makinacorpus/spynner
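To make the description above more concrete, here is a minimal sketch of loading a page and reading the rendered HTML. It assumes the Browser, show, load, html and close names that appear in the Spynner README, so check them against the version you actually install. The fetch_rendered_html helper and its browser_factory parameter are hypothetical names of my own, added so the function can be tried with a stand-in object when Spynner is not available.

```python
def fetch_rendered_html(url, browser_factory=None, show_window=False):
    """Load `url` in a Spynner-style browser and return the rendered HTML."""
    if browser_factory is None:
        # Imported lazily so the function can be defined without Spynner.
        # Assumption: spynner.Browser behaves as described in its README.
        import spynner
        browser_factory = spynner.Browser

    browser = browser_factory()
    try:
        if show_window:
            browser.show()   # display the browser window during the crawl
        browser.load(url)    # load the page, including its JavaScript
        return browser.html  # markup as it stands after the page has loaded
    finally:
        browser.close()      # always release the browser, even on errors


# Example call (needs a working Spynner install; the URL is a placeholder):
# html = fetch_rendered_html("https://example.com", show_window=True)
```

Passing show_window=True corresponds to watching the crawl in a browser window, and the returned HTML is where JavaScript-generated content would show up.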
Install Spynner
Note that Spynner currently supports only Python 2, so we temporarily switch to a Python 2 environment.
Use the command to install, as shown below
Alternatively, you can install it through PyCharm:
search for the package and install it.
In practice it was not that simple. I ran into many problems, and the installation failed every time,
with no result whatsoever.
So I turned to the official documentation, worked through it step by step, and eventually got it installed.
First of all, my computer has both Python 2.7 and Python 3.6 installed, so pip needs close attention:
it decides which environment a module is installed into.
Comparing the version output of each pip command showed that only pip2 belongs to Python 2.7,
so all of the following installations use pip2.
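The pip mix-up above can also be avoided by always calling pip through the interpreter you want to install into. The sketch below only builds the command and looks up which pip launchers are on the PATH; it does not install anything. The helper names are hypothetical, and since shutil.which needs Python 3.3 or later, this checking script is meant to be run with Python 3.

```python
import shutil
import sys


def pip_install_command(package, interpreter=sys.executable):
    """Build a pip command that is tied to one specific interpreter."""
    # "<interpreter> -m pip" always installs into that interpreter's
    # environment, whichever launcher named "pip" comes first on PATH.
    return [interpreter, "-m", "pip", "install", package]


def find_pip_launchers(names=("pip", "pip2", "pip3")):
    """Map each launcher name to its path on PATH, or None if absent."""
    return {name: shutil.which(name) for name in names}
```

For example, pip_install_command("lxml", "python2.7") gives the argument list for installing lxml into Python 2.7, which can then be passed to subprocess.run.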
First we have to install the SIP module.
SIP is a tool for automatically generating Python bindings for C and C++ libraries. It was originally developed in 1998 for PyQt, the Python bindings for the Qt GUI toolkit, but it is suitable for generating bindings for any C or C++ library.
SIP's official website: https://www.riverbankcomputing.com/software/sip/intro
Continue with the following steps only after SIP is installed.
With SIP installed, we move on to PyQt. Spynner has to simulate a browser, and it does that through PyQt.
Spynner is built on PyQt and relies on PyQt's powerful WebKit component to execute JavaScript, so PyQt must be installed.
I tried repeatedly to install PyQt with pip2, but every time it reported that the module could not be found.
A web search then suggested that these libraries do not support Python 2, which left me speechless.
Finally I came across an article saying that the Qt package for Python 2 is called python-qt. It felt like seeing daylight again.
In the end, this is the command I used to install a PyQt support library that works with Python 2.
I have to complain, though: this installation is very slow.
After a long wait,
it installed successfully.
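After each of these installs it helps to confirm that the module really landed in the interpreter you are using. Here is a small sketch that reports which modules cannot be imported. The missing_modules helper is a hypothetical name, and the module names in the example call (sip, PyQt4) are my assumption about what these packages provide, so adjust them to whatever your install exposes.

```python
import importlib


def missing_modules(names):
    """Return the subset of `names` that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            # Not installed for the interpreter running this script.
            missing.append(name)
    return missing


# Example call, run with the Python 2 interpreter (module names assumed):
# print(missing_modules(["sip", "PyQt4"]))
```

An empty list means every module imported cleanly under that interpreter.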
Did you think that was the end of it? Of course not.
Next I tried to install Spynner.
It reported a lot of errors, and of course the installation failed.
Back to troubleshooting.
A search showed that the libxml2 and libxslt libraries were not installed, so they need to be installed.
The most convenient way to install these two libraries is with brew.
That means installing brew on the Mac first, by running this command:
ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"
This installation is slow too,
but in the end it succeeded.
Then we install the required dependencies with brew:
brew install libxml2
brew install libxslt
brew link libxml2 --force
brew link libxslt --force
After these steps, install the following dependency with pip:
pip2 install lxml
It installed successfully.
Then we run the command to install Spynner again.
Note that this command needs to be run with root privileges.
One more error.
Back to troubleshooting: a search showed that another dependency, libffi, needs to be installed.
Run the command:
brew install libffi
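Since each failure so far came down to a missing native library, a quick check from Python can show what the system can already find. This is only a sketch using ctypes.util.find_library from the standard library. The locate_native_libraries helper is a hypothetical name, and a None result does not prove the library is absent, because Homebrew can place libraries where the default search does not look.

```python
from ctypes.util import find_library


def locate_native_libraries(names=("xml2", "xslt", "ffi")):
    """Map each library name to what the system search finds, or None."""
    # find_library takes the name without the "lib" prefix or file suffix.
    return {name: find_library(name) for name in names}


# Example call:
# for name, path in sorted(locate_native_libraries().items()):
#     print(name, path)
```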
After that, we try the installation command again.
It failed, as always. At this point I was close to a breakdown.
This article will continue to be updated.