Sharing the process of crawling soft exam practice questions from the Xisai website (educity.cn)

As developers, whatever the reason, features only exist because there is a demand for them. Demand is what gives us the opportunity, and the motivation, to sketch out a blueprint.

This post follows on from the previous one, [python crawling soft exam daily practice questions and storing them in a database, source code shared]. This time the focus is on the development approach and process.

Why I crawled the questions (skippable)

I used to think that higher qualifications did not matter as long as you had the ability and could keep learning. In practice, once I became a parent and my child was about to start school, I ran into mandatory points-based requirements, so I set out on a journey of exams. For those of us who write code for a living, the reward of passing the soft exam can hardly be overstated. I used scraps of spare time to work through practice questions and found the daily practice sets on the Xisai website, but working through them on the site was not convenient for me, so I felt the urge to crawl the questions.

Target page analysis [process]

1. Pick the subject you are interested in, choose exam mode, work through one daily practice set, and submit it. The site then jumps to the question analysis tab. At the end of this flow, the analysis page is the data we ultimately want: it contains the questions, the answers, and the analysis.

2. When crawling data, I find that locating the target data is the first step, and running the Python script to get the results is the last. Following the pages layer by layer gives: the daily practice list for a subject (paginated) -> "start doing questions" (or "continue doing questions") -> choose exam mode -> click "I want to submit" -> the daily practice question list
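The target data described in step 1 can be modelled as a small record type. This is only a minimal sketch: the class name `QuestionRecord` and its field names are assumptions made for illustration, not names taken from the site or from the original source code.

```python
from dataclasses import dataclass, asdict


@dataclass
class QuestionRecord:
    """One crawled item from the analysis page (field names are hypothetical)."""
    question: str   # the question text
    answer: str     # the correct answer shown after submitting
    analysis: str   # the explanation attached to the question


# Placeholder values only; real content would come from the analysis page.
record = QuestionRecord(
    question="<question text>",
    answer="<answer>",
    analysis="<analysis text>",
)

# asdict() gives a plain dict, which is convenient when writing to a database.
row = asdict(record)
```

Keeping the three parts in one record makes it harder to store a question without its answer or analysis.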

Analyzing the list data on the target page

Open the list page https://www.educity.cn/tiku/dp100110011003-1.html

Click through several pages and compare how the URL changes:

https://www.educity.cn/tiku/dp100110011003-2.html
https://www.educity.cn/tiku/dp100110011003-3.html
https://www.educity.cn/tiku/dp100110011003-4.html
From this you can see where the page number sits in the URL: it is the number after the hyphen, just before .html.
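The URL pattern above can be captured in a couple of helper functions. This is a sketch that assumes the page number is always the trailing number before .html, as in the URLs listed here. The function names `list_page_url` and `page_number` are hypothetical, and the code only builds and parses URLs; it makes no network requests.

```python
import re

# Template inferred from the list-page URLs above; {page} is the page number.
LIST_URL_TEMPLATE = "https://www.educity.cn/tiku/dp100110011003-{page}.html"


def list_page_url(page: int) -> str:
    """Return the list-page URL for a 1-based page number."""
    if page < 1:
        raise ValueError("page numbers start at 1")
    return LIST_URL_TEMPLATE.format(page=page)


def page_number(url: str) -> int:
    """Extract the trailing page number from a list-page URL."""
    match = re.search(r"-(\d+)\.html$", url)
    if match is None:
        raise ValueError(f"not a list-page URL: {url}")
    return int(match.group(1))


if __name__ == "__main__":
    # Print the first four list-page URLs.
    for page in range(1, 5):
        print(list_page_url(page))
```

How many pages exist is not stated here, so the upper bound of the loop would have to come from the pagination shown on the list page itself.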

Press F12 to open the browser's developer tools:

Origin blog.csdn.net/u013252962/article/details/103635748