One article teaches you how to make a simple translation tool with a spider

  Hello everyone, I am 不温不火 (Buwenbuhuo), a junior majoring in big data at the School of Computer Science. The nickname comes from the phrase 不温不火 ("neither hot nor cold") and expresses my hope to stay even-tempered. As a newcomer to the Internet industry, I write this blog partly to record my own learning process and partly to summarize the mistakes I have made, hoping to help others who are just starting out as I am. My knowledge is limited, so the blog will inevitably contain some mistakes; if you spot any errors or omissions, please let me know! For now the blog is updated only on the CSDN platform; the homepage is https://buwenbuhuo.blog.csdn.net/ .

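PS: Because more and more people are scraping the blogger's articles without consent, the blogger hereby declares: reproduction without permission is prohibited!!!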




Ok! Please forgive the blogger for not posting on schedule recently. The reason? After overdrawing my energy like crazy a while ago, I am now running on empty. One sentence sums up my current state perfectly: there really isn't a drop left -.- All I want to say here is: young people don't know how precious their energy is until it is gone. Ah, sorry, I got carried away 0.0

Although my energy is low and updates may come more slowly than before, the blogger hereby guarantees that the updates will not stop; stopping is impossible in this lifetime. For now, the Scrapy_Spider series may be paused while I post some small Spider_Web demos for everyone first. Why is Scrapy_Spider paused? Don't ask; the answer is that I'm letting things settle -.-

Hahaha, okay, enough chatter. On to the main feature...


Before starting to make a simple translation tool, we need to clarify which translation interface we use.

This time the blogger chose the interface of Baidu Translate.

The URL of Baidu Translate is: https://fanyi.baidu.com/

However, after looking at the structure of the web page, we find that this page URL is not the one we need to request, so we have to find the underlying request interface.

1. Get the request interface of Baidu Translate

  • 1. Press F12 in the browser to open the developer tools on the Baidu Translate page

If no request to https://fanyi.baidu.com/sug appears in the request list, type into the input box a few more times until it shows up.

  • 2. In the request whose method is POST, find the form parameter kw: hi (hi is the text that was entered for translation)

From the response shown in the developer tools, we can see that data is a list whose entries are stored as key-value pairs. It contains several words and their meanings, and only the first entry is what we need, so we take the value from the first entry: ["data"][0]["v"]
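To make the indexing concrete, here is a minimal sketch that walks a dictionary in the shape described above. The keys "data", "k" and "v" come from this post; the sample words and meanings are made-up placeholder values, not output captured from the real interface, and the helper names first_meaning and all_entries are hypothetical.

```python
# Illustrative response in the shape described above. The entries are
# made-up sample values, not real output from the sug interface.
sample_response = {
    "data": [
        {"k": "hi", "v": "int. hello"},
        {"k": "high", "v": "adj. tall"},
    ],
}


def first_meaning(content):
    """Return the meaning ("v") stored in the first entry of content["data"]."""
    return content["data"][0]["v"]


def all_entries(content):
    """Return every (word, meaning) pair in content["data"], in order."""
    return [(entry["k"], entry["v"]) for entry in content["data"]]


print(first_meaning(sample_response))
for word, meaning in all_entries(sample_response):
    print(word, meaning)
```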

2. Approach to writing the code

Now that the interface has been found, the next step is to analyze how to write code. Writing code generally requires the following steps:

  • 1. First, set a request header so that the request looks like it comes from a browser; this is the most basic way to get past anti-scraping checks
headers = {
    "user-agent": "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/55.0.2883.87 Safari/537.36",
}
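  • 2. Send a POST request, get the JSON response, and convert it into a dictionary
    # Send the POST request; kw is passed as form data, matching the browser request
    response = requests.post(url=url, data=data, headers=headers)
    # The response body is JSON; parse it into a dictionary
    content = response.json()
    # Print the parsed data
    print(content)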


  • 3. Get the meaning of the word
print(content["data"][0]["v"])

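The three steps above can be combined into one small function. The following is only a sketch under stated assumptions, not the blogger's program: the function name lookup, the timeout value, and the post parameter (a hook that lets the function run without network access) are all hypothetical, and returning None when the data list is empty is a design choice made here.

```python
SUG_URL = "https://fanyi.baidu.com/sug"
HEADERS = {
    "user-agent": "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 "
                  "(KHTML, like Gecko) Chrome/55.0.2883.87 Safari/537.36",
}


def lookup(word, post=None, timeout=10):
    """Send `word` to the sug interface and return the first meaning, or None.

    `post` is a hypothetical hook: any callable with the same keyword
    arguments as requests.post. It defaults to requests.post itself.
    """
    if post is None:
        import requests  # imported lazily so the function can be tried offline
        post = requests.post

    # Step 1 and 2: browser-like headers, kw sent as POST form data
    response = post(url=SUG_URL, data={"kw": word}, headers=HEADERS, timeout=timeout)
    # Raise an exception on 4xx/5xx instead of parsing an error page
    response.raise_for_status()

    # Step 3: take the meaning from the first entry, if there is one
    entries = response.json().get("data") or []
    return entries[0]["v"] if entries else None
```

With the requests library installed, calling lookup("hi") sends the same request as the snippets above and returns the first meaning instead of printing it.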

3. Original program and encapsulated program

  • 1. Original program
#!/usr/bin/env python
# encoding: utf-8
'''
  @author 李华鑫
  @create 2020-10-06 11:23
  Mycsdn:https://buwenbuhuo.blog.csdn.net/
  @contact: [email protected]
  @software: Pycharm
  @file: baidu翻译.py
  @Version:1.0
  
'''
import requests

url = "https://fanyi.baidu.com/sug"
data = {
    "kw": input(">")
}
headers = {
    "user-agent": "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/55.0.2883.87 Safari/537.36",
}
# Send the POST request
response = requests.post(url=url, data=data, headers=headers)
# The response body is JSON; parse it into a dictionary
content = response.json()
# Print the meaning from the first entry
print(content["data"][0]["v"])
  • 2. Basic encapsulated program
#!/usr/bin/env python
# encoding: utf-8
'''
  @author 李华鑫
  @create 2020-10-06 11:23
  Mycsdn:https://buwenbuhuo.blog.csdn.net/
  @contact: [email protected]
  @software: Pycharm
  @file: baidu翻译.py
  @Version:1.0
  
'''
import requests


def baidufanyi(url, data):
    headers = {
        "user-agent": "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/55.0.2883.87 Safari/537.36",
    }
    # Send the POST request; kw is passed as form data, as in the original program
    response = requests.post(url=url, data=data, headers=headers)
    # The response body is JSON; parse it into a dictionary
    content = response.json()
    # print(content)
    print('*' * 100)
    # Print the meaning from the first entry
    print(content["data"][0]["v"])
    # Loop over and print every returned word, its similar words, and their meanings
    # for k in content["data"]:
    #     print(k["k"],k["v"])
    print('*' * 100)


if __name__ == '__main__':
    while True:
        # The sug interface sometimes returns nothing; in that case enter more text
        url = "https://fanyi.baidu.com/sug"
        data = {
            "kw": input("Enter a word: ")
        }
        baidufanyi(url, data)
  • 3. Running result

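The encapsulated program loops forever and fails with an IndexError if the data list comes back empty. As one possible variation (a sketch, not the blogger's code), the loop below stops on an empty line and prints a hint when nothing is returned. The names run_loop, read and write are hypothetical, and lookup stands for any callable that takes a word and returns its meaning or None.

```python
def run_loop(lookup, read=input, write=print):
    """Prompt for words until an empty line is entered, printing each result.

    `lookup` is any callable mapping a word to its meaning, or to None
    when nothing was found. `read` and `write` default to input and print.
    """
    while True:
        word = read("Enter a word (empty line to quit): ").strip()
        if not word:
            break
        meaning = lookup(word)
        write("*" * 100)
        if meaning is not None:
            write(meaning)
        else:
            # The sug interface sometimes returns nothing for short input
            write("No result, try entering more text")
        write("*" * 100)
```

Passing the lookup function in as an argument keeps the loop independent of how the request is made, so it can be tried with a plain dictionary before connecting it to the real interface.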
Good times are always short. Although I would like to keep chatting with you, this blog post ends here. If it was not enough, don't worry, see you next time!



  A good book is worth reading a hundred times. And if I want to be the best-looking boy in the room, I must keep acquiring knowledge through study, using knowledge to change my destiny, using this blog to witness my growth, and using actions to prove my hard work.
  If my blog is helpful to you and you like its content, please give it the triple combo of "like", "comment" and "favorite"! I hear that people who like posts never have bad luck and are full of energy every day! If you really just want to read for free, I still wish you happiness every day, and you are welcome to visit my blog often.
  Writing takes effort, and your support is what keeps me going. Don't forget to follow me after you like the post!


Origin blog.csdn.net/qq_16146103/article/details/108964249