Python automation in practice: log in to Weibo and post automatically

1. Software preparation

1. Install the Python environment

First, install Python and a Python development tool on your computer.

If you haven't installed them yet, you can refer to the following articles:

If you only use Python for data processing, crawlers, data analysis, automation scripts, machine learning and the like, the basic Python environment plus Jupyter is recommended. For installation and use, refer to: Installing and using the Python environment + Jupyter Notebook on Windows/Mac.

If you want to use Python for web project development, the basic Python environment plus PyCharm is recommended. For installation and use, refer to: Installing and using PyCharm on Windows.

2. Install the selenium library

pip install selenium

3. Download the Google Chrome driver, chromedriver. Download address: http://npm.taobao.org/mirrors/chromedriver/

Select the driver version that matches your Google Chrome version (open chrome://settings/help in Chrome to see the version).


After downloading, save the driver to a path (the simpler the better) and remember that path.
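
Before handing the path to Selenium, it can help to confirm that the driver really is where you think it is. The following is a minimal sketch using only the standard library; `check_driver_path` is a hypothetical helper written for this article, not part of Selenium.

```python
from pathlib import Path


def check_driver_path(raw_path: str) -> Path:
    """Return the cleaned driver path, or raise if no file exists there.

    Hypothetical helper for illustration; not a Selenium API.
    """
    # Stray spaces around the path are an easy copy-and-paste mistake.
    path = Path(raw_path.strip())
    if not path.is_file():
        raise FileNotFoundError(f"chromedriver not found at: {path}")
    return path
```

For example, `check_driver_path(r'C:/MyEnv/chromedriver.exe')` returns the path if the file exists there and raises `FileNotFoundError` otherwise.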

2. Implementation method

2.1 Use Selenium to automate the browser; for now the focus is on understanding how to locate elements

If you know nothing about Selenium and want to learn the basics, you can read this article first: 20,000 words to take you through the complete guide to Selenium

To locate an element, we can use its id, name, class, tag, the full text of a link, part of the text of a link, XPath or CSS. Selenium WebDriver provides these 8 methods for locating elements. Note that the find_element_by_*() functions named below belong to older Selenium releases and were removed in Selenium 4.3; in current versions the same lookups are written as find_element(By.<STRATEGY>, value), with By imported from selenium.webdriver.common.by.

1) Locate by id: use find_element_by_id(), or find_element(By.ID, ...) in Selenium 4. For example, to locate the element with id=loginName, use browser.find_element(By.ID, "loginName").

2) Locate by name: use find_element_by_name(), or find_element(By.NAME, ...). For example, to locate the element with name=key_word, use browser.find_element(By.NAME, "key_word").

3) Locate by class: use find_element_by_class_name(), or find_element(By.CLASS_NAME, ...).

4) Locate by tag: use find_element_by_tag_name(), or find_element(By.TAG_NAME, ...).

5) Locate by the full text of a link: use find_element_by_link_text(), or find_element(By.LINK_TEXT, ...).

6) Locate by partial text of a link: use find_element_by_partial_link_text(), or find_element(By.PARTIAL_LINK_TEXT, ...). When the text of a hyperlink is very long, we can locate it by matching only part of the text.

7) Locate by XPath: use find_element_by_xpath(), or find_element(By.XPATH, ...). XPath is the most general option: when several elements share the same id, name or class, or the element has none of these attributes, XPath can still pick out the element we need.

8) Locate by CSS: use find_element_by_css_selector(), or find_element(By.CSS_SELECTOR, ...). CSS selectors are also commonly used and are often more concise than XPath.
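
The eight strategies can be summarised in one small lookup. The sketch below assumes Selenium 4, where every strategy goes through `find_element(strategy, value)`; the strings are the values behind Selenium's `By` constants. The `locate` helper and its short strategy names are made up for this example.

```python
# Short names (hypothetical) mapped to the strings behind Selenium's By constants.
STRATEGIES = {
    "id": "id",                           # By.ID
    "name": "name",                       # By.NAME
    "class": "class name",                # By.CLASS_NAME
    "tag": "tag name",                    # By.TAG_NAME
    "link": "link text",                  # By.LINK_TEXT
    "partial_link": "partial link text",  # By.PARTIAL_LINK_TEXT
    "xpath": "xpath",                     # By.XPATH
    "css": "css selector",                # By.CSS_SELECTOR
}


def locate(driver, kind, value):
    """Find one element using one of the eight strategies."""
    if kind not in STRATEGIES:
        raise ValueError(f"unknown locator strategy: {kind!r}")
    return driver.find_element(STRATEGIES[kind], value)
```

With a real driver this would be called as `locate(browser, "id", "loginName")` or `locate(browser, "css", 'input[type="password"]')`.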

2.2 Operations on elements include

1) Clear the content of an input box: use the clear() function.

2) Enter content in an input box: use the send_keys(content) function, passing in the text to be entered.

3) Click a button: use the click() function. If the element is a button or a link, it can be clicked.

4) Submit a form: use the submit() function. When the element belongs to a form, the form can be submitted.
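
Put together, these operations usually appear in a pattern like the one below. This is only a sketch: `fill_and_click` is a hypothetical helper, and the elements passed in are assumed to have been located already with one of the methods from section 2.1.

```python
def fill_and_click(input_box, button, text):
    """Replace the contents of an input box, then click a button."""
    input_box.clear()          # remove any pre-filled text
    input_box.send_keys(text)  # type the new text
    button.click()             # trigger the action, e.g. log in or send
```

If the input box sits inside a form, calling `input_box.submit()` is an alternative to the final click.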

2.3 Note

Because the Chrome instance opened by Selenium starts with default settings, visiting the Weibo homepage triggers a pop-up asking whether to allow notifications, which prevents the input box from being located. The pop-up can be disabled with the following preference:

prefs = {"profile.default_content_setting_values.notifications": 2}

2.4 How to locate elements

Right-click the element you want to locate and select Inspect to bring up the Chrome Developer Tools.

To get the XPath, click the small arrow (select element) in the upper left corner of the Developer Tools and then click the part of the page you are interested in. The Developer Tools jump to the source code of the corresponding element. Right-click that source code and select Copy -> Copy XPath.

This gives you the XPath of the element.

In addition: you can install the XPath Helper plug-in. After installation, right-click the element you want to extract on the web page and select Inspect; the Developer Tools open and show the HTML code. Right-click the highlighted code and choose Copy -> Copy XPath to get the XPath value.
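
To see how such a path is read, here is a small sketch that runs without a browser. It uses the standard library's `xml.etree.ElementTree`, which supports only a subset of XPath, and the markup is invented for illustration; it is not Weibo's real page structure.

```python
import xml.etree.ElementTree as ET

# Invented markup, loosely shaped like a page with a post box.
SAMPLE_PAGE = """
<html>
  <body>
    <div id="homeWrap">
      <div>
        <textarea></textarea>
        <button>Send</button>
      </div>
    </div>
  </body>
</html>
"""

root = ET.fromstring(SAMPLE_PAGE)

# Start from the element with a known id, then walk down child by child.
post_box = root.find(".//*[@id='homeWrap']/div/textarea")
send_button = root.find(".//*[@id='homeWrap']/div/button")

print(post_box.tag)      # textarea
print(send_button.text)  # Send
```

Paths anchored on an id, as here, tend to survive page changes better than long chains of positional steps such as div[1]/div/div[4].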

3. Complete code

Implementation idea: the steps are the same as when we do this by hand, except that Selenium performs the whole process by simulating clicks and input. The flow is: open the login page -> enter the account and password -> click the login button -> enter the content in the Weibo post box -> click the send button -> close the browser (optional).

3.1 At present, login protection may be triggered when the account logs in automatically. If it is, scan the QR code to verify.

from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
import time


def post_weibo(content, username, password):
    '''
    Post to Weibo automatically
    content: text to send
    username: Weibo account
    password: Weibo password
    '''
    # Load the Chrome driver
    path = r'C:/MyEnv/chromedriver.exe'  # where the driver is stored
    ser = Service(path)
    chrome_options = webdriver.ChromeOptions()
    # Block the "allow notifications" pop-up
    prefs = {"profile.default_content_setting_values.notifications": 2}
    chrome_options.add_experimental_option("prefs", prefs)
    driver = webdriver.Chrome(service=ser, options=chrome_options)
    driver.maximize_window()  # maximize the window so elements are not hidden

    print('# Open the Weibo login page')
    url = 'http://weibo.com/login.php'
    driver.get(url)
    time.sleep(5)  # wait for the page to load completely

    print('# Find the username and password input boxes')
    # find_element(By.X, ...) replaces the find_element_by_x() helpers removed in Selenium 4.3
    input_account = driver.find_element(By.ID, 'loginname')  # username input box
    input_psw = driver.find_element(By.CSS_SELECTOR, 'input[type="password"]')  # password input box
    # Enter the username and password
    input_account.send_keys(username)
    input_psw.send_keys(password)

    print('# Find the login button //div[@node-type="normal_form"]//div[@class="info_list login_btn"]/a')
    bt_login = driver.find_element(By.XPATH, '//div[@node-type="normal_form"]//div[@class="info_list login_btn"]/a')
    bt_login.click()  # click to log in
    # Wait for the page to load; some accounts need login protection, so scan the QR code now
    time.sleep(40)

    # After login the homepage is shown by default, with the Weibo post box
    print('# Find the text box and enter the content //*[@id="homeWrap"]/div[1]/div/div[1]/div/textarea')
    weibo_content = driver.find_element(By.XPATH, '//*[@id="homeWrap"]/div[1]/div/div[1]/div/textarea')
    weibo_content.send_keys(content)
    print('# Click the send button //*[@id="homeWrap"]/div[1]/div/div[4]/div/button')
    bt_push = driver.find_element(By.XPATH, '//*[@id="homeWrap"]/div[1]/div/div[4]/div/button')
    bt_push.click()  # click to publish
    time.sleep(15)

    driver.quit()  # close the browser and stop the driver process

if __name__ == '__main__':
    username = 'your Weibo username'
    password = 'your Weibo password'
    # Post to Weibo automatically
    content = 'A little progress every day'
    post_weibo(content, username, password)
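
The fixed time.sleep() calls in the script wait the full time even when the page is ready sooner, and fail if the page takes longer. Selenium ships `WebDriverWait` for this purpose; the sketch below shows the underlying idea with a hand-rolled polling loop, so that it can be tried without a browser. `wait_for` is a hypothetical helper, not a Selenium function.

```python
import time


def wait_for(condition, timeout=40.0, interval=0.5):
    """Call condition() until it returns a truthy value or `timeout` seconds pass.

    Exceptions raised by condition() (for example "element not found")
    are treated as "not ready yet".
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            result = condition()
            if result:
                return result
        except Exception:
            pass  # not ready yet, keep polling
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(interval)
```

With a live driver, the wait after login could then be written as `wait_for(lambda: driver.find_element("xpath", '//*[@id="homeWrap"]/div[1]/div/div[1]/div/textarea'))`, which returns as soon as the post box appears.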

Logging in with cookies lets you skip the QR-code scan. When the cookies expire, simply obtain them again.

Import third-party packages

from selenium import webdriver
from selenium.webdriver.chrome.service import Service
import time
import requests
import json

Save cookies locally

Here, Selenium's get_cookies function is used to obtain the cookies.

# Save cookies to a local file
def get_cookies(driver):
    driver.get('https://weibo.com/login.php')
    time.sleep(20)  # leave time to scan the QR code
    cookies = driver.get_cookies()  # list of cookie dicts
    js_cookies = json.dumps(cookies)  # serialize to a string for saving
    with open('cookies.txt', 'w', encoding='utf8') as f:
        f.write(js_cookies)
    print('Cookies have been rewritten!')


# Read the local cookies
def read_cookies():
    with open('cookies.txt', 'r', encoding='utf8') as f:
        saved_cookies = json.loads(f.read())
    cookies = []
    for cookie in saved_cookies:
        # Keep only keys that Selenium's add_cookie() documents
        cookie_dict = {
            'domain': '.weibo.com',
            'name': cookie.get('name'),
            'value': cookie.get('value'),
            'path': '/',
            'httpOnly': False,
            'secure': False
        }
        cookies.append(cookie_dict)
    return cookies

Complete code: log in to Weibo with cookies and post text

from selenium.webdriver.common.by import By  # needed for find_element(By.XPATH, ...)


# Initialize the browser and open the Weibo login page
def init_browser():
    path = r'C:/MyEnv/chromedriver.exe'  # where the driver is stored
    ser = Service(path)
    chrome_options = webdriver.ChromeOptions()
    # Block the "allow notifications" pop-up
    prefs = {"profile.default_content_setting_values.notifications": 2}
    chrome_options.add_experimental_option("prefs", prefs)
    driver = webdriver.Chrome(service=ser, options=chrome_options)
    driver.maximize_window()
    driver.get('https://weibo.com/login.php')
    return driver


# Read the cookies and log in to Weibo
def login_weibo(driver):
    cookies = read_cookies()
    for cookie in cookies:
        driver.add_cookie(cookie)
    time.sleep(3)
    driver.refresh()  # reload the page so the cookies take effect

# Post to Weibo
def post_weibo(content, driver):
    time.sleep(5)
    weibo_content = driver.find_element(By.XPATH, '//*[@id="homeWrap"]/div[1]/div/div[1]/div/textarea')
    weibo_content.send_keys(content)
    bt_push = driver.find_element(By.XPATH, '//*[@id="homeWrap"]/div[1]/div/div[4]/div/button')
    bt_push.click()  # click to publish
    time.sleep(5)
    driver.quit()  # close the browser and stop the driver process


if __name__ == '__main__':
    # Log in to Weibo with cookies
    driver = init_browser()
    login_weibo(driver)
    # Post to Weibo automatically
    content = 'The weather is really nice today~'
    post_weibo(content, driver)


Extension: Check the validity of cookies

Detection method: send a GET request to Weibo with the local cookies. If the returned page source contains your own Weibo nickname, the cookies are still valid; otherwise they have expired.

Your own Weibo nickname only appears in the page when you are logged in.

# Check whether the cookies are still valid
def check_cookies():
    # Read the local cookies
    cookies = read_cookies()
    s = requests.Session()
    for cookie in cookies:
        s.cookies.set(cookie['name'], cookie['value'])
    response = s.get("https://weibo.com")
    html_t = response.text
    # Check whether the page contains my Weibo nickname (replace with your own)
    return '老表max' in html_t
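
A cheaper first check is possible before sending any request. Cookies returned by Selenium's get_cookies() normally carry an `expiry` field (seconds since the epoch) unless they are session cookies. The sketch below reads that field; it assumes you pass in the raw list saved in cookies.txt rather than the trimmed dictionaries built by read_cookies(), and `has_expired_cookie` is a hypothetical helper.

```python
import time


def has_expired_cookie(cookies, now=None):
    """Return True if any cookie with an 'expiry' timestamp is already past it."""
    now = time.time() if now is None else now
    for cookie in cookies:
        expiry = cookie.get("expiry")  # absent for session cookies
        if expiry is not None and expiry <= now:
            return True
    return False
```

Weibo may invalidate a session before its cookies expire, so the request-based check remains the one to trust; this one only catches the obvious case early.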

Extension: Post automatically on a schedule every day

You can refer to the previous article: How to use Python to send alarm notifications to DingTalk?

How to set up a daemon process is also covered in that article.

from apscheduler.schedulers.blocking import BlockingScheduler

'''
Post one Weibo at 9:00 every morning
'''
def every_day_nine():
    # Log in to Weibo with cookies
    driver = init_browser()
    login_weibo(driver)
    req = requests.get('https://hitokoto.open.beeapi.cn/random')
    get_sentence = req.json()
    content = f'[Daily quote] {get_sentence["data"]} From: Hitokoto API'
    # Post to Weibo automatically
    post_weibo(content, driver)


# Use the BlockingScheduler scheduler
sched = BlockingScheduler(timezone='Asia/Shanghai')

# Run every_day_nine once a day at 9 a.m. for the daily post
sched.add_job(every_day_nine, 'cron', hour=9)

# Start the scheduled task (blocks until the process is stopped)
sched.start()


Origin blog.csdn.net/BYGFJ/article/details/124115104