Using Selenium in an Ubuntu docker container on a Tencent Cloud server running CentOS 6.8 (Linux environment)

1. Explanation

The title probably looks confusing at first glance; it confused me as well once I had written it. What I wanted was to use Selenium for some work in a Linux environment. I originally planned to create an Anaconda container with docker and run Selenium there, but chromedriver could not drive Google Chrome in it, so I created an Ubuntu container with docker instead. As for why I don't use the CentOS system directly: the third-party libraries in the native environment would not cooperate with my code, and I was also worried about interfering with other things on the server, so I did everything in a container.
I also assume that you are familiar with docker and shell commands. If you are not, I recommend reading up on docker and the shell first.

2. Linux commands used during the setup

1. Preliminary preparation (image, container)

1.1 Search for the Ubuntu image

docker search ubuntu

1.2 Pull the highest-rated ubuntu image

docker pull ubuntu

1.3 View all images on the server

docker images

1.4 Once the image has downloaded, start the container (configure the host path yourself)

docker run -itd --privileged --name ubuntu -p 9201:9200 -v /host/path:/ubuntu/python ubuntu /bin/bash

Here I mount a host folder into the container. Without the mount, my Python code is not available inside the container. All files uploaded later also go into the mounted folder.
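
Because of the -v mapping, every file has two paths: one on the host and one inside the container. The following is a minimal sketch of that translation, not part of the original setup. The function name to_container_path and the host directory /srv/selenium are made up for illustration; only /ubuntu/python comes from the docker run command above.

```python
from pathlib import PurePosixPath


def to_container_path(host_path, host_mount, container_mount="/ubuntu/python"):
    """Translate a path under the host mount into the path seen inside the container.

    Raises ValueError when host_path is not inside host_mount.
    """
    relative = PurePosixPath(host_path).relative_to(PurePosixPath(host_mount))
    return str(PurePosixPath(container_mount) / relative)


if __name__ == "__main__":
    # /srv/selenium is a hypothetical host directory; replace it with your own.
    print(to_container_path("/srv/selenium/chromedriver", "/srv/selenium"))
```

Code that runs inside the container must always use the container-side path.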

2. Install the Ubuntu packages and Python3

2.1 Enter the Ubuntu container

docker exec -it ubuntu /bin/bash

2.2 Update the package sources

apt-get update

2.3 Install apt-utils

apt-get install -y apt-utils

2.4 Install python3

apt-get install -y python3 python3-dev python3-setuptools

2.5 Install the pip tool for python3

apt-get install -y python3-pip

2.6 Update pip to a newer version

pip3 install --upgrade pip -i https://pypi.tuna.tsinghua.edu.cn/simple/

2.7 Install Ubuntu dependencies, part 1 (compiler and build tools)

apt-get install -y gcc make build-essential

2.8 Install Ubuntu dependencies, part 2 (development libraries)

apt-get install -y libbz2-dev libncurses5-dev libgdbm-dev liblzma-dev sqlite3 libsqlite3-dev openssl libssl-dev tcl8.6-dev tk8.6-dev libreadline-dev zlib1g-dev curl

2.9 Install dependencies, part 3 (upgrade setuptools)

pip3 install --upgrade setuptools -i https://pypi.tuna.tsinghua.edu.cn/simple/

2.10 Install the required Python third-party libraries (upload the file yourself)

The third-party libraries that the Python code needs are all listed in requirements.txt, which avoids version mismatches. First upload requirements.txt to the mounted folder on the server through xftp, then enter the mounted folder and run the command; otherwise it will not work.

pip3 install -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple/
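
A requirements.txt only protects against version mismatches if its entries pin exact versions. As a rough sketch (the function name unpinned_requirements and the sample file content are illustrative assumptions, not taken from the original project), the file can be checked for entries without `==` before installing:

```python
def unpinned_requirements(text):
    """Return the requirement lines that do not pin an exact version with '=='."""
    loose = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments and surrounding whitespace
        if not line or line.startswith("-"):  # skip blank lines and pip options such as -i
            continue
        if "==" not in line:
            loose.append(line)
    return loose


if __name__ == "__main__":
    # Made-up sample content; read your own requirements.txt instead.
    sample = "selenium==4.9.0\nrequests\n# a comment\nlxml>=4.9\n"
    print(unpinned_requirements(sample))
```

Any line the function reports may resolve to a different version on another machine.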

3. Install Google Chrome

3.1 Install the wget and gnupg2 tools

apt-get install -y wget gnupg2

3.2 Reference URL

Reference document: Install 64-bit Google Chrome browser under ubuntu16.04

3.3 Add the download source to the source list of the system

wget http://www.linuxidc.com/files/repo/google-chrome.list -P /etc/apt/sources.list.d/

3.4 Import the public key of Google software

wget -q -O - https://dl.google.com/linux/linux_signing_key.pub | apt-key add -

3.5 Update again

apt-get update

3.6 Install Google Chrome (stable version)

apt-get install google-chrome-stable

3.7 View Google Chrome version

google-chrome --version

If the version number of Google Chrome is printed, the installation succeeded.
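
The printed version matters later, because the driver has to match the browser's major version. Below is a small sketch for pulling the major version out of the `--version` output. The function name major_version is made up, and the version strings are sample inputs, not the versions from my server.

```python
import re


def major_version(version_output):
    """Extract the major version from `--version` output, or None if absent."""
    match = re.search(r"(\d+)\.\d+\.\d+\.\d+", version_output)
    return int(match.group(1)) if match else None


if __name__ == "__main__":
    # Sample strings in the shape that the two tools print.
    chrome = major_version("Google Chrome 113.0.5672.126")
    driver = major_version("ChromeDriver 113.0.5672.63 (abcdef)")
    print(chrome, driver, chrome == driver)
```

The same function works for the output of `chromedriver --version`, so the two major versions can be compared directly.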

4. Install ChromeDriver

4.1 ChromeDriver URL (find the driver yourself)

ChromeDriver link: on the ChromeDriver download page, find the driver that matches your version of Google Chrome and select the linux build. Download it to your local machine, upload it to the mounted folder on the server through xftp, and then change the driver path in the Python code. After that you can test.

4.2 Give chromedriver execute permission

Enter the mounted folder first and then set the permission; otherwise the file cannot be found.

chmod +x chromedriver
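
To confirm that the permission took effect before starting Selenium, a check along these lines can help. This is only a sketch; the function name driver_is_ready is an assumption, and the path is the container-side path used in the test code.

```python
import os


def driver_is_ready(path):
    """True when `path` is a regular file that the current user may execute."""
    return os.path.isfile(path) and os.access(path, os.X_OK)


if __name__ == "__main__":
    # Container-side path of the driver, as used in the test code.
    print(driver_is_ready("/ubuntu/python/chromedriver"))
```

A result of False means either the file is not in the mounted folder or chmod +x has not been applied yet.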

5. Test

5.1 Upload the python code to the mounted folder via xftp

5.2 Give the Python file execute permission (upload the code yourself)

Enter the mounted folder first and then set the permission; otherwise the file cannot be found.

chmod u+x test.py

5.3 Run the test code

First enter the mounted folder where test.py is stored and then run the command; otherwise it will not work. The command writes its output to a log file named log.log in that folder.

nohup python3 -u test.py > log.log 2>&1 &

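Roughly, this shell command runs the Python code in the background (&), keeps it running after you log out (nohup), writes both standard output and errors to log.log (> log.log 2>&1), and turns off output buffering (-u) so that the log is updated promptly. You can find the details of each part online.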
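
For readers who prefer to start the script from Python, the sketch below is a rough equivalent of the shell command. It is an approximation under stated assumptions, not what I ran on the server: the function name launch_in_background is made up, and start_new_session is used as a stand-in for nohup (POSIX only).

```python
import subprocess
import sys


def launch_in_background(script, log_path):
    """Start `script` detached from the terminal, sending stdout and stderr to `log_path`."""
    log = open(log_path, "w")            # like `> log.log`: the file is created or truncated
    proc = subprocess.Popen(
        [sys.executable, "-u", script],  # -u: unbuffered output, as in the shell command
        stdout=log,
        stderr=subprocess.STDOUT,        # like `2>&1`
        stdin=subprocess.DEVNULL,
        start_new_session=True,          # new session, detached from the calling terminal
    )
    log.close()                          # the child keeps its own copy of the file handle
    return proc


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: python3 launch.py test.py log.log
    print(launch_in_background(sys.argv[1], sys.argv[2]).pid)
```

The function returns immediately; the returned object exposes the process id, which is useful for stopping the script later.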

6. Test code

#!/usr/bin/python3
# coding: utf-8
# Browser driver
from selenium import webdriver
# Service wraps the chromedriver executable
from selenium.webdriver.chrome.service import Service

ch_options = webdriver.ChromeOptions()
# Configure headless mode for Chrome
ch_options.add_argument("--headless")
# Chrome refuses to run as root without this flag
ch_options.add_argument("--no-sandbox")
ch_options.add_argument("--disable-gpu")
# Avoid relying on the small /dev/shm that containers get by default
ch_options.add_argument("--disable-dev-shm-usage")
# Pass the options when starting the browser. The driver path is the path
# inside the container, not the path on the host.
dr = webdriver.Chrome(service=Service("/ubuntu/python/chromedriver"), options=ch_options)
try:
    # Test site
    url = "https://www.baidu.com"
    dr.get(url)
    # Print the page source
    print(dr.page_source)
finally:
    # Close the browser and stop chromedriver
    dr.quit()
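
Printing the whole page source produces a long log. As an optional sketch (the helper extract_title is my own illustration and not part of the test code above), the page title can be extracted from the source with the standard library and used as a shorter success check:

```python
from html.parser import HTMLParser


class _TitleParser(HTMLParser):
    """Collects the text found inside the <title> element."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def extract_title(page_source):
    """Return the stripped <title> text of an HTML document, or '' if there is none."""
    parser = _TitleParser()
    parser.feed(page_source)
    return parser.title.strip()


if __name__ == "__main__":
    print(extract_title("<html><head><title> Example </title></head></html>"))
```

In the test script this would be called as extract_title(dr.page_source); an empty result suggests that the page did not load.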

3. Digression

I have gone through these steps one by one and pointed out every place that needs attention; everything else can be run as written and should succeed. I also tried the same setup on ubuntu20.04 without any major problems. I worked this out only recently, when I ran into the problem myself. The process is cumbersome, but it works. I cannot guarantee that it works for everyone, but if you hit a problem, send feedback promptly and we can work it out together.

Origin blog.csdn.net/qq_46106857/article/details/130531481