Quick notes: Airflow installation and deployment

Parts of the installation follow this reference: Linux Virtual Machine: Building a Basic Environment for Big Data Clusters (Hadoop, Spark, Flink, Hive, Zookeeper, Kafka, Nginx)

1. Python installation

The version installed here is Python 3.9, built from the source package

  • Download the source package (e.g. via wget)
wget https://www.python.org/ftp/python/3.9.6/Python-3.9.6.tgz
  • Unzip to the specified directory
tar -zxvf Python-3.9.6.tgz
  • Install the build dependencies
sudo yum -y install vim unzip net-tools wget bzip2
sudo yum -y install zlib-devel bzip2-devel openssl-devel ncurses-devel sqlite-devel readline-devel tk-devel gdbm-devel db4-devel libpcap-devel xz-devel
sudo yum -y install libglvnd-glx
sudo yum -y install gcc gcc-c++
  • Configure the build
cd Python-3.9.6
./configure --prefix=/xxx/program/python3
  • Compile and install
make && make install
  • Configure environment variables or put the python3 soft link in /usr/bin
sudo ln -s /xx/xx/python3.9 /usr/bin/python3.9
sudo ln -s /xx/xx/pip3.9 /usr/bin/pip3.9
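A quick sanity check after `make install` (assuming the symlinks above were created) — importing ssl and sqlite3 confirms the corresponding -devel headers were present at build time, which pip and Airflow both rely on:

```shell
# Verify the interpreter resolves and was built with ssl/sqlite support
python3.9 --version                  # should print Python 3.9.6
python3.9 -c 'import ssl, sqlite3'   # fails if openssl-devel/sqlite-devel were missing at build time
pip3.9 --version
```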

2. Airflow installation

  • update pip
pip3.9 install --upgrade pip -i http://pypi.douban.com/simple/ --trusted-host pypi.douban.com
  • update setuptools
pip3.9 install --upgrade setuptools -i http://pypi.douban.com/simple/ --trusted-host pypi.douban.com
  • install airflow
pip3.9 install apache-airflow -i http://pypi.douban.com/simple/ --trusted-host pypi.douban.com

  • Put the airflow command on the PATH
sudo ln -s /xxx/python-3.9.6/bin/airflow /usr/bin/airflow
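As a hedge against dependency conflicts, the official Airflow docs recommend installing against a constraints file pinned to the Airflow and Python versions. A sketch — the 2.5.1 pin below is an example, adjust it to the release you want:

```shell
# Constraint-based install as recommended by the Airflow docs
# (AIRFLOW_VERSION is an example pin; change it to the release you need)
AIRFLOW_VERSION=2.5.1
PYTHON_VERSION=3.9
CONSTRAINT_URL="https://raw.githubusercontent.com/apache/airflow/constraints-${AIRFLOW_VERSION}/constraints-${PYTHON_VERSION}.txt"
pip3.9 install "apache-airflow==${AIRFLOW_VERSION}" --constraint "${CONSTRAINT_URL}"
airflow version   # confirms the symlinked command resolves
```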

3. Airflow configuration

Here MySQL 5.7 installed via Docker is used. For details, see: Linux Virtual Machine: Building a Basic Environment for Big Data Clusters (Hadoop, Spark, Flink, Hive, Zookeeper, Kafka, Nginx)
Official document: Setting Up a MySQL Database

  • Install the MySQL connection driver
pip3.9 install mysql-connector-python -i http://pypi.douban.com/simple/ --trusted-host pypi.douban.com
  • Modify the airflow.cfg file
    • This file is in the home directory of the corresponding user, i.e. ~/airflow/airflow.cfg
    • Find the [database] section, comment out the default sqlite connection, and add the MySQL configuration
sql_alchemy_conn = mysql+mysqlconnector://root:123456@hybrid03:3306/airflow_db

  • Create a MySQL database
create database airflow_db character set utf8mb4 collate utf8mb4_unicode_ci;
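Since MySQL runs inside Docker, the database can also be created from the host in one shot. The container name mysql and the root password are taken from the connection string above; adjust both to your setup:

```shell
# Create the metadata database from the host; "mysql" is the container name assumed above
docker exec -i mysql mysql -uroot -p123456 -e \
  "CREATE DATABASE IF NOT EXISTS airflow_db CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;"
```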
  • Modify the configuration file my.cnf
    • For MySQL 5.7 installed via Docker, my.cnf is at /etc/my.cnf inside the container. Since the container has no vi/vim, copy it to the mounted configuration folder, edit it on the host, and then overwrite the original
    • Add the following configuration under [mysqld] in my.cnf, then restart MySQL (docker restart mysql)
explicit_defaults_for_timestamp=1
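The copy-out/edit/copy-back workflow can be scripted; a sketch assuming the container is named mysql — the sed line inserts the setting right after the [mysqld] header:

```shell
# Copy the config out of the container, patch it on the host, copy it back, restart
docker cp mysql:/etc/my.cnf ./my.cnf
sed -i '/^\[mysqld\]/a explicit_defaults_for_timestamp=1' ./my.cnf
docker cp ./my.cnf mysql:/etc/my.cnf
docker restart mysql
```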
  • Initialize airflow db
airflow db init
  • Switch to the local executor
    • In airflow.cfg, comment out the original executor setting and set executor = LocalExecutor

  • create user
airflow users create --username admin --firstname admin --lastname admin --role Admin --email [email protected]
  • start up
airflow webserver -p 8080 -D
airflow scheduler -D
  • stop
ps -ef | egrep 'scheduler|airflow-webserver'| grep -v grep | awk '{print $2}' | xargs kill -15
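When started with -D, Airflow also writes pid files under ~/airflow, so an alternative to the ps/grep pipeline is to kill by pid file. The file names below are assumed from default daemon behavior and may differ across Airflow versions:

```shell
# Stop the daemons via the pid files written by -D mode
for f in ~/airflow/airflow-webserver.pid ~/airflow/airflow-scheduler.pid; do
  if [ -f "$f" ]; then
    kill -15 "$(cat "$f")"   # SIGTERM, same signal as the one-liner above
  fi
done
```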

4. Pitfalls encountered

  • The installed airflow command lives in the bin directory under the Python installation prefix. If that directory is not on the PATH (and no symlink was created), the shell reports the command as not found
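To avoid per-command symlinks entirely, the Python prefix's bin directory can be put on the PATH once (the /xxx prefix is the placeholder used during the configure step above):

```shell
# Persist the bin directory on PATH instead of symlinking each command
echo 'export PATH=/xxx/program/python3/bin:$PATH' >> ~/.bashrc
source ~/.bashrc
```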
  • Initially setuptools was not updated, and some packages failed with error: subprocess-exited-with-error. After upgrading setuptools, the installation went through cleanly:
Collecting unicodecsv>=0.14.1
  Downloading http://pypi.doubanio.com/packages/6f/a4/691ab63b17505a26096608cc309960b5a6bdf39e4ba1a793d5f9b1a53270/unicodecsv-0.14.1.tar.gz (10 kB)
  Preparing metadata (setup.py) ... error
  error: subprocess-exited-with-error

  × python setup.py egg_info did not run successfully.
  │ exit code: 1
  ╰─> [1 lines of output]
      ERROR: Can not execute `setup.py` since setuptools is not available in the build environment.
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed

× Encountered error while generating package metadata.
╰─> See above for output.

note: This is an issue with the package mentioned above, not pip.
hint: See above for details.

  • Running the airflow command reports: airflow.exceptions.AirflowConfigException: error: sqlite C library version too old (< 3.15.0)
    • Airflow uses sqlite as its database by default, and the system's bundled sqlite is too old. It can be upgraded manually if you want to keep sqlite, but switching to MySQL is generally recommended

  • sqlalchemy.exc.ProgrammingError: (mysql.connector.errors.ProgrammingError) 1067 (42000): Invalid default value for 'updated_at'
    • Reason: the 'updated_at' field is a timestamp column whose default is null, which MySQL rejects unless explicit_defaults_for_timestamp is enabled
    • Fix: modify MySQL's my.cnf as described in the steps above

Origin blog.csdn.net/baidu_40468340/article/details/128920523