By default, Airflow uses a SQLite database to store metadata. SQLite does not support multiple connections, so out of the box Airflow runs with the SequentialExecutor, which executes only one task at a time. Here I will use MySQL to enable parallel execution.
MySQL Installation Guide ==> MySQL-Setup
Airflow-MySQL Settings:
- Open a terminal and execute
mysql -u root -p
- mysql> create database airflow;
- mysql> create user 'airflow'@'localhost' identified by 'airflow';
- mysql> GRANT ALL PRIVILEGES ON airflow.* TO 'airflow'@'localhost';
- mysql> FLUSH PRIVILEGES;
- Airflow needs a home; ~/airflow is the default, but you can base it elsewhere if you want
export AIRFLOW_HOME=~/airflow
- Install Airflow using pip
sudo pip install apache-airflow
- Create a subfolder for your DAGs
mkdir ~/airflow/dags
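To check later that this folder is picked up, you can drop a minimal DAG file into it. This is just a sketch: the file name hello_dag.py, the dag_id, and the task are made up for illustration, and the import path assumes the Airflow 1.x line installed above.

# hello_dag.py -- hypothetical minimal DAG for ~/airflow/dags
from datetime import datetime

from airflow import DAG
from airflow.operators.bash_operator import BashOperator  # Airflow 1.x path

dag = DAG(
    dag_id="hello_dag",               # appears under this name in the UI
    start_date=datetime(2019, 1, 1),
    schedule_interval="@daily",
)

say_hello = BashOperator(
    task_id="say_hello",
    bash_command="echo hello",        # any shell command works here
    dag=dag,
)

Once the scheduler is running (see below), the DAG shows up in the web UI.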
Change the Airflow configuration for parallel execution:
- Open airflow.cfg, which is in your Airflow home
- Change the executor and the database connection:
executor = LocalExecutor
sql_alchemy_conn = mysql://airflow:airflow@localhost:3306/airflow
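Before initializing, it can help to sanity-check the connection string, since Airflow talks to the database through SQLAlchemy. A minimal sketch, assuming a MySQL driver such as mysqlclient is installed:

# check_conn.py -- hypothetical sanity check for sql_alchemy_conn
from sqlalchemy import create_engine, text

engine = create_engine("mysql://airflow:airflow@localhost:3306/airflow")
with engine.connect() as conn:
    # prints 1 if the user, password, and database are all correct
    print(conn.execute(text("SELECT 1")).scalar())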
- Initialize the database
airflow initdb
- Start the web server, the default port is 8080
airflow webserver -D
- Start the scheduler
airflow scheduler -D
- Visit localhost:8080 in your browser.
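To confirm that the LocalExecutor actually runs tasks in parallel, you can add a DAG with two independent branches. The parallel_demo name and the tasks here are hypothetical, again assuming Airflow 1.x import paths:

# parallel_demo.py -- hypothetical DAG with two independent branches;
# under LocalExecutor both sleeps run at the same time
from datetime import datetime

from airflow import DAG
from airflow.operators.bash_operator import BashOperator  # Airflow 1.x path

dag = DAG(
    dag_id="parallel_demo",
    start_date=datetime(2019, 1, 1),
    schedule_interval=None,           # only runs when triggered manually
)

start = BashOperator(task_id="start", bash_command="echo start", dag=dag)
branch_a = BashOperator(task_id="branch_a", bash_command="sleep 10", dag=dag)
branch_b = BashOperator(task_id="branch_b", bash_command="sleep 10", dag=dag)

start >> [branch_a, branch_b]         # both branches become runnable together

Trigger it with airflow trigger_dag parallel_demo; with the default SequentialExecutor the two sleeps would run back to back, but here they overlap.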