First, what is DataX
DataX is an open-source offline data synchronization tool from Alibaba. It is built as a framework and supports synchronizing data between heterogeneous data sources.
With it, users can easily synchronize data between structured sources (MySQL, SQL Server, Oracle ...) and unstructured sources (MongoDB, Hive ...).
Second, using DataX
2.1 System Environment
- Linux
- JDK (1.8 or higher, 1.8 is recommended)
- Python (Python 2.6.x is recommended)
- Apache Maven 3.x (to compile DataX)
- git
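The list above can be checked quickly before going further. The following is a minimal sketch, not part of DataX itself: check_tools is a hypothetical helper name, and the command names (java, python, mvn, git) are assumed to match how these tools are installed on your machine. It only checks that the commands exist on PATH, not their versions.

```shell
# Hypothetical helper: report which of the given commands are missing from PATH.
# Prints one line per missing tool and returns non-zero if any is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Command names are assumptions based on the list above; adjust as needed.
check_tools java python mvn git || echo "Install the missing tools before continuing."
```

Maven and git are only needed when compiling from source, so for the direct-download method it is enough to run check_tools java python.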
2.2 Deployment
Method one: download the DataX package directly
After downloading, extract the package to a local directory and go into the bin directory. From there you can run a sync job:
cd {YOUR_DATAX_HOME}/bin
python datax.py {YOUR_JOB.json}
Self-test script:
python {YOUR_DATAX_HOME}/bin/datax.py {YOUR_DATAX_HOME}/job/job.json
If the log prints normally, DataX is ready to use.
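The self-test can also be wrapped in a small script that first confirms the directory layout. This is only a sketch under stated assumptions: run_selftest, DATAX_HOME and PYTHON are hypothetical names introduced here, and the sketch assumes that datax.py returns a non-zero exit status when the job fails, so it is still worth reading the log.

```shell
# Hypothetical wrapper around the self-test command shown above.
# $1 is the DataX install directory (the {YOUR_DATAX_HOME} placeholder).
# PYTHON may be set to choose the interpreter; it defaults to "python".
run_selftest() {
    home=${1%/}   # drop one trailing slash, if present
    if [ ! -f "$home/bin/datax.py" ] || [ ! -f "$home/job/job.json" ]; then
        echo "not a DataX directory: $home" >&2
        return 2
    fi
    # Assumption: the exit status of datax.py reflects whether the job succeeded.
    "${PYTHON:-python}" "$home/bin/datax.py" "$home/job/job.json"
}

# Runs only when DATAX_HOME is set in the environment.
if [ -n "${DATAX_HOME:-}" ]; then
    run_selftest "$DATAX_HOME"
fi
```

Checking for bin/datax.py and job/job.json up front gives a clearer message than a Python "file not found" error when the path is wrong.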
Method two: download the source code and compile it yourself
1) Download the source code (if git is not installed, look up how to install it yourself)
git clone https://github.com/alibaba/DataX.git
2) Package with Maven (if Maven is not installed, look up how to install it yourself)
cd {DataX_source_code_home}
mvn -U clean package assembly:assembly -Dmaven.test.skip=true
Reference
1. DataX on GitHub - https://github.com/alibaba/DataX