DataX first experience

First, what is DataX

  DataX is an offline data synchronization tool open-sourced by Alibaba. It is built as a framework plus plugins and supports synchronizing data between heterogeneous data sources.

  With it, data can be synchronized easily between structured sources (MySQL, SQL Server, Oracle ...) and unstructured sources (MongoDB, Hive ...).

Second, using DataX

2.1 System Environment

  • Linux
  • JDK (1.8 or higher, 1.8 is recommended)
  • Python (Python 2.6.x recommended)
  • Apache Maven 3.x (to compile DataX)
  • git
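Before going further, it can help to confirm that the tools listed above are actually on the PATH. The following is a minimal sketch; the function name check_tools is made up for this example, and it only checks that each command exists, not which version is installed.

```shell
#!/bin/sh
# Report which of the given tools are available on PATH.
# check_tools is an illustrative helper, not part of DataX.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $tool"
        else
            echo "missing: $tool"
            missing=$((missing + 1))
        fi
    done
    # Exit status is the number of missing tools (0 means all present).
    return "$missing"
}

# Tool names follow the environment list above.
check_tools java python mvn git || echo "install the missing tools before continuing"
```

Versions still need a manual look, for example with java -version and python --version.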

 

2.2 Deployment

Method one: download the DataX toolkit directly

  DataX download

After downloading, unzip the package to a local directory and go to the bin directory; from there you can run a sync job:

cd  {YOUR_DATAX_HOME}/bin
python datax.py {YOUR_JOB.json}
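The job file passed to datax.py is a JSON document that names one reader and one writer. As a rough sketch, the script below writes a small job that generates records and prints them to the console. The plugin names (streamreader, streamwriter) and their parameters are written from memory of the DataX quick-start material, so treat them as assumptions and check them against the plugin documentation shipped with your version. The JOB_FILE path is a placeholder chosen for this example.

```shell
#!/bin/sh
# Write a minimal DataX job file.
# JOB_FILE is a hypothetical location; override it in the environment if needed.
JOB_FILE="${JOB_FILE:-${TMPDIR:-/tmp}/stream2stream.json}"

# Assumed layout: job.content[] holds reader/writer pairs, job.setting holds tuning.
cat > "$JOB_FILE" <<'EOF'
{
  "job": {
    "content": [
      {
        "reader": {
          "name": "streamreader",
          "parameter": {
            "sliceRecordCount": 10,
            "column": [
              { "type": "long", "value": "10" },
              { "type": "string", "value": "hello, DataX" }
            ]
          }
        },
        "writer": {
          "name": "streamwriter",
          "parameter": {
            "encoding": "UTF-8",
            "print": true
          }
        }
      }
    ],
    "setting": {
      "speed": {
        "channel": 1
      }
    }
  }
}
EOF

echo "wrote $JOB_FILE"
```

The generated file can then be passed to datax.py in place of {YOUR_JOB.json}.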

Self-test script 

python {YOUR_DATAX_HOME}/bin/datax.py {YOUR_DATAX_HOME}/job/job.json

 If the log prints normally, DataX is ready to use.

 

Method two: download the source code and compile it yourself

  DataX source download

1) Download the source code (if git is not installed, look up how to install it first)

git clone https://github.com/alibaba/DataX.git

 

2) Package with Maven (if Maven is not installed, look up how to install it first)

cd {DataX_source_code_home}
mvn -U clean package assembly:assembly -Dmaven.test.skip=true
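Once the build finishes, the packaged DataX has to be located before the self-test can run. The sketch below assumes the package ends up under target/datax/datax inside the source tree; that layout is an assumption to verify in your own checkout, and built_datax_home is a helper name invented for this example.

```shell
#!/bin/sh
# Print the DataX home produced by the build, or fail if datax.py is not there.
# The target/datax/datax layout is an assumption, not confirmed by this article.
built_datax_home() {
    src_home=$1
    home="$src_home/target/datax/datax"
    if [ -f "$home/bin/datax.py" ]; then
        echo "$home"
    else
        echo "no datax.py under $home/bin; did the build succeed?" >&2
        return 1
    fi
}

# Usage (the source path is a placeholder):
# DATAX_HOME=$(built_datax_home /path/to/DataX) \
#     && python "$DATAX_HOME/bin/datax.py" "$DATAX_HOME/job/job.json"
```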

 

Reference

1. DataX on GitHub - https://github.com/alibaba/DataX

Origin www.cnblogs.com/chenzl44/p/11449302.html