Creating a CentOS 7 container with Docker and installing DataX

Environment preparation

I chose the CentOS 7 image (docker pull centos:centos7). It already ships with Python 2.7, so only JDK 8 needs to be installed.

Download the JDK from the official Oracle website, taking care to pick the build that matches your architecture (ARM, x86, or x64); mine is x64.

  1. Start the container: docker run -it --name hello_datax centos:centos7
  2. In /opt inside the container, run mkdir modules and mkdir softwares to create the directories for the extracted folders and the compressed packages, respectively
  3. Copy the compressed package into the container (run this on the host): docker cp E:\MyWork\MyDevelopmentTools\Java\jdk-8u271-linux-x64.tar.gz hello_datax:/opt/softwares
  4. Extract it to the target location: tar -zxvf /opt/softwares/jdk-8u271-linux-x64.tar.gz -C /opt/modules/
  5. Add the following environment variables to /root/.bashrc (vi /root/.bashrc) rather than /etc/profile. Settings in /etc/profile stop taking effect each time the container is restarted, since the shell Docker starts is normally not a login shell and so does not read that file
# JAVA_HOME
export JAVA_HOME=/opt/modules/jdk1.8.0_271
export PATH=$PATH:$JAVA_HOME/bin
  6. Reload the configuration with source /root/.bashrc, then run java -version to confirm that JDK 8 is installed
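
The environment variables from step 5 can be sanity-checked with a small script. This is a minimal sketch, not part of the original setup: the function name check_java_env is hypothetical, and it only inspects the variables passed to it rather than running java itself. It avoids Python 3 only features so that it should also run on the Python 2.7 that ships in the image.

```python
import os


def check_java_env(env):
    """Return a list of problems found in the given environment mapping.

    `env` is a dict such as dict(os.environ). Hypothetical helper for
    illustration; it checks only the two variables set in /root/.bashrc.
    """
    problems = []
    java_home = env.get("JAVA_HOME", "")
    if not java_home:
        problems.append("JAVA_HOME is not set")
        return problems
    # PATH entries are separated by ':' on Linux, which is what the container runs.
    path_entries = [p.rstrip("/") for p in env.get("PATH", "").split(":")]
    expected_bin = java_home.rstrip("/") + "/bin"
    if expected_bin not in path_entries:
        problems.append("PATH does not contain " + expected_bin)
    return problems


# Report on the current shell's environment; prints nothing when both
# variables look right.
for problem in check_java_env(dict(os.environ)):
    print(problem)
```

Running it in a fresh shell inside the container (for example after docker start and docker exec) is one way to confirm that the settings in /root/.bashrc survive a restart.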

Download and install

  1. Download DataX
  2. Copy it into the container (run this on the host): docker cp E:\MyWork\MyDevelopmentTools\datax.tar.gz hello_datax:/opt/softwares
  3. Extract it to the target location: tar -zxvf /opt/softwares/datax.tar.gz -C /opt/modules/
  4. Enter the DataX directory: cd /opt/modules/datax
  5. Run the bundled test job: python ./bin/datax.py ./job/job.json
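
The test run in step 5 can also be driven from a script, which may be convenient when several jobs need to be run in a row. This is a minimal sketch, assuming the /opt/modules/datax layout used in this post; the helper names datax_command and run_job are hypothetical, and the only DataX interface it relies on is the command line shown in step 5.

```python
import posixpath
import subprocess

DATAX_HOME = "/opt/modules/datax"  # extraction directory used in this post


def datax_command(job_file, datax_home=DATAX_HOME, python="python"):
    """Build the argument list for: python <datax_home>/bin/datax.py <job_file>."""
    # posixpath is used because the paths refer to the Linux container.
    launcher = posixpath.join(datax_home, "bin", "datax.py")
    return [python, launcher, job_file]


def run_job(job_file, datax_home=DATAX_HOME):
    """Run one job and return the launcher's exit code.

    Treating 0 as success is an assumption based on the usual
    process exit code convention.
    """
    return subprocess.call(datax_command(job_file, datax_home))
```

Inside the container, run_job("/opt/modules/datax/job/job.json") is intended to behave like the command in step 5.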

If Chinese characters appear garbled in the output, see: Solve Chinese Garbled in CentOS7

Writing a test job

  1. Still in the datax directory, create a file under job (vi ./job/stream2stream.json) with the following content:
{
  "job": {
    "content": [
      {
        "reader": {
          "name": "streamreader",
          "parameter": {
            "sliceRecordCount": 10,
            "column": [
              {
                "type": "long",
                "value": "10"
              },
              {
                "type": "string",
                "value": "hello,你好,世界-DataX"
              }
            ]
          }
        },
        "writer": {
          "name": "streamwriter",
          "parameter": {
            "encoding": "UTF-8",
            "print": true
          }
        }
      }
    ],
    "setting": {
      "speed": {
        "channel": 5
      }
    }
  }
}
  2. Run the job: python ./bin/datax.py ./job/stream2stream.json
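
The same job file can be generated instead of typed by hand, which makes it easier to vary the record count or channel number. This is a sketch under stated assumptions: build_stream_job and job_to_json are hypothetical helper names, the structure simply mirrors the JSON above, and the code is written for Python 3 (the container itself only ships Python 2.7, so it would be run on the host or adapted).

```python
import json


def build_stream_job(record_count=10, channels=5, text="hello,你好,世界-DataX"):
    """Return a dict with the same structure as stream2stream.json above."""
    reader = {
        "name": "streamreader",
        "parameter": {
            "sliceRecordCount": record_count,
            "column": [
                {"type": "long", "value": "10"},
                {"type": "string", "value": text},
            ],
        },
    }
    writer = {
        "name": "streamwriter",
        "parameter": {"encoding": "UTF-8", "print": True},
    }
    return {
        "job": {
            "content": [{"reader": reader, "writer": writer}],
            "setting": {"speed": {"channel": channels}},
        }
    }


def job_to_json(job):
    """Serialize the job; ensure_ascii=False keeps the Chinese text readable."""
    return json.dumps(job, ensure_ascii=False, indent=2)


def write_job(path, job):
    """Write the job to `path` as UTF-8, matching the writer's encoding setting."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(job_to_json(job))
```

For example, write_job("stream2stream.json", build_stream_job()) produces a file that can be copied into the job directory with docker cp.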


Origin blog.csdn.net/weixin_44112790/article/details/110182051