Storing replicas of Hadoop2 cluster data in a Hadoop3 cluster

In a Hadoop cluster, HDFS distributes block replicas across different nodes. So if your data lives under an HDFS path on a Hadoop2 cluster, you can add new nodes to the Hadoop3 cluster and then copy the data over so that replicas of it are stored on those new nodes.
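As background, you can inspect where HDFS currently places the replicas of a given file with fsck; a minimal check, where /data/example.txt is a hypothetical path:

     hdfs fsck /data/example.txt -files -blocks -locations

The output lists each block of the file together with the DataNodes that hold its replicas.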

A common approach works as follows:

1. Add a new node
First, add a new node to the Hadoop3 cluster. A new DataNode is admitted by listing its hostname in the cluster's include file (the file referenced by dfs.hosts, if configured) and then telling the NameNode to re-read that list:

     hdfs dfsadmin -refreshNodes

The command above makes the NameNode refresh its set of allowed DataNodes, so the new node can register with the cluster.
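A minimal sketch of the whole step, assuming the include file lives at /etc/hadoop/conf/dfs.hosts and the new host is named hadoop3-dn4 (both names are assumptions):

     # Append the new DataNode's hostname to the include file (path is an assumption)
     echo "hadoop3-dn4" >> /etc/hadoop/conf/dfs.hosts

     # Ask the NameNode to re-read the include/exclude host lists
     hdfs dfsadmin -refreshNodes

     # Confirm the new node appears in the cluster report
     hdfs dfsadmin -report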
2. Update the node configuration
Then, the HDFS configuration of the Hadoop3 cluster needs to be updated so that the IP or hostname of the new node appears in its DataNode list (the workers file, plus the dfs.hosts include file if one is configured). You can look up where a configuration value points with the following command:

     hdfs getconf -confKey dfs.hosts

The command above prints the value of the given configuration key; here it returns the path of the include file that lists the allowed DataNodes. Open that file and check that the new node is listed.
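For reference, the include file is wired up through a property in hdfs-site.xml on the NameNode; a hedged sketch, where the file path is an assumption:

     <!-- hdfs-site.xml on the Hadoop3 NameNode; the path is an assumption -->
     <property>
       <name>dfs.hosts</name>
       <value>/etc/hadoop/conf/dfs.hosts</value>
     </property>

If dfs.hosts is left unset, the NameNode accepts any DataNode that connects, and this step reduces to adding the host to the workers file used by the start-up scripts.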
    
3. Start the Hadoop3 DataNode
Next, start the DataNode daemon on the new node in the Hadoop3 cluster. Once the DataNode is up and registered, the Hadoop3 cluster can place replicas on it, and you can copy the files from Hadoop2 into the Hadoop3 cluster to create those replicas.
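On the new node itself, the daemon can be started with the Hadoop 3 command-line syntax (the hostname in the check below is an assumption):

     # Run on the new node: start the DataNode daemon in the background
     hdfs --daemon start datanode

     # Then, from any node, verify that it has registered with the NameNode
     hdfs dfsadmin -report | grep -A 2 hadoop3-dn4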

4. Copy the data
Finally, copy the data from Hadoop2 to the Hadoop3 cluster so that replicas are created on its DataNodes. Use the following command to copy the files to the Hadoop3 cluster:

     hadoop distcp hdfs://hadoop2/<path to data> hdfs://hadoop3/<new data directory>
    

The command above creates the target directory in Hadoop3 if it does not already exist and copies the data from Hadoop2 into it; HDFS then replicates the copied files across the Hadoop3 DataNodes according to the cluster's replication factor.
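In practice the copy is usually run with a few extra DistCp options; a hedged example, where the NameNode port 8020 is an assumption and the bracketed paths are placeholders:

     # -update skips files already present with the same size and checksum,
     # -p preserves file attributes, and -m caps the number of parallel map tasks
     hadoop distcp -update -p -m 20 \
         hdfs://hadoop2:8020/<path to data> \
         hdfs://hadoop3:8020/<new data directory>

If the RPC versions of the two clusters turn out to be incompatible, a common workaround is to read the source over webhdfs:// instead of hdfs://.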

Using the method described above, you can copy files from Hadoop2 into the Hadoop3 cluster, and you can keep adding nodes to the Hadoop3 cluster for better performance and reliability.
