Tutorial: Setting up a Hadoop environment on CentOS 7

Setting up a Hadoop environment on CentOS 7 is a common task. Here is a simple tutorial:

  1. Install Java:
    Hadoop is written in Java, so a JDK must be installed first. You can install Java on CentOS 7 by following these steps:

    • Download the Java JDK (Java Development Kit) tarball for Linux.
    • Extract the tarball and install it to a directory of your choice.
    • Configure the Java environment variable (JAVA_HOME).
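
The Java steps above can be sketched as a small shell helper. This is only an illustration under stated assumptions: install_jdk is a hypothetical function name, the tarball is assumed to contain a single top-level directory, and the paths in the usage comment are placeholders rather than values from this tutorial.

```shell
# Sketch: extract a JDK tarball and print the directory to use as JAVA_HOME.
# install_jdk is a hypothetical helper, not part of any JDK or Hadoop tooling.
install_jdk() {
    local tarball="$1"   # path to a previously downloaded JDK .tar.gz
    local dest="$2"      # installation directory of your choice
    local top

    mkdir -p "$dest"
    # Assumption: the first archive entry names the single top-level directory.
    top="$(tar -tzf "$tarball" | awk -F/ 'NR == 1 { print $1 }')"
    tar -xzf "$tarball" -C "$dest"
    echo "$dest/$top"
}

# Usage (placeholder paths, not run here):
#   JAVA_HOME="$(install_jdk /path/to/jdk.tar.gz /path/to/java-dir)"
#   export JAVA_HOME
#   export PATH="$JAVA_HOME/bin:$PATH"
```

Afterwards, java -version should report the JDK that was just extracted.
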
  2. Download and extract Hadoop:

    • Visit the official Hadoop website and download the binary tarball of a current stable release. The same tarball is used on every Linux distribution; there is no CentOS-specific build.
    • Extract the Hadoop tarball to a directory of your choice.
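
As a rough sketch of the download step, the helpers below build the tarball URL on the Apache archive server and extract the release. The function names are hypothetical, and version 3.3.6 in the usage comment is only an example, not a recommendation from this tutorial; pick the release you actually need.

```shell
# Sketch: build the Apache archive URL for a given Hadoop release.
# hadoop_tarball_url and fetch_hadoop are hypothetical helper names.
hadoop_tarball_url() {
    local version="$1"
    echo "https://archive.apache.org/dist/hadoop/common/hadoop-${version}/hadoop-${version}.tar.gz"
}

# Download the release into the current directory and extract it under dest.
fetch_hadoop() {
    local version="$1"
    local dest="$2"

    mkdir -p "$dest"
    # -f: fail on HTTP errors, -L: follow redirects, -O: keep the remote file name
    curl -fLO "$(hadoop_tarball_url "$version")"
    tar -xzf "hadoop-${version}.tar.gz" -C "$dest"
}

# Usage (example version and placeholder path, not run here):
#   fetch_hadoop 3.3.6 /path/to/install-dir
```
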
  3. Configure Hadoop environment variables:

    • Open the ~/.bashrc file and add the following lines:

      export HADOOP_HOME=/path/to/hadoop
      export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
    • Run the following command so that the new environment variables take effect in the current shell:

      source ~/.bashrc
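
If you prefer to script the ~/.bashrc edit instead of doing it by hand, a minimal sketch could look like this. add_hadoop_env is a hypothetical helper; it appends the two export lines only when no HADOOP_HOME export is present yet, so running it twice does not duplicate them.

```shell
# Sketch: append the Hadoop exports to a profile file exactly once.
# add_hadoop_env is a hypothetical helper name.
add_hadoop_env() {
    local profile="$1"       # e.g. ~/.bashrc
    local hadoop_home="$2"   # directory where Hadoop was extracted

    # Skip if an earlier run already added HADOOP_HOME.
    if grep -q '^export HADOOP_HOME=' "$profile" 2>/dev/null; then
        return 0
    fi
    {
        echo "export HADOOP_HOME=$hadoop_home"
        # Single quotes keep $PATH and $HADOOP_HOME unexpanded in the file.
        echo 'export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin'
    } >> "$profile"
}

# Usage (placeholder path, not run here):
#   add_hadoop_env ~/.bashrc /path/to/hadoop
#   source ~/.bashrc
```

After sourcing the file, running hadoop version is a quick way to check that the PATH entry works.
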
  4. Configure the Hadoop cluster:

    • Enter the Hadoop configuration directory: cd $HADOOP_HOME/etc/hadoop
    • Edit the hadoop-env.sh file and set JAVA_HOME to your Java installation path:

      export JAVA_HOME=/path/to/java
    • Edit the core-site.xml file to configure the core settings of Hadoop, such as the default file system URI and its port (fs.defaultFS).
    • Edit the hdfs-site.xml file to configure the Hadoop Distributed File System (HDFS), such as the data directories and the number of replicas (dfs.replication).
    • Edit the mapred-site.xml file to configure MapReduce, such as the framework that runs the jobs (mapreduce.framework.name, usually yarn).
    • Edit the yarn-site.xml file to configure YARN, such as the NodeManager services and resource allocation.
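
To make the four configuration files more concrete, here is a minimal sketch for a single-node (pseudo-distributed) setup. The helper names are hypothetical, and the values (hdfs://localhost:9000, a replication factor of 1) are assumptions that suit one test machine only; a real cluster needs more properties and different values. The helper overwrites the target files, so back up any existing configuration first.

```shell
# Sketch: write a Hadoop *-site.xml file containing a single property.
# write_site_xml and write_pseudo_distributed_conf are hypothetical helpers.
write_site_xml() {
    local file="$1"
    local name="$2"
    local value="$3"

    cat > "$file" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>$name</name>
    <value>$value</value>
  </property>
</configuration>
EOF
}

# Write minimal single-node settings into the given configuration directory.
write_pseudo_distributed_conf() {
    local conf_dir="$1"

    # Default file system: HDFS on this machine (port 9000 is an assumption).
    write_site_xml "$conf_dir/core-site.xml" fs.defaultFS hdfs://localhost:9000
    # One replica is enough when there is only one DataNode.
    write_site_xml "$conf_dir/hdfs-site.xml" dfs.replication 1
    # Run MapReduce jobs on YARN.
    write_site_xml "$conf_dir/mapred-site.xml" mapreduce.framework.name yarn
    # Shuffle service that MapReduce needs on each NodeManager.
    write_site_xml "$conf_dir/yarn-site.xml" yarn.nodemanager.aux-services mapreduce_shuffle
}

# Usage (not run here):
#   write_pseudo_distributed_conf "$HADOOP_HOME/etc/hadoop"
```
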
  5. Start the Hadoop cluster:

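    • Format HDFS (only before the first start; reformatting an existing NameNode erases its metadata): hdfs namenode -format
    • Start HDFS (the script launches the daemons over SSH, so passwordless SSH to localhost must already work): start-dfs.sh
    • Start YARN: start-yarn.sh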
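
After the start scripts return, the JDK's jps tool lists the running Java processes. The sketch below reads jps output on standard input and prints the daemons that are missing. missing_daemons is a hypothetical helper, and the list of expected daemons assumes a single-node setup where HDFS and YARN run on the same machine.

```shell
# Sketch: report which expected Hadoop daemons are absent from jps output.
# missing_daemons is a hypothetical helper; it reads jps output on stdin.
missing_daemons() {
    local jps_output
    local daemon

    jps_output="$(cat)"
    for daemon in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
        # jps prints "<pid> <name>" per line; compare the name as a whole line
        # so that NameNode does not match SecondaryNameNode.
        if ! printf '%s\n' "$jps_output" | awk '{ print $2 }' | grep -qx "$daemon"; then
            echo "$daemon"
        fi
    done
}

# Usage (not run here):
#   jps | missing_daemons
```

No output means all five daemons were found. Any name printed points at a daemon whose log under $HADOOP_HOME/logs is worth checking.
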
  6. Verify the Hadoop cluster:

    • Open a web browser and visit the YARN ResourceManager web UI at http://localhost:8088 to confirm that the ResourceManager is running.
    • Check the status of HDFS: hdfs dfsadmin -report
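
The verification can also be scripted. The sketch below extracts the number of live DataNodes from the dfsadmin report. live_datanodes is a hypothetical helper, and it assumes the report contains a line of the form "Live datanodes (N):", which may differ between Hadoop versions.

```shell
# Sketch: print the live DataNode count found in "hdfs dfsadmin -report" output.
# live_datanodes is a hypothetical helper; it reads the report on stdin.
live_datanodes() {
    # Split on parentheses so that the second field is the number N.
    # Prints 0 when no matching line is found.
    awk -F'[()]' '
        /^Live datanodes/ { print $2; found = 1; exit }
        END { if (!found) print 0 }
    '
}

# Usage (not run here):
#   hdfs dfsadmin -report | live_datanodes
#   curl -s http://localhost:8088/ws/v1/cluster/info   # ResourceManager REST API
```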

These are the basic steps for setting up a Hadoop environment on CentOS 7. Depending on your needs and environment, additional configuration and tuning may be required. Before exposing the cluster on a network, make sure you understand your network environment and security requirements, and take appropriate security measures.

Origin blog.csdn.net/tiansyun/article/details/131954185