Install R and RHadoop on Cloudera Hadoop CDH (rhdfs/rmr2/rhbase/RHive)

Retrieved from: http://www.geedoo.info/installed-on-the-cloudera-hadoop-cdh-r-and-rhadoop-rhdfs-rmr2-rhbase-rhive.html

Preface: RHadoop is an open source project initiated by Revolution Analytics that combines the statistical language R with Hadoop. At present, the project includes three R packages: rmr, which supports writing MapReduce applications in R; rhdfs, which lets R access HDFS; and rhbase, which lets R access HBase.

1. System and required software versions

Server OS: CentOS 6.3

R language version: R-2.15.3 (I first tried the latest R-3 release and ran into various incompatibility problems, so I chose the latest R-2 release instead)

Download address: http://ftp.ctex.org/mirrors/CRAN/src/base/R-2/R-2.15.3.tar.gz

Cloudera Hadoop CDH version: 4.4.0

JDK version: 1.6.0_31

Use the cloudera-manager-installer.bin installation package from the free edition of Cloudera Manager to install CDH and the JDK. For details, see CDH installation.

Download address: https://ccp.cloudera.com/display/SUPPORT/Cloudera+Manager+Free+Edition+Download

rJava (an interface between R and Java; it can be installed from CRAN) version: rJava_0.9-5

Download address: http://www.rforge.net/src/contrib/rJava_0.9-5.tar.gz

RHadoop version: the latest official release, from the project address ( https://github.com/RevolutionAnalytics ). Downloads and documentation are at:

Download address: https://github.com/RevolutionAnalytics/RHadoop/wiki/Downloads

Documentation: https://github.com/RevolutionAnalytics/RHadoop/wiki

2. Dependency installation (R language package, rJava package)

Before installing RHadoop, install the R language package and then the rJava package on every host in the cluster. The specific installation steps are as follows:

1. Install the R language package

Before compiling R, you need to install the following programs through yum:

# yum install gcc-gfortran

Otherwise configure reports the error "configure: error: No F77 compiler found".

# yum install gcc gcc-c++

Otherwise configure reports the error "configure: error: C++ preprocessor "/lib/cpp" fails sanity check".

# yum install readline-devel

Otherwise configure reports the error "--with-readline=yes (default) and headers/libs are not available".

# yum install libXt-devel

Otherwise configure reports the error "configure: error: --with-x=yes (default) and X11 headers/libs are not available".
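The four yum commands above can also be combined into a single step. The following is a minimal sketch, not part of the original instructions: the variable R_BUILD_DEPS, the function install_r_build_deps, and the RUN switch are hypothetical names introduced here for illustration. By default it only prints the command, so you can review it before running it as root.

```shell
# Build dependencies named in the steps above, gathered in one list.
# (R_BUILD_DEPS is an illustrative name, not something yum or R defines.)
R_BUILD_DEPS="gcc-gfortran gcc gcc-c++ readline-devel libXt-devel"

# Print the combined command by default; set RUN=1 to execute it as root.
install_r_build_deps() {
    if [ "${RUN:-0}" = "1" ]; then
        # -y answers "yes" to yum's confirmation prompt.
        # $R_BUILD_DEPS is left unquoted on purpose so it splits into package names.
        yum install -y $R_BUILD_DEPS
    else
        echo "yum install -y $R_BUILD_DEPS"
    fi
}

install_r_build_deps
```

Saved as a script, it would be run with RUN=1 on each host once the printed command looks right.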

Then download the source code and compile it:

# wget http://cran.rstudio.com/src/base/R-2/R-2.15.3.tar.gz

# tar -zxvf R-2.15.3.tar.gz

# cd R-2.15.3

# ./configure --prefix=/usr --disable-nls --enable-R-shlib

(The last two options, --disable-nls and --enable-R-shlib, are needed for the RHive installation; if you do not install RHive you can omit them.)

# make

# make install
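After make install, it may be worth checking that R is on the PATH and that the shared library requested by --enable-R-shlib was built. The sketch below is an assumption-laden check rather than part of the original procedure: has_libr is a hypothetical helper, and it assumes the library is at lib/libR.so under the directory reported by R RHOME.

```shell
# Succeed if the given R home directory contains the shared library
# that --enable-R-shlib is expected to produce.
# (has_libr is an illustrative helper, not an R or RHadoop command.)
has_libr() {
    [ -f "$1/lib/libR.so" ]
}

if command -v R >/dev/null 2>&1; then
    # First line of the version banner, e.g. to confirm the 2.15.3 build.
    R --version | head -n 1
    # "R RHOME" prints the R home directory of the installed build.
    if has_libr "$(R RHOME)"; then
        echo "libR.so found"
    else
        echo "libR.so missing: re-run configure with --enable-R-shlib"
    fi
else
    echo "R is not on PATH"
fi
```

If the library is missing and you plan to install RHive, rerun configure with the options shown above, then make and make install again.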
