Nutch 2.3 distributed setup

First, set up a Hadoop cluster. See the earlier article:

https://note.youdao.com/share/?token=6B7AD80F6F904C1982B92E03C61B637C&gid=30499526

Next, copy the following files from /hadoop/etc/hadoop into /nutch-2.3.1/conf:

core-site.xml
hadoop-env.sh
hbase-site.xml
hdfs-site.xml
masters (create it if it does not exist, and put the HMaster address in it)
slaves
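
The copy step above can be scripted. The following is only a minimal sketch: the function name copy_hadoop_conf is made up for illustration, and the two directories are passed in as arguments rather than hard-coded. Missing files are skipped with a warning instead of aborting, since masters may not exist yet.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Nutch or Hadoop): copy the config files
# listed above from a Hadoop conf directory into the Nutch conf directory.
copy_hadoop_conf() {
  local src="$1" dst="$2" f
  mkdir -p "$dst"
  for f in core-site.xml hadoop-env.sh hbase-site.xml hdfs-site.xml masters slaves; do
    if [ -f "$src/$f" ]; then
      cp "$src/$f" "$dst/"
    else
      # Do not fail: the file may simply not have been created yet.
      echo "skipped (not found): $src/$f" >&2
    fi
  done
}
```

With the paths used in this post, the call would be: copy_hadoop_conf /hadoop/etc/hadoop /nutch-2.3.1/conf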

Then copy the *.jar files under Hbase/lib into hadoop/share/hadoop/mapreduce
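
A minimal sketch of that jar copy, assuming the goal is to put the HBase classes where Hadoop's MapReduce jobs can load them. The function name copy_hbase_jars is hypothetical, and both directories are arguments because the post gives them only as relative paths.

```shell
#!/usr/bin/env bash
# Hypothetical helper: copy every *.jar from an HBase lib directory into
# Hadoop's MapReduce library directory and report how many were copied.
copy_hbase_jars() {
  local hbase_lib="$1" mr_lib="$2" jar count=0
  for jar in "$hbase_lib"/*.jar; do
    [ -e "$jar" ] || continue   # the glob matched nothing
    cp "$jar" "$mr_lib/"
    count=$((count + 1))
  done
  echo "copied $count jar(s)"
}
```

For example: copy_hbase_jars Hbase/lib hadoop/share/hadoop/mapreduce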

vim /nutch-2.3.1/conf/nutch-site.xml

Add the following property:

<property>
   <name>plugin.folders</name>
   <value>/opt/apache-nutch-2.3.1/build/plugins</value>
</property>

Then copy Nutch to the other machines
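
One way to do the copy is with scp, once per worker. The sketch below only prints the commands so they can be reviewed first. It assumes the worker hostnames are listed one per line in a slaves-style file and that Nutch should land at the same path on every machine (plugin.folders above is an absolute path, so this is probably what you want). The function name print_nutch_copy_cmds is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print one scp command per host in a slaves-style file.
# Blank lines and lines starting with '#' are ignored.
print_nutch_copy_cmds() {
  local nutch_dir="$1" hosts_file="$2" host
  # The "|| [ -n ... ]" part also handles a last line without a trailing newline.
  while IFS= read -r host || [ -n "$host" ]; do
    case "$host" in
      ''|'#'*) continue ;;
    esac
    # Copy into the parent directory so the remote path matches the local one.
    echo "scp -r $nutch_dir $host:$(dirname "$nutch_dir")/"
  done < "$hosts_file"
}
```

For example, print_nutch_copy_cmds /opt/apache-nutch-2.3.1 /opt/apache-nutch-2.3.1/conf/slaves lists the commands; pipe the output to sh to actually run them.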

Problems you may run into:
Container killed on request. Exit code is 143

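It is followed by a message along the lines of "memory 2.7g in 2.1g used".
This means the container used more memory than it was allowed (2.7 GB against a 2.1 GB limit), so raise the memory limits for map and reduce tasks: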

vim hadoop/etc/hadoop/mapred-site.xml

<property>
   <name>mapreduce.map.memory.mb</name>
   <value>4096</value>
</property>
<property>
   <name>mapreduce.reduce.memory.mb</name>
   <value>4096</value>
</property>


Reposted from blog.csdn.net/a511310132/article/details/76142169