1. Changes to nutch-site.xml do not require re-running ant; ycs's claim that they do is incorrect.
2. In nutch-site.xml, take the following property:
<property>
<name>http.agent.name</name>
<value>Mozilla/5.0 (Windows NT 6.1; rv:20.0) Gecko/20100101 Firefox/20.0</value>
<description>HTTP 'User-Agent' request header. MUST NOT be empty -
please set this to a single word uniquely related to your organization.
NOTE: You should also check other related properties:
http.robots.agents
http.agent.description
http.agent.url
http.agent.email
http.agent.version
and set their values appropriately.
</description>
</property>
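Here the opening <value> tag, the value text, and the closing </value> tag must all be on the same line. Otherwise, fetching www.amazon.cn and www.vancl.com returns nothing, which is very strange behavior.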
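The post does not explain why this happens. One plausible reading, which is only an assumption here, is that line breaks inside <value> are kept as part of the string and end up in the User-Agent header. The following minimal sketch checks a configuration for property values that span more than one line. The function name find_multiline_values and the sample configuration are made up for illustration and are not part of Nutch.

```python
import xml.etree.ElementTree as ET


def find_multiline_values(xml_text):
    """Return the names of properties whose <value> text contains a line break.

    Hypothetical helper for illustration; it is not part of Nutch.
    """
    root = ET.fromstring(xml_text)
    bad = []
    for prop in root.iter("property"):
        name = prop.findtext("name", default="")
        value = prop.findtext("value", default="")
        # A line break anywhere in the value means the tags and the text
        # are not all on one line.
        if "\n" in value or "\r" in value:
            bad.append(name.strip())
    return bad


# Sample configuration (made up): the first value is split across lines,
# the second one is on a single line.
sample = """<configuration>
<property>
<name>http.agent.name</name>
<value>
Mozilla/5.0 (Windows NT 6.1; rv:20.0) Gecko/20100101 Firefox/20.0
</value>
</property>
<property>
<name>http.agent.version</name>
<value>1.0</value>
</property>
</configuration>"""

print(find_multiline_values(sample))  # ['http.agent.name']
```

To check a real file, read its contents and pass them to the function, for example open("conf/nutch-site.xml").read(), where the path is an assumption about the local Nutch layout.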
Reposted from john-doe.iteye.com/blog/1860070