Hadoop ecosystem

1. Bridging fluentd to fluentd on Red Hat Linux http://www.it165.net/os/html/201209/3499.html
2. Fluentd, a log collection system http://www.linuxdiyf.com/viewarticle.php?id=371318

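3. Saiku
  3-1. Saiku is a modular analytics suite. It provides a RESTful server, JackRabbit-based workspace management, OAuth security management, and data exchange in JSON or XML. Its goal is to make embedding data analysis in your application as simple as possible. It also ships a jQuery-based front end to ease installation and running (http://www.open-open.com/open305825.htm)
  3-2. Saiku is a modular open-source analytics suite that provides lightweight OLAP (online analytical processing) and is embeddable, extensible, and configurable (http://www.oschina.net/p/saiku)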


4. Operations tools: http://www.opstool.com/category/35

5. Transwarp Data Hub (TDH), billed as the first one-stop big data platform in China to fully support Spark and Hadoop 2
   http://www.transwarp.io/product.html
6. ActiveMQ: a very good reference
   http://shensy.iteye.com/blog/1752529


7. HBase rowkey design (three-dimensional ordering)
http://ronxin999.blog.163.com/blog/static/4221792020130109202973?suggestedreading
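
As a rough illustration of what "three-dimensional ordering" can mean for rowkey design, here is a minimal Python sketch. It assumes the phrase refers to the order in which HBase stores cells: by row key, then by column (family and qualifier), then by timestamp with the newest version first. The `Cell` type, the `scan_order` helper, and the sample data are hypothetical and only model that ordering; nothing here talks to a real HBase cluster.

```python
from typing import List, NamedTuple


class Cell(NamedTuple):
    """Hypothetical stand-in for one stored cell (not an HBase API type)."""
    row: bytes
    family: bytes
    qualifier: bytes
    timestamp: int
    value: bytes


def scan_order(cells: List[Cell]) -> List[Cell]:
    """Sort cells by row, then column, then timestamp descending.

    Python compares bytes lexicographically by unsigned byte value,
    which is the same way row keys are compared in this model.
    """
    return sorted(
        cells,
        key=lambda c: (c.row, c.family, c.qualifier, -c.timestamp),
    )


cells = [
    Cell(b"user2", b"d", b"name", 100, b"a"),
    Cell(b"user10", b"d", b"name", 100, b"b"),
    Cell(b"user10", b"d", b"age", 100, b"c"),
    Cell(b"user10", b"d", b"name", 200, b"d"),
]

for cell in scan_order(cells):
    print(cell.row, cell.family, cell.qualifier, cell.timestamp, cell.value)
```

In this model `user10` sorts before `user2` because the comparison is byte by byte, which is one reason rowkeys with numeric parts are often zero-padded to a fixed width.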

8. hadoop-lzo
https://github.com/twitter/hadoop-lzo/releases
https://github.com/google/snappy

9. Installing Snappy for Hadoop/HBase

http://blog.csdn.net/lixucpf/article/details/18314409


10. About major_compact
Here is some help for this command:
          Run major compaction on passed table or pass a region row
          to major compact an individual region. To compact a single
          column family within a region specify the region name
          followed by the column family name.
          Examples:
          Compact all regions in a table:
          hbase> major_compact 't1'
          Compact an entire region:
          hbase> major_compact 'r1'
          Compact a single column family within a region:
          hbase> major_compact 'r1', 'c1'
          Compact a single column family within a table:
          hbase> major_compact 't1', 'c1'




11. Tokenizers
   11.1 ansj


/////////////////////////////////////
Web crawler link
http://blog.chinaunix.net/uid-22414998-id-3774291.html

Hadoop 2.0 related
1. Installation and configuration http://www.kankanews.com/ICkengine/archives/86515.shtml

Solr and Hadoop
1. Integrating Solr 4.4 with Hadoop 2.1.0-beta http://www.kankanews.com/ICkengine/archives/42064.shtml
In-depth analysis of SolrCloud (part 4): SolrCloud and replication http://www.blogjava.net/wangxinsh55/archive/2012/07/04/382220.html
http://wiki.apache.org/solr/SolrCloud

1. Word segmentation
https://github.com/ansjsun/ansj_seg/

2. Importing data into Solr
http://wiki.apache.org/solr/DataImportHandler#Configuration_in_data-config.xml
3. JVM tuning
Avoiding Full GC in HBase with Arena Allocation http://kenwublog.com/avoid-full-gc-in-hbase-using-arena-allocation
4. hbase-indexer
https://github.com/NGDATA/hbase-indexer/wiki/Requirements
https://github.com/NGDATA/hbase-indexer/wiki/Installation


Reposted from sealbird.iteye.com/blog/2032714