Greenplum + PostgreSQL and Hadoop + Hive + HBase

The architecture used in the project combines two stacks: Greenplum + PostgreSQL, and Hadoop + Hive + HBase.

 

A. Hadoop + Hive
Supports adding new nodes, with no restart needed during the process.
Supports JDBC access to the Hive database.
Supports SQL to obtain data.
Data is processed in batches. If the Tez engine is set up in the Hadoop cluster, computation performance improves considerably.
Ad hoc queries need auxiliary components such as Drill or Impala.
The cluster can scale to tens of thousands of nodes.
Disaster tolerance is supported.
There are few visualization tools; the commonly used ones are Hue, Zeppelin, etc.
Can be fully integrated with the other components of the current Hadoop ecosystem, and there are many flexible options.
The most robust open source ecosystem.
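
The JDBC access and the Tez setting mentioned above can be sketched roughly as follows. This is a minimal sketch, not the project's actual code: the class name `HiveJdbcSketch`, the helper names, and the host, port, and database values are assumptions made for illustration. It also assumes the Hive JDBC driver is on the classpath and that Tez is installed on the cluster.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

// Hypothetical helper class; names and values are illustrative only.
class HiveJdbcSketch {
    // Session-level switch to the Tez engine; only effective if Tez is installed.
    static final String USE_TEZ = "SET hive.execution.engine=tez";

    // HiveServer2 JDBC URLs have the form jdbc:hive2://<host>:<port>/<database>.
    static String hiveUrl(String host, int port, String database) {
        return "jdbc:hive2://" + host + ":" + port + "/" + database;
    }

    // Runs one SQL query over JDBC and returns the number of rows read.
    static int countRows(String url, String user, String password, String sql)
            throws SQLException {
        try (Connection conn = DriverManager.getConnection(url, user, password);
             Statement stmt = conn.createStatement()) {
            stmt.execute(USE_TEZ);
            try (ResultSet rs = stmt.executeQuery(sql)) {
                int rows = 0;
                while (rs.next()) {
                    rows++;
                }
                return rows;
            }
        }
    }

    public static void main(String[] args) {
        // Only builds and prints the URL; no cluster is contacted here.
        System.out.println(hiveUrl("localhost", 10000, "default"));
    }
}
```

Calling `countRows` requires a reachable HiveServer2 instance; the `main` method only prints the connection URL so the sketch runs without a cluster.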

B. Greenplum + PostgreSQL
Supports adding new nodes, but a restart is required during the process.
The cluster scale rarely reaches the thousands; it is generally dozens or hundreds of nodes.
Supports JDBC access to the database.
Supports SQL to obtain data.
Supports ad hoc queries.
Integrates well with current traditional BI tools.
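
Because Greenplum speaks the PostgreSQL wire protocol, the JDBC and ad hoc query points above can be illustrated with the standard PostgreSQL JDBC URL format. The following is a minimal sketch under stated assumptions: the class name `GreenplumJdbcSketch`, the helper names, and the host, port, and database values are hypothetical, and the PostgreSQL JDBC driver is assumed to be on the classpath.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class; names and values are illustrative only.
class GreenplumJdbcSketch {
    // PostgreSQL-style JDBC URL: jdbc:postgresql://<host>:<port>/<database>.
    static String greenplumUrl(String host, int port, String database) {
        return "jdbc:postgresql://" + host + ":" + port + "/" + database;
    }

    // Runs an ad hoc query with one bound parameter and collects the
    // first column of every result row as a string.
    static List<String> adHocQuery(String url, String user, String password,
                                   String sql, Object param) throws SQLException {
        List<String> values = new ArrayList<>();
        try (Connection conn = DriverManager.getConnection(url, user, password);
             PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setObject(1, param);
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    values.add(rs.getString(1));
                }
            }
        }
        return values;
    }

    public static void main(String[] args) {
        // Only builds and prints the URL; no database is contacted here.
        System.out.println(greenplumUrl("localhost", 5432, "analytics"));
    }
}
```

The same URL is what a traditional BI tool would typically be configured with when it connects through a PostgreSQL-compatible driver.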
