1、HDFS
In order to 分布式存储
provide a file system, it is
optimized for storing large-size files without random reading and writing of files on HDFS. The file is
directly used. The
data model is inflexible
. The file system and processing framework are
optimized.
2、HBase
Provide tabular 面向列
data storage
. Optimize the random read and write of tabular data.
Use key-vale
operational data.
Provide a flexible data model.
Use tabular storage, support MapReduce, and rely on HDFS to
optimize multiple reads, and multiple writes
are mainly used to store non- Structured data and semi-structured data
3、MySQL
Traditional relational database
focuses on relationships and
supports transactions
4 、 Redis
Distributed caching is
based on 内存
,
emphasizing caching,
supporting data persistence,
supporting transaction operations,
NoSql type Key/value database,
supporting rich types such as List and Set
5、Hive
Hive 基于Hadoop的数据仓库工具
can map structured data files to database tables
and provide SQL functions. SQL can be converted into mr tasks to run
SQL. The learning cost is low, and there is no need to develop mr applications.
十分适合数据仓库的统计分析