DKhadoop: a detailed explanation of the Hadoop big data platform architecture

The era of big data has arrived, and the explosive growth of information means that more and more industries face the challenge of storing and analyzing very large volumes of data. Hadoop, an open source distributed parallel processing platform, has become increasingly popular thanks to its scalability, efficiency and reliability. That popularity has in turn led to commercial distributions of Hadoop. Here I will use DKhadoop to introduce the architecture of a Hadoop big data platform in detail.

At present, Dakuai DKhadoop is not the only domestic commercial distribution of Hadoop; there are others, such as Huawei Cloud. Although the vendors differ, the platform architectures are similar. Here I will introduce DKhadoop, the one I am familiar with.

1. Dakuai DKhadoop integrates the components of the entire Hadoop ecosystem, deeply optimizes them, and recompiles them into a complete, higher-performance general-purpose big data computing platform in which the components work together. As a result, DKhadoop (DKH) offers a considerable improvement in computing performance over the open source big data platform. This is also where I personally think DKhadoop is better than another commercial distribution I used before. Most domestic commercial distributions of Hadoop are essentially repackaging; what DKhadoop does well is that it dares to develop further on top of the original ecosystem.

2. Dakuai DKhadoop's middleware technology reduces big data cluster configuration to three types of nodes, which simplifies cluster management and operation and also improves the availability and stability of the cluster. The DKhadoop middleware integrates many Apache components and supports data sources ranging from files, SQL, logs and messages to crawlers, streaming data and heterogeneous data. It also integrates fast compression algorithms and data synchronization and distribution technology, so that data can be imported while reducing data movement. This is a significant technical advantage for projects with real-time data requirements.

3. The Dakuai DKhadoop commercial distribution keeps the advantages of the open source system and is 100% compatible with it. Big data applications developed on open source platforms can run efficiently on DKhadoop without modification.

4. The DKhadoop integrated development framework provides more than 20 classes commonly used in big data, search, natural language processing and artificial intelligence development, with more than 100 methods in total, which greatly improves development efficiency. DK.HADOOP integrates a NoSQL database, which simplifies programming between the file system and the non-relational database. DK.HADOOP also improves the cluster synchronization system, making Hadoop data processing more efficient.

5. The SQL version of DKhadoop also integrates distributed MySQL, so that traditional information systems can move to big data and distributed processing seamlessly.

6. ES: The search system of DKhadoop, DK.ES, is developed on top of the open source ES system and supports full-text search. It is a high-performance version that adds effective support for Chinese search and for Dakuai's data synchronization technology. DK.ES is one of the core components of DKhadoop.

7. Chinese language processing component: Dakuai's Chinese language processing component is an open source natural language processing development kit, and is said to be the most widely used of its kind in China at present.
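
To make the idea of Chinese full-text search in items 6 and 7 more concrete, here is a minimal sketch. It is not DK.ES or the Dakuai language processing kit, whose internals are not described here. It assumes a simple, generic approach instead: Chinese text has no spaces between words, so each document is split into overlapping two-character tokens (bigrams), and an inverted index maps each token to the documents that contain it. The function names `bigrams`, `build_index` and `search`, and the sample documents, are hypothetical and exist only for this illustration.

```python
from collections import defaultdict


def bigrams(text):
    """Split text into overlapping two-character tokens, ignoring whitespace."""
    chars = [c for c in text if not c.isspace()]
    if len(chars) < 2:
        # A single character is kept as its own token; empty input gives no tokens.
        return ["".join(chars)] if chars else []
    return [chars[i] + chars[i + 1] for i in range(len(chars) - 1)]


def build_index(docs):
    """Build an inverted index: token -> set of ids of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in bigrams(text):
            index[token].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents that contain every bigram of the query."""
    tokens = bigrams(query)
    if not tokens:
        return set()
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result


# Hypothetical sample documents, for illustration only.
docs = {
    1: "大数据平台架构",
    2: "分布式数据库",
    3: "hadoop 大数据",
}
index = build_index(docs)

print(search(index, "大数据"))
print(search(index, "数据库"))
```

Matching on bigrams avoids the need for a dictionary-based word segmenter, at the cost of a larger index and occasional false positives, since the query's bigrams may appear in a document without being adjacent. A production search system would normally add proper word segmentation and relevance ranking on top of this.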

This is only a brief introduction. If you want to know more, you can search for it or download the DKhadoop learning version. Here is an overview of the DKhadoop versions:

DKH Standard Edition, DKH-Distributed SQL Edition, DK.HADOOP Distribution

The DKH Standard Edition has three sub-versions: a stand-alone version for development and debugging, a learning version that supports three nodes, and a standard server version that supports more than 5 nodes.

The DKH-Distributed SQL Edition has two sub-versions: a learning version and a server version.
