The building blocks of the big data technology architecture

Big data is not a new invention; it has only become truly popular in recent years. This is due to the rapid development of Internet technology: the network has changed the world and everyday life, and the application of big data technology has made those changes even more profound.

Anyone who follows big data or Internet news knows that big data has been elevated to the level of national strategy, which can fairly be called an inevitable trend of the times. When promoting the popularization and application of big data technology at the national strategic level, one crucial, core issue stands out: data security. To solve the data security problem, we must return to the frameworks on which big data development is built.

Big data development in China started later than abroad, and domestic standards and practices largely follow those established overseas. Most commercial releases launched by domestic companies or institutions are secondary packaging of open-source projects; very few are engaged in underlying, from-scratch big data development. Among vendors doing original development and shipping a commercial release, the industry names mainly Da Kuai Search, and more original domestic development may emerge in the next three to five years.

The main reason big data adoption is not higher is that its application and development lean heavily toward the underlying layers: the learning curve is unusually steep, and the range of technologies involved is too broad for most practitioners to master. Most domestic Hadoop distributions on the market are simply foreign releases that have been repackaged and modified. DKhadoop, by contrast, encapsulates common, reusable base code and algorithms in big data development into class libraries, which greatly reduces development difficulty. Anyone who works in development will find this easy to appreciate.
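To illustrate the class-library idea in general terms (the names below are hypothetical and are not DKhadoop's actual API), a recurring routine such as a word count can be wrapped once and then reused, so application code shrinks to a couple of calls:

```python
from collections import Counter

class WordCountLib:
    """Hypothetical reusable class library wrapping a common big data routine.

    Encapsulating boilerplate such as tokenization and aggregation once
    means application code only supplies the input and reads the result.
    """

    @staticmethod
    def word_count(lines):
        """Count word frequencies across an iterable of text lines."""
        counts = Counter()
        for line in lines:
            counts.update(line.lower().split())
        return counts

    @staticmethod
    def top_k(counts, k):
        """Return the k most frequent (word, count) pairs."""
        return counts.most_common(k)

# Application code is reduced to two calls against the library:
counts = WordCountLib.word_count(["big data big", "data frameworks"])
print(WordCountLib.top_k(counts, 2))  # [('big', 2), ('data', 2)]
```

The design point is that the per-job code no longer repeats tokenization or aggregation logic; a real distribution applies the same principle at a much larger scale.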

Below is an introduction to the module components of Dakuai's big data development framework.

Dakuai's integrated big data development framework consists of six parts: the data source and SQL engine, the data acquisition (custom crawler) module, the data processing module, the machine learning algorithms, the natural language processing module, and the search engine module.

When the Dakuai development framework is deployed on top of the open-source big data stack, each module requires the following supporting components:

Data source and SQL engine: DK.Hadoop, Spark, Hive, Sqoop, Flume, Kafka

Data collection: DK.Hadoop

Data processing module: DK.Hadoop, Spark, Storm, Hive

Machine learning and AI: DK.Hadoop, Spark

NLP module: supported directly by uploading the server-side JAR package

Search engine module: not released as a standalone component
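The module-to-component mapping above can be pictured as a single pipeline. The sketch below is plain Python with stub functions standing in for the real components (ingestion tools such as Flume/Kafka, processing engines such as Spark/Storm/Hive, and a search index); it only shows how the stages hand data to one another, not any vendor API:

```python
def acquire(source):
    """Data acquisition stage: in production this role is played by a
    crawler or by ingestion tools such as Flume/Kafka (stubbed here)."""
    return [record.strip() for record in source if record.strip()]

def process(records):
    """Data processing stage: cleaning and transforming records,
    the role Spark/Storm/Hive play in the real stack."""
    return [r.lower() for r in records]

def extract_features(records):
    """Machine learning / NLP stage: reduced here to a toy
    token count per record."""
    return [(r, len(r.split())) for r in records]

def index(features):
    """Search engine stage: build an inverted index from token to records."""
    inverted = {}
    for text, _ in features:
        for token in text.split():
            inverted.setdefault(token, set()).add(text)
    return inverted

# Wiring the stages together, mirroring the framework's module layout:
raw = ["  Big Data ", "", "Search Engine"]
idx = index(extract_features(process(acquire(raw))))
print(sorted(idx))  # ['big', 'data', 'engine', 'search']
```

Each stage consumes the previous stage's output and nothing else, which is why the modules in the table above can each be backed by different underlying components.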
