2022 Market Research Report on China's Lake-Warehouse Integrated Platform | Aianalysis Report

Abstract
To meet the needs of data applications, big data platform architecture has continued to evolve, passing through two stages: the data warehouse and the data lake. In 2020, the concept of lake-warehouse integration was proposed. The lake-warehouse integrated architecture will become the mainstream architecture for big data platforms because it enables unified management of data assets, reduces data redundancy, and lowers the operation and maintenance complexity of the platform architecture.
According to research by Aianalysis, the software market for China's lake-warehouse integrated platforms is 1.52 billion yuan in 2022. Kejie Technology holds 11.1% of the market, Huawei Cloud 9.5%, and Transwarp Technology 7.3%. Aianalysis predicts that the market will reach nearly 10 billion yuan in 2025, a compound annual growth rate of 86% from 2022 to 2025.

As an advanced architecture for lake-warehouse integration, lake-warehouse fusion has clear advantages in ACID transactions, separation of storage and compute, batch-stream integration, unified metadata management, and more, and will become the mainstream technical route for realizing lake-warehouse integration.
To succeed in the highly competitive lake-warehouse integration market, vendors should focus on building capabilities in cloud native, lake-warehouse fusion, Data Fabric, and DataOps. These capabilities will form the technical barriers of lake-warehouse integrated platform software. In terms of industry layout, vendors should focus on the financial industry while paying attention to high-potential sectors such as industry and transportation.
01 Introduction to Lake-warehouse Integrated Architecture
1.1 The big data platform architecture has entered the era of Lake-warehouse integration.
With the continuous expansion of data volume, the diversification of data types, and the deepening of digital transformation, enterprise data application scenarios have become increasingly complex. Business requirements for real-time data require enterprises to have both batch processing and stream processing capabilities; complex business types require enterprises to support descriptive analysis, predictive analysis, diagnostic and decision-making analysis, and exploratory analysis at the same time. Compared with data warehouses and data lakes, lake-warehouse integration can fully meet the varied needs that enterprises undergoing digital transformation place on big data platforms. This indicates that big data platform architecture, after passing through the data warehouse and data lake stages, has entered a new era of lake-warehouse integration.
Chart 3: Driving factors for the evolution of big data architecture
[picture]
1) Enterprise data analysis requirements promote the emergence of data warehouses
In the 1990s, to meet enterprises' need for agile data analysis, the concept of the data warehouse based on online analytical processing (OLAP) emerged and developed rapidly. The data warehouse solves the problem of integrating and analyzing scattered data sources, and it was widely adopted as the first-generation data analysis platform.
Chart 4: Schematic diagram of a data warehouse
[picture]
2) The analysis requirements of massive heterogeneous data prompt big data platforms to enter the data lake stage
In the Internet era of the 21st century, new applications such as social media and search engines emerged one after another, bringing drastic changes to data application scenarios. Data volumes grew from GB to TB and PB, and the scalability of the original big data platform architecture fell far short of computing needs. At the same time, the amount of unstructured data such as text, images, and voice increased rapidly, and low-cost storage of heterogeneous data posed new challenges to big data platforms. The data lake provides unified storage, management, and analysis of data in any format at very low cost and is especially suitable for advanced analysis scenarios such as data mining, forecasting, and recommendation, so it began to be widely used.
Chart 5: Schematic diagram of the data lake
[picture]
3) In the era of digital transformation, unified management of data assets and shared services promote the integrated development of lakes and warehouses.
In practice, the data lake itself has obvious limitations: its support for SQL standards and ACID properties is poor, data quality is difficult to guarantee, data version control and indexing functions are insufficient, and it is difficult to unify batch processing and streaming jobs. These limitations make it difficult for data lakes to completely replace data warehouses, and enterprise-level agile analysis scenarios still rely mainly on data warehouses.
In the era of digital transformation, enterprises need to adopt a new architecture to realize low-cost storage and efficient analysis of massive heterogeneous data while realizing unified management of data assets and shared services.
In order to combine the characteristics of low-cost storage of data lakes and efficient analysis of data warehouses, enterprises try to build data warehouses and data lakes separately through loose coupling. Data lakes store all data centrally, and data warehouses mainly store structured data. Under this architecture, data needs to be backed up multiple times between the two architectures, resulting in problems such as data islands, storage redundancy, difficult development and maintenance, and long data response cycles.
In 2020, Databricks first proposed the concept of "lake-warehouse integration", a new paradigm that combines the advantages of data lakes and data warehouses. During the same period, domestic technology vendors began to explore lake-warehouse integration in practice. In this report, lake-warehouse integration refers to a new architecture that combines data lakes and data warehouses. It provides unified storage, computing, development, management, and serving of massive heterogeneous data, supports multiple advanced analysis engines, breaks down data islands for enterprises, and improves the value of data applications. Lake-warehouse integration can effectively resolve data islands, reduce storage redundancy, and lower the difficulty of system maintenance. It is a new data architecture upgraded from data warehouses and data lakes. In the future, it will be widely adopted by large enterprises to improve data productivity and support digital transformation and upgrading.
1.2 Two routes to lake-warehouse integration: building warehouses on the lake and lake-warehouse fusion
The industry is exploring lake-warehouse integration along two routes: building warehouses on the lake and lake-warehouse fusion. Lake-warehouse fusion represents the future trend.
1.2.1 Warehouse on the Lake
Chart 6: Schematic diagram of warehouse architecture on the lake
[Picture]
Building warehouses on the lake combines the data lake and the data warehouse to a certain extent. In this architecture, multi-source heterogeneous data is first integrated and stored in the data lake via ETL, then moved into the data warehouse via ETL to support data analysis. The architecture also supports access by various computing and analysis engines for data science, data mining, machine learning, deep learning, and so on.
However, building warehouses on the lake does not completely solve the problems of data consistency and data redundancy, and it does not truly achieve unified data management. Moreover, moving data from the data lake to the data warehouse via ETL adds further ETL complexity. The shortcomings of this architecture are reflected in the following aspects:
Unreliable data quality: data is moved from the data lake to the data warehouse via ETL, and keeping the two consistent requires multiple streaming engines. Compared with the batch processing of a traditional data warehouse, operational complexity increases greatly, reliability is hard to guarantee, and data consistency problems arise easily.
ACID transactions are not supported for the full data set: most of the data in the data lake is still unstructured, the data warehouse does not govern data in the data lake, and data in the data lake still lacks ACID transaction support.
Data redundancy is not eliminated: building warehouses on the lake is essentially a two-tier structure of data lake and data warehouse. The same data is still stored in both, under different schemas, so redundancy remains.
The data warehouse does not support computing engines for machine learning and data mining: machine learning and deep learning frameworks such as TensorFlow and PyTorch process large data sets with non-SQL code and cannot directly access the data warehouse's internal data format, so they are poorly suited to data warehouse systems.
1.2.2 Lake warehouse fusion
Chart 7: Lake warehouse fusion architecture diagram
[picture]
As shown in the chart, lake-warehouse fusion combines the data lake's low-cost storage of many data types with the data warehouse's efficient analysis capabilities. It unifies metadata through a transaction layer, eliminates data islands and data redundancy, supports multiple workloads accurately and in real time on a single set of data, and accelerates data sharing and value mining.
Lake-warehouse fusion adds a transaction layer to the data lake. The transaction layer provides transaction management, unified metadata, indexing, transaction version and state control, a data catalog, and support for lake table formats. It lets data users flexibly read multiple types of data and serves multiple computing engines, such as BI, visualization, data science, and machine learning, from a single copy of the data. The transaction layer gives the data lake ACID transactions on top of unified management of structured, semi-structured, and unstructured data. At present, data lake solutions such as Delta Lake, Apache Iceberg, and Apache Hudi have all implemented a transaction layer on the data lake.
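To make the role of the transaction layer more concrete, the following is a minimal Python sketch of an append-only commit log with optimistic concurrency control and versioned snapshots. It is a toy model under stated assumptions and not the actual implementation of Delta Lake, Apache Iceberg, or Apache Hudi; the names ToyTableLog, Commit, ConflictError, commit, and snapshot are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List, Optional, Set


@dataclass(frozen=True)
class Commit:
    """One atomic change to the table: data files added and removed."""
    version: int
    added: FrozenSet[str]
    removed: FrozenSet[str]


class ConflictError(RuntimeError):
    """Raised when a writer commits against a stale table version."""


class ToyTableLog:
    """Hypothetical append-only commit log for a single lake table."""

    def __init__(self) -> None:
        self._commits: List[Commit] = []

    @property
    def latest_version(self) -> int:
        # -1 means that no commit has been written yet.
        return len(self._commits) - 1

    def commit(self, read_version: int,
               added: Iterable[str] = (),
               removed: Iterable[str] = ()) -> int:
        # Optimistic concurrency: the write succeeds only if nobody else
        # has committed since this writer read the table.
        if read_version != self.latest_version:
            raise ConflictError(
                f"read version {read_version}, latest is {self.latest_version}"
            )
        entry = Commit(self.latest_version + 1,
                       frozenset(added), frozenset(removed))
        # Appending one log entry is the only step that changes the table.
        self._commits.append(entry)
        return entry.version

    def snapshot(self, version: Optional[int] = None) -> Set[str]:
        # Replay commits up to `version` to find the visible data files.
        if version is None:
            version = self.latest_version
        files: Set[str] = set()
        for entry in self._commits[: version + 1]:
            files -= entry.removed
            files |= entry.added
        return files
```

In this sketch, readers always see a complete version of the table, a writer working from a stale version is rejected instead of silently overwriting data, and older versions stay readable through snapshot(version), which is the idea behind data version control.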
Driven by the demand for real-time data analysis, large-volume data analysis, and dynamic resource scaling, separation of storage and compute and batch-stream integration have also become necessary functions of lake-warehouse fusion.
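One way to picture batch-stream integration is a single piece of business logic that runs unchanged over a bounded historical data set and over an unbounded event stream. The Python sketch below illustrates only that idea; the event shape (key, amount) and the function names running_totals and batch_totals are assumptions made for this example and do not come from any specific engine.

```python
from typing import Dict, Iterable, Iterator, Tuple

# Hypothetical event shape: (key, amount).
Event = Tuple[str, float]


def running_totals(events: Iterable[Event]) -> Iterator[Tuple[str, float]]:
    """Shared logic: emit the updated total for a key after each event.

    The input may be a finite list (batch) or an endless iterator (stream).
    """
    totals: Dict[str, float] = {}
    for key, amount in events:
        totals[key] = totals.get(key, 0.0) + amount
        yield key, totals[key]


def batch_totals(events: Iterable[Event]) -> Dict[str, float]:
    """Batch mode: drain the same logic and keep the final total per key."""
    result: Dict[str, float] = {}
    for key, total in running_totals(events):
        result[key] = total
    return result


history = [("a", 1.0), ("b", 2.0), ("a", 3.0)]

# Batch: process the whole bounded data set at once.
final = batch_totals(history)

# Stream: consume results incrementally as events arrive.
stream = running_totals(iter(history))
first_update = next(stream)
```

Because both modes call the same function, the batch result and the last streamed result for each key agree by construction, which is the consistency benefit that batch-stream integration aims for.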
In addition, for environments with multiple data source systems, lake-warehouse fusion can provide unified management of data assets across those systems through data virtualization.
Because enterprise users at home and abroad differ in digital maturity and application scenarios, lake-warehouse fusion products at home and abroad differ slightly in function.
Foreign lake-warehouse integration vendors focus on vertical technical capabilities. For example, Databricks focuses on the underlying lake-warehouse integrated architecture and on support for machine learning. For data management and data applications such as data quality, data governance, and data indicators, it opens data APIs and relies on third-party ecosystem partners.
Compared with the vertical technical capabilities of foreign companies, domestic lake-warehouse integration vendors have more pronounced platform attributes. For example, Kejie Technology offers one-stop data platform capabilities based on lake-warehouse fusion, providing full-link data capabilities from data collection, data development, data governance, data asset management, and data modeling and analysis to data services, which better suits the needs of domestic enterprises.
1.2.3 Lake-warehouse fusion is the future of the lake-warehouse integrated architecture
Aianalysis believes that in the era of digital transformation, lake-warehouse fusion can more effectively meet enterprises' complex needs across multiple application scenarios and has become the mainstream technical route for the lake-warehouse integrated architecture. The specific reasons are as follows:
1) Lake-warehouse fusion has clear advantages in data computing, data management, and data application, and can better meet enterprises' need for unified management of data assets.
In terms of data computing: it supports unified management, completely eliminates data redundancy, and has clear advantages in ACID transactions, separation of storage and compute, batch-stream integration, real-time analysis, and so on.
In terms of data management: a single set of data supports the data engineering practice DataOps and the advanced data management concept Data Fabric.
In terms of data application: a single set of data supports data application scenarios such as BI, visualization, data science, and machine learning, enabling multi-scenario fusion analysis.
Chart 8: Functional comparison of building warehouses on the lake and lake-warehouse fusion
[Picture]
2) Lake-warehouse fusion reduces data migration risks and costs and provides a mature solution for the digital transformation of large and medium-sized enterprises
In the course of digital transformation, large and medium-sized enterprises have accumulated a complex architecture in which data lakes, data warehouses, dedicated databases, cloud storage, big data platforms, and streaming data processing platforms coexist. These existing systems are deeply integrated with the business, run stably, and still have unused performance headroom, so enterprises hope to keep reusing what they have built. Building warehouses on the lake requires enterprises to migrate data from existing systems into a new data lake, completely replace the existing data warehouse and data lake engines with a new lake-warehouse engine, and only then achieve unified storage, development, and management of data on the integrated lake-warehouse. This brings large migration costs and data migration security risks, and it also means abandoning the old data warehouse and data lake architectures and wasting the investment already made in them.
In contrast, lake-warehouse fusion can logically unify the organization, management, and sharing of databases, data warehouses, data lakes, and cloud data through data virtualization, reducing data migration risks and costs.
3) Group standards for the lake-warehouse integration industry have been initially established, and consensus on market definition and practice path is gradually forming, which will accelerate the commercialization of lake-warehouse integration.
The main content of the standard clarifies the five major capability domains of lake-warehouse integration and gives vendors and enterprise users criteria for evaluating the R&D direction and technical capabilities of lake-warehouse platform products. The establishment of the lake-warehouse integration standard will standardize market competition and accelerate the commercialization of lake-warehouse integration.
02 Market scale of lake-warehouse-integrated platform software
2.1 Definition of lake-warehouse-integrated platform software
China's big data IT investment includes three parts: hardware, software and services. The software part refers to the big data platform software. According to different engines, the big data platform software can be divided into a data lake engine and a lake warehouse integrated engine.
In this report, the big data platform software implemented based on the lake-warehouse integrated engine architecture is defined as the lake-warehouse integrated platform software.
2.2 China's Lake-warehouse Integrated Platform Software Market Size
Chart 9: Lake-warehouse Integrated Platform Software Market Size and Growth Rate
[Picture]
Aianalysis estimates that the market size of lake-warehouse integrated platform software is 1.52 billion yuan in 2022, with a compound annual growth rate of 86% over the next three years. The market is expected to reach nearly 10 billion yuan in 2025.
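The relationship between these figures can be checked with the standard compound growth formula, size_end = size_start × (1 + rate)^years. The short Python sketch below applies it to the numbers quoted in this report; the function names are illustrative only.

```python
def project_market_size(base: float, cagr: float, years: int) -> float:
    """Project a market size forward at a constant compound annual growth rate."""
    return base * (1 + cagr) ** years


def implied_cagr(start: float, end: float, years: int) -> float:
    """Recover the compound annual growth rate implied by two market sizes."""
    return (end / start) ** (1 / years) - 1


# Figures from the report: 1.52 billion yuan in 2022, 86% CAGR, 3 years.
size_2025 = project_market_size(1.52, 0.86, 3)
print(f"Projected 2025 market size: {size_2025:.2f} billion yuan")
```

The projection comes to about 9.78 billion yuan, which is consistent with the report's statement of "nearly 10 billion yuan" in 2025.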
The big data platform software market continues to grow, and the lake-warehouse integrated engine is rising rapidly. Since 2022, documents such as "Opinions on Building a Data Basic System to Better Play the Role of Data Elements" and "Guidelines for the Construction of a National Integrated Government Big Data System" have been issued at the national level. The strategic height these documents give to data will drive rapid growth in the big data platform market. According to IDC data, the compound annual growth rate of China's big data platform software market from 2022 to 2026 is close to 28%. The compound annual growth rate of lake-warehouse integration, 86%, is far higher, which indicates that lake-warehouse integration, representing more advanced technical capabilities, will develop rapidly and become the mainstream engine in the future.
The original big data platform architectures of enterprises can no longer meet their needs in the digital age, which has accelerated the penetration of the lake-warehouse integrated engine. In the existing market, the technical advancement of lake-warehouse integration allows it to smoothly replace enterprise data warehouse and data lake architectures, steadily raising its share of enterprise big data platforms. In the incremental market, considering data management costs and operation and maintenance costs, many companies prefer to adopt a lake-warehouse integrated engine directly as their new big data platform architecture.
2.3 Capability Requirements for Lake-Warehouse Integrated Platform Software Vendors
According to the "Technical Requirements of the Cloud Native Lake-Warehouse Integrated Data Platform" issued by the China Academy of Information and Communications Technology, the capabilities of a cloud-native lake-warehouse integrated platform fall into five major capability domains: lake-warehouse data integration, lake-warehouse storage, lake-warehouse computing, lake-warehouse data management, and other lake-warehouse capabilities. Such a platform is characterized by separation of storage and compute, tiered storage, elasticity, multi-scenario fusion analysis, support for multiple computing modes, and unified metadata management.
With reference to these requirements and its own research, Aianalysis summarizes the basic capabilities that lake-warehouse integrated platform software vendors should have as follows:
1) Separation of storage and compute: with storage-compute separation technology, storage resources and computing resources can be expanded independently, managed flexibly, and scaled on demand.
2) Batch-stream integration: supports batch-stream integration, enabling multi-modal data fusion and real-time analysis and improving data analysis efficiency.
3) ACID transactions: provides a complete ACID transaction mechanism that supports atomicity, consistency, isolation, and durability, ensuring that different users querying and computing on the same data see consistent results.
4) Unified metadata management: Based on metadata management standards, the metadata of the data lake and data warehouse are collected in a unified manner to form a unified metadata directory.
5) Multi-model data storage and tiered storage: the platform supports HDFS file storage and S3/OSS object storage, and stores multi-model data such as structured data, time series, documents, and images in a unified lake table format. Data can be tiered into hot and cold storage as needed and can flow freely between the data lake and the data warehouse.
6) Support for multiple computing engines: built-in engine routing supports offline computing engines, real-time computing engines, interactive query engines, and others, as well as machine learning and deep learning frameworks, providing multiple computing environments for data integration and development that customers can choose on demand.
7) Multi-scenario fusion analysis: Supports analysis of application scenarios such as BI, visualization, data science, and machine learning.
8) DataOps: provides comprehensive software engineering and data management components and tools. Software engineering includes data collaboration, data development, data deployment, orchestration, testing, and monitoring; data management includes data acquisition, data integration, data preparation, data governance, and data modeling. Together they improve the efficiency of collaboration across data management, data application, and data development.
9) Data Fabric: supports logically unified management of scattered, multi-source data infrastructure through data virtualization to form complete data assets; supports business-oriented expression of data by converting data into indicators and labels that the business can understand; and meets the business's need to use data, accelerating data value mining.
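The data virtualization idea behind item 9 can be sketched as a logical catalog that maps table names to readers over heterogeneous sources, so that data is queried where it lives and not copied into a new store. The Python sketch below is a simplified illustration under stated assumptions: the class VirtualCatalog, the function join_on, the table names, and the in-memory lists standing in for a data warehouse and a data lake are all hypothetical.

```python
from typing import Callable, Dict, Iterable, List

Row = Dict[str, object]


class VirtualCatalog:
    """Hypothetical logical catalog: names map to readers, not to copies."""

    def __init__(self) -> None:
        self._sources: Dict[str, Callable[[], Iterable[Row]]] = {}

    def register(self, logical_name: str,
                 reader: Callable[[], Iterable[Row]]) -> None:
        # Only the reader is stored; the data stays in the source system.
        self._sources[logical_name] = reader

    def tables(self) -> List[str]:
        return sorted(self._sources)

    def scan(self, logical_name: str) -> List[Row]:
        # Data is fetched from the underlying source at query time.
        return list(self._sources[logical_name]())


def join_on(left: Iterable[Row], right: Iterable[Row], key: str) -> List[Row]:
    """Simple hash join across rows that may come from different systems."""
    index: Dict[object, List[Row]] = {}
    for row in right:
        index.setdefault(row[key], []).append(row)
    return [{**l, **r} for l in left for r in index.get(l[key], [])]


# In-memory stand-ins for two separate systems (assumed sample data).
warehouse_customers = [{"customer_id": 1, "name": "Alice"}]
lake_clicks = [
    {"customer_id": 1, "page": "/home"},
    {"customer_id": 2, "page": "/pricing"},
]

catalog = VirtualCatalog()
catalog.register("warehouse.customers", lambda: warehouse_customers)
catalog.register("lake.clicks", lambda: lake_clicks)

joined = join_on(catalog.scan("lake.clicks"),
                 catalog.scan("warehouse.customers"),
                 "customer_id")
```

The consumer works only with logical names such as "lake.clicks", so the physical location of each source can change without changing the query, which is the property that lets existing systems stay in place.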
2.4 The panorama of the software manufacturers of the integrated lake-warehouse platform
Chart 10: The panorama of the software manufacturers of the integrated lake-warehouse platform
[Picture]
With lake-warehouse integration as the trend, mainstream vendors have all launched lake-warehouse integrated products or solutions. These vendors span many types, including cloud vendors, database vendors, data warehouse vendors, data middle platform vendors, and big data basic software vendors. Their active R&D investment confirms the trend and future of lake-warehouse integration, but each product's implementation route (building warehouses on the lake or lake-warehouse fusion), performance and stability, and solution maturity still await the long-term test of customers and the market. At present, the different types of vendors differ clearly in their implementation paths and industry layouts:
1) Cloud vendors
Realization path: based on their own IaaS cloud capabilities, build a cloud ecosystem or integrated software-hardware solutions for lake-warehouse integration.
Industry layout: rely on cloud capabilities to provide lake-warehouse solutions for government and Internet companies, for example Huawei focusing on government affairs and Alibaba Cloud on the Internet sector.
2) Database and data warehouse manufacturers
Realization path: Based on a single technical route, realize the integration of lakes and warehouses on the basis of their own databases and data warehouses.
Industry layout: priority is given to providing services in in-depth industries of databases and data warehouses such as finance and government.
3) Data middle platform service providers
Realization path: provide a lake-warehouse integrated architecture at the computing engine layer of the data middle platform, form data assets through data integration, data development, and data governance, and provide structured and unstructured lake-warehouse data to data consumers.
Industry layout: Give priority to serving group enterprises in new economic industries such as retail consumption, advanced manufacturing, and biomedicine, as well as some financial enterprises.
4) Big data basic software vendors
Realization path: adopt the lake-warehouse integrated technical architecture, independently develop the data storage and computing engine, stay compatible with upstream databases and data lakes and with downstream data middle platform application systems, and provide an open lake-warehouse integrated engine. For example, the KeenData Lakehouse integrated data base is, on the one hand, compatible upstream with multi-source heterogeneous systems such as managed databases, data warehouses, and data lakes; on the other hand, its data engineering system supports low-code data development. Its integration of data governance and data engineering enables active, real-time data governance such as active metadata discovery and AI-enhanced data lineage analysis, and it provides data services such as smart indicators and smart labels based on data virtualization.
Industry layout: The independent engine provides compatible lake warehouse capabilities, which can be implemented in the whole industry such as finance, government, energy, retail, and automobile.
2.5 Market share of lake-warehouse integrated platform software
Chart 11: Market share of lake-warehouse integrated platform software in 2022
[Picture]
In 2022, Kejie Technology holds an 11.1% share of the lake-warehouse integrated platform software market, ranking first. Huawei Cloud and Transwarp Technology hold 9.5% and 7.3%, ranking second and third respectively.
KeenData Lakehouse, the lake-warehouse integrated product of Kejie Technology, combines the concepts of lake-warehouse integration, DataOps, and Data Fabric to provide enterprises with a one-stop data base platform covering the data life cycle. Kejie Technology serves China and the Asia-Pacific region, and its customers include leading companies in finance, industry, energy, automobile, retail, and many other sectors, such as China Unicom, Sinopec, China FAW, State Grid, Geely Automobile, CICC, and Yong Wang Group.
FusionInsight, Huawei Cloud's lake-warehouse integrated product, provides customers with a complete portfolio of big data cloud services. Building on integrated cloud computing software and hardware and on dedicated cloud services, FusionInsight has been widely used in government affairs, finance, communications, transportation, and other industries.
TDH, Transwarp's lake-warehouse integrated big data basic platform, features cloud native design, multi-modal heterogeneous storage, a "1 lake, N warehouses" multi-tenant system, and independent controllability. It has accumulated a large customer base in fields such as finance and government affairs.
2.6 Introduction of Representative Manufacturers
2.6.1 Kejie Technology
Kejie Technology is a leading big data and AI technology company in China, focusing on big data basic software products and services for complex scenarios. Its core technical team comes from the big data basic technology R&D departments of leading Internet companies. Through R&D and innovation in basic software, the company is committed to providing enterprises with a complete set of integrated solutions covering data storage and computing engines, data management, development and mining, and operation and maintenance, helping enterprises quickly build data capabilities and achieve highly standardized, agile data collaboration and data application innovation.
KeenData Lakehouse, the core product of Kejie Technology, is a data base product independently developed on cloud native technology. It provides an end-to-end, one-stop big data basic software solution to help enterprises move from IT to DT. The product applies a number of leading technologies, including Data Fabric, Active Metadata Management, and Data Mesh, and incorporates the concepts of DataOps and observability, making data development IDE-based, process-driven, collaborative, and automated. Its integrated design of engineering and governance capabilities greatly lowers the engineering difficulty for IT and semi-IT staff and improves enterprises' ability to govern their own data.
Chart 12: Technical architecture of Kejie Technology's KeenData Lakehouse integrated data intelligence platform
[Picture]
2.6.2 Databricks
Databricks is a world-leading big data company founded in 2013 by the original creators of Apache Spark, Delta Lake, and MLflow. Databricks builds a Lakehouse architecture on the cloud, combining data warehouses and data lakes to provide an open and unified platform for data and AI.
The Databricks lake-warehouse integrated platform software includes core functions such as Delta Lake, a data science workspace, machine learning, SQL analytics, and security management. Delta Lake is an open-format storage layer that provides transactions and data version control, forms a unified metadata catalog, and exposes data in the unified Parquet format for various APIs and engines to call. The data science workspace supports notebook-based modeling and the development of SQL and Spark tasks. The machine learning component provides an integrated environment with data engineering capabilities such as data exploration, management and governance, and feature engineering to simplify the ML development process. SQL Analytics enables enterprises to run data warehouse workloads on data lakes. For security, Databricks provides role-based access control.
Chart 13: Databricks lake warehouse integration architecture diagram
[Picture]
Databricks and Kejie Technology adopt the same lake-warehouse integrated technical architecture. Both products offer separation of storage and compute, batch-stream integration, ACID transactions, Data Fabric, and other features. The difference is that Databricks makes full use of cloud ecosystem tools, relies on the operation and maintenance capabilities of cloud services, and lets customers build and extend a rich set of custom scenarios, which places relatively high demands on customers. Kejie Technology provides standardized capabilities and solutions for implementing enterprise-level DataOps and Data Fabric, so customers can rely on KeenData Lakehouse to build scenarios quickly.
03 Suggestions for warehouse-lake integrated manufacturers
3.1 Focus on the integration of cloud-native, DataOps, Data Fabric and the integrated lake-warehouse platform
3.1.1 Cloud-native technology can greatly release the value of the integrated lake-warehouse platform
Cloud native is a new IT technology system that includes key technologies such as containers, Kubernetes, microservices, service mesh, DevOps, and observability. Aianalysis believes that cloud-native technology offers loose coupling, automation, flexible scheduling, on-demand allocation of computing resources, and high fault tolerance, which can greatly release the value of the lake-warehouse integrated platform. The specific reasons are as follows:
1) The core key components are containerized and packaged to improve the efficiency of deployment and delivery, and can more flexibly meet the business needs of different enterprises.
2) The separation of storage and calculation reduces the cost of data storage and improves the efficiency of data calculation.
3) Automated orchestration and scheduling to reduce the operation and maintenance cost of the integrated platform of lake and warehouse.
3.1.2 DataOps and Data Fabric can amplify the application value of the lake-warehouse integrated platform and enhance the competitive advantage of the lake-warehouse integrated platform
Crucially, both DataOps and Data Fabric are practical approaches to addressing the above issues.
DataOps is an advanced data engineering concept. It covers the whole process of data acquisition, data integration, data preparation, data governance, and data analysis and modeling, and it provides functions such as data collaboration, data development, data deployment, orchestration, testing, and monitoring, which improve the efficiency of data development and management.
As a cutting-edge data management concept, Data Fabric enables active, real-time data governance and replaces the traditional, passive model of centralized data control and centralized governance in enterprises. For example, the active metadata discovery function of Data Fabric can automatically detect changes in data sources and notify the downstream algorithms and models that use the data, or assess the predictability of the data in advance and issue a warning. It also strengthens functions such as data standards, master data management, data quality, and the data asset catalog within the integrated lake-warehouse platform.
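The active metadata discovery behavior described above can be illustrated with a small sketch. This is a minimal example under stated assumptions, not the API of Data Fabric or of any vendor product: the names `MetadataWatcher` and `diff_schema`, the `orders` source, and its two consumers are all hypothetical, and a schema is simplified to a mapping from column name to type.

```python
from dataclasses import dataclass, field

# Hypothetical simplification: a schema maps column name -> type name.
Schema = dict[str, str]


def diff_schema(old: Schema, new: Schema) -> dict:
    """Return the columns that were added, removed, or changed type."""
    return {
        "added": {c: t for c, t in new.items() if c not in old},
        "removed": {c: t for c, t in old.items() if c not in new},
        "retyped": {
            c: (old[c], new[c])
            for c in old.keys() & new.keys()
            if old[c] != new[c]
        },
    }


@dataclass
class MetadataWatcher:
    """Remembers the last schema seen per source and reports changes."""

    # source name -> downstream consumers (models, dashboards, ...)
    consumers: dict[str, list[str]] = field(default_factory=dict)
    # source name -> last observed schema
    known: dict[str, Schema] = field(default_factory=dict)

    def observe(self, source: str, schema: Schema) -> list[tuple[str, dict]]:
        """Record the schema; return one (consumer, changes) pair per
        downstream consumer if the schema differs from the last one seen."""
        previous = self.known.get(source)
        self.known[source] = dict(schema)
        if previous is None:
            return []  # first sighting: nothing to compare against
        changes = diff_schema(previous, schema)
        if not any(changes.values()):
            return []  # schema unchanged
        return [(consumer, changes) for consumer in self.consumers.get(source, [])]


# Hypothetical usage: one source feeding a model and a dashboard.
watcher = MetadataWatcher(consumers={"orders": ["risk_model", "sales_dashboard"]})
first = watcher.observe("orders", {"id": "int", "amount": "float"})
alerts = watcher.observe(
    "orders", {"id": "int", "amount": "decimal", "channel": "str"}
)
for consumer, changes in alerts:
    print(f"notify {consumer}: {changes}")
```

In a real platform the observed schemas would come from a metadata catalog or the source systems themselves, and the notification would go to a messaging or alerting service rather than being returned as a list.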
In particular, lake-warehouse integration has already realized data collection from heterogeneous systems based on data virtualization technology, as well as functions such as full data modeling and data governance, which amounts to a phased implementation of the Data Fabric concept. In the future, lake-warehouse integration will promote the rapid adoption of Data Fabric technology in enterprises.
3.2 Focus on finance while paying attention to high-potential sectors such as industry and transportation
Penetration of the integrated lake-warehouse architecture in the financial industry is accelerating, while awareness in other industries still needs to improve. The financial industry has strong demand for multi-source, real-time data analysis scenarios such as intelligent marketing, intelligent risk control, and real-time analysis of customer behavior. Because the financial industry leads in digital transformation, it is also the first industry to have adopted lake-warehouse integration. At present, lake-warehouse integration in the financial industry has spread from state-owned commercial banks to joint-stock commercial banks and local commercial banks. Large commercial banks, represented by Bank of China, China Construction Bank, and China Everbright Bank, have completed construction of integrated lake-warehouse platforms. Small and medium-sized city commercial banks have also been upgrading their data architectures one after another, adopting and building an integrated lake-warehouse architecture. The value of the integrated lake-warehouse architecture in the financial industry has been verified. Under the influence of these industry benchmarks, its construction in the financial industry will maintain rapid growth over the next 3-5 years.
Other sectors, such as industry, transportation, government affairs, and retail, lag slightly behind the financial industry in their digitalization process, and their demand will explode in the next 2-5 years.
Industrial enterprises have clear requirements for lake-warehouse integration. In the IoT environment, the data volume of industrial enterprises is large and continues to grow, while their own low level of digitalization leads to problems such as difficult data collection and aggregation, weak data governance, low data utilization, and difficulty in circulating and sharing data. At the same time, under fierce competition, demand from industrial enterprises for real-time data analysis is growing rapidly, for example real-time monitoring of the production process and real-time forecasting of customer demand. The data integration, storage-compute separation, and batch-stream integration offered by lake-warehouse integration will give enterprises a simple and convenient way to implement the Industrial Internet. Driven by the spread of the Industrial Internet, Aianalysis predicts that demand from industrial enterprises for lake-warehouse integration will explode within 2-3 years.
Leading enterprises in the transportation field have strong demand for the integrated lake-warehouse architecture. Under the trend toward smart transportation, large transportation companies have started to build cross-regional, unified command and dispatch cloud platforms to support emergency management, real-time command and dispatch, and similar functions. Building such a platform requires integrating multi-source data, such as transportation management, public management, railway, aviation, traffic police, tourism, and weather data, for fusion and real-time computation. Lake-warehouse integration meets these needs, which will speed up its adoption in the transportation field. Aianalysis predicts that it will take 3-5 years to realize large-scale lake-warehouse integration in the transportation field.

Reposted from: blog.csdn.net/weixin_45942451/article/details/132041318