MySQL beats Oracle to rank No. 1, private cloud is the most widely used, and big data talent is in short supply! | China big data application annual report ...

Compiled by | Tu Min

Produced by | CSDN (ID: CSDNnews)

In the river of technology, those who go with the current prosper and those who miss it perish. On attitudes toward technology, Jiang Tao, founder and chairman of the professional Chinese IT community CSDN, has said repeatedly at public events that developers are the group most sensitive to technological change. This is not only because developers and engineers create the tools that power this era of transformation; they are also forward-looking in creating a virtual, digital world beyond the real one.

Today a vast web of technology is being woven from AI, networking, cloud computing, big data and more. For those who do not want to miss the new opportunities, the perspective of people already working in these fields may offer a glimpse of the overall trend.

CSDN has run large-scale surveys of Chinese developers since 2004. They cover by far the largest number of developers of all types and the widest spread of regions and industries. In the "2019-2020 China Developer Report", facing a digital world with super computing power, we conducted an "analysis of the application status of big data technology" and found:

Since 2017, big data business has surged, and 81% of companies are now using big data technology for application development;

Small and medium-sized enterprises make up a high proportion of the big data industry;

Private cloud solutions are the main way enterprises build big data platforms;

With big data, developers can achieve more intelligent decision-making;

The main difficulty enterprises face with big data is "big data application planning";

In the big data era, enterprise data comes mainly from within the enterprise;

The Hadoop community release is the most popular distribution among developers;

Spark is the most widely used big data platform component;

Kafka and Redis are the most commonly used components for message queuing and data capture.

The big data era has arrived!

In the era of full cloud adoption, as Professor Viktor Mayer-Schönberger, co-author of "Big Data", has put it, the true value of big data is like an iceberg floating in the ocean: at first glance you see only the tip, while the bulk is hidden beneath the surface. And the "power" for exploring the value of data and conquering the sea of data is cloud computing.

Over the past few years, many people have witnessed and experienced this. According to the survey data, driven by multiple factors such as government policy and the maturing of algorithms, blockchain, cloud computing and other technologies, enterprise adoption of big data technology has grown rapidly since 2017. As of now, 81% of companies are already using big data technology for application development.

For developers and companies, the benefits of big data are obvious. The survey shows that 64% of developers said big data applications enable more intelligent decision-making; next, 54% said they improve operational efficiency; and 29% said most of their product or operations decisions depend on A/B testing.

However, among enterprises already engaged in big data, 78% have big data teams of 30 people or fewer, with teams of about 5 people accounting for 37%, while large teams of more than 100 people make up just 5%.

Judging by the size of enterprise big data teams, many businesses' investment in this area is still at a preliminary or initial stage.

The predicament of companies in the era of full cloud adoption

As the Chinese saying goes, "Everything is difficult at the start." For enterprises making their first attempt at big data, which demands powerful computing, analysis and processing capabilities, getting started is even harder. The survey data confirms this. Among the developers and companies taking their first step, 56% of respondents said "how to plan big data applications" is the main difficulty they face, and also the biggest obstacle to putting enterprise big data applications into production. In addition, enterprises are relatively short of people capable of this work: the survey also shows that the lack of big data talent is a common problem companies encounter when building big data applications.

In fact, on this point, quite a few industry leaders have been offering win-win cooperation programs designed to help more peers jointly build and share a complete technology ecosystem. Huawei, for example, will take "Kunpeng + Ascend" as its base and invest 10.5 billion yuan ($1.5 billion) over the next five years to build the "Huawei Kunpeng ecosystem". On top of this ecosystem, businesses can quickly build IT infrastructure and industry applications based on Huawei's Kunpeng and Ascend processors, spanning PCs, servers, storage, operating systems, middleware, virtualization, databases, cloud services and industry applications. In big data and artificial intelligence scenarios they can exploit the architecture's advantages and unleash diversified computing power.

At present, Huawei Cloud already has more than 4,000 Kunpeng ecosystem partners. In the 34-trillion "new infrastructure" investment wave, fields such as 5G, artificial intelligence, data centers and the industrial Internet are generating demand for computing power and for domestically developed technology. This fills the Huawei Kunpeng ecosystem with opportunities and attracts more enterprise applications and SaaS services to adapt for compatibility with Huawei Kunpeng.

Private cloud is the first choice of many enterprises

With the explosive growth of information, as cloud computing technologies mature and spread, and out of concern for data security, many companies have chosen private cloud solutions to deploy their big data applications; their share reached 50% in 2019. Beyond security, many companies also choose private cloud for its deployment speed, flexible scaling, and operations and maintenance workflow. In addition, 28% of companies have chosen to build their big data platform through in-house development.

Current state of enterprise big data platform construction

Against this background, while some companies are still planning their big data applications, others have begun to implement data visualization for some traditional scenarios. The survey shows that enterprises use big data mostly for statistical analysis, reporting and data visualization, at 56%. Compared with traditional manual data entry and statistical analysis, big data greatly improves efficiency and reduces labor.

Next, big data is also fairly widely applied to real-time monitoring, alerting, and operations and maintenance management of machine or device data, at 33%. After that come user profiling and modeling, personalized recommendation and precision marketing, at 29%.

Overall, enterprise big data application scenarios are still relatively simple.

Enterprise data comes mainly from log data inside the enterprise, including system logs and user behavior logs. The survey shows this accounts for 60%, followed by data provided by suppliers or partners at 37%.

As for data volume, the survey shows that 45% of enterprises process 1 TB of data or less per day and only 31% process 1-10 TB. Enterprises handling 10 TB or less per day account for roughly 70%.

In this survey, 55% of enterprises run big data platform clusters of more than 20 nodes, and 5% of those enterprises have clusters of more than 5000 nodes.

Spark, Redis and Kafka become developers' favorite big data technologies

The survey report shows that developers still rely mainly on mainstream technologies.

Database: MySQL ranks No. 1

For data organization, management and storage, most developers use MySQL. The report shows that 83% of developers use the MySQL database, which may be related to its being open source. By comparison, 34% use Oracle and 28% use Redis.

Framework: the Hadoop community release is the most popular

At the technical level of a big data platform, in addition to basics such as the Java language and Linux commands, Hadoop is an important big data development framework that processes data in a reliable, efficient and scalable way. Besides the Hadoop community release there are also commercial distributions, which mainly offer more professional support, something that matters more to large enterprises.

The report shows that only 19% of companies use a commercial Hadoop distribution to build their data platform. Most enterprises choose the community release, at 34%. However, 32% of companies said they do not use Hadoop to build their data platform.

Spark is the most popular big data platform component

As a fast, general-purpose compute engine designed for large-scale data processing, Spark is an essential skill for big data developers. It can run standalone or on Hadoop, Mesos or in the cloud. It can access a variety of data sources, including HDFS, Cassandra, HBase and S3, and it can speed up applications running on a Hadoop cluster both in memory and on disk. Besides the core API, the Spark ecosystem includes additional libraries that provide more capabilities for big data analytics and machine learning.

In the survey, Spark is the most widely used big data platform component, with a usage rate of 44%. MapReduce is used by only 21%. HDFS, the distributed file system and one of the core components, reaches 39%. Because most enterprise big data platforms are applied to statistical analysis, report generation and data visualization, 38% of companies use the ELK (ElasticSearch + Logstash + Kibana) real-time log analysis platform.

Among Spark components, SparkSQL is fast and fully compatible with Hive; at 56% usage it ranks first among Spark components. Spark Streaming and SparkR are used by 27% and 24% respectively.

Kafka and Redis are the most commonly used components for message queuing and data capture

Message queue middleware is an important component of distributed systems, mainly addressing application decoupling, asynchronous processing, traffic peak shaving and message communication. Kafka is the most widely used, at 42%, followed by Redis at 38% and ActiveMQ at 28%.
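To make the decoupling and peak-shaving idea concrete, here is a minimal sketch that uses Python's standard-library `queue.Queue` as an in-process stand-in for a broker. It is not the Kafka, Redis or ActiveMQ API; the function name `run_demo`, its parameters and the `order_id` message field are all hypothetical, chosen only for illustration.

```python
import queue
import threading
import time


def run_demo(burst_size=20, queue_capacity=50, work_seconds=0.001):
    """Send a burst of messages through a buffer and return what the consumer handled.

    The queue is an in-process stand-in for a message broker (an assumption
    for illustration; real brokers are separate networked services).
    """
    buffer = queue.Queue(maxsize=queue_capacity)
    handled = []
    stop = object()  # sentinel that tells the consumer to exit

    def consumer():
        # The consumer drains messages at its own pace (asynchronous processing).
        while True:
            msg = buffer.get()
            if msg is stop:
                break
            time.sleep(work_seconds)  # simulate slow downstream work
            handled.append(msg)

    worker = threading.Thread(target=consumer)
    worker.start()

    # The producer only talks to the queue, never to the consumer (decoupling).
    # A burst is absorbed by the buffer instead of hitting the consumer at once
    # (peak shaving), as long as the burst fits within queue_capacity.
    start = time.perf_counter()
    for i in range(burst_size):
        buffer.put({"order_id": i})
    enqueue_seconds = time.perf_counter() - start

    buffer.put(stop)
    worker.join()
    return handled, enqueue_seconds


if __name__ == "__main__":
    handled, enqueue_seconds = run_demo()
    print(f"handled {len(handled)} messages; producer spent {enqueue_seconds:.4f}s enqueueing")
```

With a single consumer the messages are handled in the order they were enqueued. If `queue_capacity` were smaller than the burst, `put` would block until the consumer catches up, which is one simple form of back-pressure.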

Application developers in the era of full cloud adoption

In the digital world, technologies are all one family. As noted at the beginning of the article, the "power" for exploring the value of data and conquering the ocean of data is cloud computing. On the cloud side, the report shows that 34% of developers use containers in development, while 33% do not use container technology. For developers doing software development in a cloud-based or browser IDE, the three most valued elements are fast startup, ease of operation and parity with a desktop IDE.

When using domestically developed new-infrastructure platforms (such as domestic AI chips or ARM server chips), if commonly used software stacks, open-source components, base libraries or acceleration libraries lack the corresponding adaptation, 28% of developers would choose to work jointly with the vendor on the adaptation. In addition, when comparing ARM-architecture CPUs with x86-series CPUs, among the core reasons respondents gave for choosing ARM, besides price and compatibility, 13% of developers think the ARM architecture has a multi-core advantage.

On this point, the Kunpeng processor released by Huawei features high performance, high throughput and high integration, and it builds on the ARM ecosystem with in-depth optimization for advantageous scenarios such as big data, distributed storage, databases, native applications and cloud services. In the big data scenario, Kunpeng's many cores closely match the need for high-concurrency processing of massive data, which can improve performance by 30% while saving space and power.

How to catch the big data "tailwind"?

To sum up, as time goes on, big data has gradually extended from a concept into scientific and commercial fields, and with the trend toward digitizing all kinds of information it is no longer a single discipline. On this, University of Melbourne lecturer Gong Mingming commented: "The current boom in big data is encouraging. For enterprises to truly benefit from data rather than blindly follow the trend, they first need to build a good big data team. As the saying goes, what matters is quality, not numbers. A good big data team needs people who combine a keen sensitivity with some understanding of technology and product development, and it also needs people with a very solid theoretical foundation who can abstract practical problems into models and design algorithms. Only with this two-pronged approach, digging deep into both product and technology, can the big data industry truly prosper."

For a comprehensive picture of China's developers, scan the QR code below or click "Read the original" to get the full report.

Copyright: the "2019-2020 China Developer Report" is copyrighted by CSDN. Any reprint, excerpt or other use of the text or views in this report must credit the source.

Origin blog.csdn.net/csdnnews/article/details/104832413