An In-Depth Look at the Relationship Between the Internet of Things and Big Data Analytics, and Their Applications

Recently many people have talked with me about the Internet of Things (IoT) and big data, but most of them are still unclear about how to classify and understand these two technologies. Here I draw on some of our own cases to elaborate on the two concepts.

The IoT is a complete concept. It covers not only the collection, transmission, storage, and display of data from remote sensors, but also the analysis of the historical data those sensors collect, the decisions made from the analysis results, and the feedback control actions that follow. Compared with traditional human cognition, the IoT in effect extends people's senses, letting them obtain a great deal of information they could not previously acquire directly. Data analysis built on the IoT in turn extends people's "brain": it frees them from the limits of conventional thinking and gives them awareness and judgment that span more dimensions and are more comprehensive and closer to real time.

Big data, in the usual sense, refers to data whose volume is massive. Because earlier storage and computing power were limited, a series of new technologies, including Hadoop and Spark, has gradually developed in recent years for processing vast amounts of data efficiently (mainly batch data). Combined with data-mining techniques that were originally applied to small data sets, these technologies made it possible to analyze the data of many business systems, for example to classify and label different user groups, build user profiles, and do precision marketing. As real-time requirements have risen, stream computing and stream analysis have also received much more attention in recent years; they must continuously process and analyze time-stamped data, such as IoT data or log data.

As you can see, these two seemingly unrelated technologies are closely linked through data: one produces the data, and the other processes and analyzes it.

Distinguishing IoT, IIoT, and industrial big data

Tracing things back to their source

Before further elaboration, I need to help you distinguish between several concepts.

First, distinguish traditional IoT from industrial IoT. Traditional IoT is aimed at consumers, smart cities, and similar fields: a wide range of sensors is added to collect and transmit real-time data from many scattered points, and on that basis real-time monitoring, display, alarm, and historical-data query capabilities are built. Industrial IoT mainly acquires data through the control systems of existing industrial equipment (additional sensors are rarely needed) and, on top of monitoring and alarms, analyzes the data in depth to find ways to improve equipment reliability, reduce abnormal events, and raise output and operational efficiency.

Data analysis in traditional IoT is not very different from the streaming data analysis we see on the Internet: each indicator is processed on its own to compute values such as time-window averages and extrema, and the computed values are then displayed.
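To make the single-indicator, time-window style of analysis concrete, here is a minimal sketch in Python. The function name `window_stats`, the 30-second window, and the temperature readings are all illustrative assumptions, not taken from any particular platform.

```python
from collections import deque
from statistics import mean


def window_stats(readings, window_seconds):
    """Yield (timestamp, average, minimum, maximum) over a trailing time window.

    `readings` is an iterable of (timestamp_seconds, value) pairs for ONE
    indicator, already sorted by time.
    """
    window = deque()
    for ts, value in readings:
        window.append((ts, value))
        # Drop samples that have fallen out of the trailing window.
        while window and window[0][0] <= ts - window_seconds:
            window.popleft()
        values = [v for _, v in window]
        yield ts, mean(values), min(values), max(values)


# Hypothetical temperature readings, one every 10 seconds.
temperature = [(0, 20.0), (10, 22.0), (20, 21.0), (30, 25.0), (40, 24.0)]
stats = list(window_stats(temperature, window_seconds=30))
for row in stats:
    print(row)
```

Each indicator is handled independently here, which is exactly why this style of analysis stays simple: no cross-indicator state has to be kept.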

Next, distinguish industrial IoT from industrial big data. Many foreign vendors and media have no concept of industrial big data; it is mostly folded into the industrial IoT (IIoT) category. Domestically, by contrast, the two are treated as different categories, yet at the same time both are merged, together with systems such as production and supply chain, into the concept of the Industrial Internet. So we can see that foreign analysts, Gartner included, have no specific category for industrial big data or the Industrial Internet, but do have very detailed analysis of IIoT.

Traditional industry is not without data processing. But data acquisition, data processing, data analysis, and feedback have been scattered across different systems, which on the one hand cannot cope with the flood of industrial "big" data and on the other cannot guarantee real-time performance. In industrial enterprises we often see analysts forced to export data files by hand from different systems, label and cross-correlate them manually, and write Matlab programs for statistical analysis and modeling. They then extract some field data for validation, and enterprises in good financial shape also commission external partners to develop the results into applications. This way of processing and analyzing data is very inefficient, yet it is widespread.

Industrial IoT and big data analytics

Major differences

Industrial IoT data analysis (industrial IoT plus industrial big data) differs from traditional Internet big data analysis in many ways.

Different data attributes

1. Huge data volume

The "volume" of industrial data needs to be considered from several aspects:

Many data dimensions: In conventional IoT, sensors are typically relatively independent and each sensor usually has only a single-digit number of data points, so there are very few data dimensions.

In industrial IoT, complex production processes are interrelated, and each process integrates data from many dimensions.

The data dimensions mentioned here cover the many factors in the production process: equipment characteristics, external conditions, parameters, materials, process recipes, and so on. The number of such dimensions is often in the tens of thousands, and in many high-end automated production processes (such as semiconductors) it reaches the millions. A change in any single variable of any one process may have a butterfly effect on the final production result.

Diverse sampling frequencies: In conventional IoT, the data collection interval is usually at the second or minute level and is relatively fixed.

In industrial equipment, sampling frequencies span a very wide range, and different indicators of one device can differ by a factor of thousands. For example, the vibration acceleration sensor indicators commonly used in fault diagnosis today often require a sampling frequency above 10 kHz, while some state changes are sampled only once every few seconds or even tens of seconds.
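One common way to put such different rates on a shared time base is to reduce the high-rate signal to one summary value per slow interval. The sketch below does this with a root-mean-square (RMS) value per second; the choice of RMS, the function name `rms_per_interval`, and the synthetic 50 Hz sine are assumptions made for illustration only.

```python
import math


def rms_per_interval(samples, sample_rate_hz, interval_seconds):
    """Collapse a high-rate signal into one RMS value per interval.

    Trailing samples that do not fill a whole interval are ignored.
    """
    per_interval = int(sample_rate_hz * interval_seconds)
    result = []
    for start in range(0, len(samples) - per_interval + 1, per_interval):
        chunk = samples[start:start + per_interval]
        result.append(math.sqrt(sum(x * x for x in chunk) / per_interval))
    return result


# Synthetic vibration signal: a unit-amplitude 50 Hz sine sampled at 10 kHz
# for two seconds (purely illustrative data).
RATE_HZ = 10_000
vibration = [math.sin(2 * math.pi * 50 * n / RATE_HZ) for n in range(2 * RATE_HZ)]
rms = rms_per_interval(vibration, RATE_HZ, interval_seconds=1.0)
print(rms)  # two values, each close to 1/sqrt(2)
```

After this reduction, the 20,000 vibration samples line up with two once-per-second state samples, at the cost of discarding the waveform detail that frequency-domain diagnosis would need.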

Long data time span: Retaining data over the long term is very helpful for accumulating and judging the features of different states.

Traditional IoT has no very clear requirement for long-term data retention and little need for state; its analysis is largely stateless.

In industrial IoT, however, the demand for stateful analysis of data is very strong.

First, in traditional industrial sectors, device states, control thresholds, and key parameter settings are often set and adjusted from the experience of the manufacturer or the operating staff. Whether these values are correct needs to be verified against long-term data.

Second, indicators such as current, power, and torque of industrial equipment often show clearly different features under different operating modes, operating conditions, and fault conditions. If these features can be saved and used to train feature-recognition models through machine learning, they help achieve precise state determination, anomaly detection, and fault diagnosis. Accumulating more samples with the same label further improves recognition accuracy. For critical, high-reliability equipment in particular, where the cost of failure is high, the features of abnormal or faulty states need to be saved so that combinations of features between components, between subsystems, and between devices can be analyzed to improve reliability further.

2. Real-time requirements

We usually assume that industrial data is strongly real-time, but this often refers to real-time industrial control rather than real-time industrial data analysis.

In traditional industrial data analysis, a segment of data is often taken from the control system or a software system and stored as files. An analyst then writes code (for example, in Matlab) to build a model, tests and verifies it in a laboratory environment, and afterwards develops the corresponding control logic or application, which evaluates the model against data received in real time and continuously adjusts the model parameters during operation. This process is very painful, not only because the data source and the analysis are disconnected, but also because the real-time data verification needed during model development simply cannot be done in the existing environment.

Ideal industrial data analysis should be an efficient, real-time process. It should be able to intercept valid data samples from a real-time industrial data stream, develop specific algorithms and models using different development languages and modeling frameworks, verify them against data collected in real time, and then combine the verification results with the actual situation to make real-time decisions. Only in this way can intelligent analysis and control be created for specific scenarios.

3. Poor data quality

Poor data quality is typical of industrial data.

Because of the specialized nature of industry, large-scale equipment is often an integration of subsystems from many different manufacturers. The OEMs that do the integration often do not understand how each subsystem works and have not formed a complete, cross-subsystem control logic and data integration mechanism. They can only pick a number of key control signals to implement the specified control logic, without concern for how each subsystem works, including the many non-control indicators that would help in analyzing reliability, efficiency, and even quality.

First, although industrial equipment manufacturers all claim that a wide variety of indicators is available, they can often guarantee the integrity of only the key control indicators, not the accuracy and reliability of key subsystem indicators. Second, because there is no good data integration mechanism, the working states of the different subsystems often cannot be identified, which creates enormous obstacles for later analysis under different working conditions. Third, data integrated from different subsystems always suffers from common errors such as inconsistent timestamps, wrong data ranges, and wrong data labels, and when a problem occurs the OEM may not even be able to explain what a subsystem indicator means. In addition, harsh on-site conditions often cause sensor data to fail, become inaccurate, or stay stuck in one state for long periods. These data quality problems are a huge obstacle to later analysis, and a great deal of cleaning and processing is needed before analysis can begin.
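A rough sketch of what such pre-analysis cleaning can look like is shown below. It covers three of the problems named above: inconsistent timestamps (normalized to UTC), out-of-range values, and a sensor stuck on one value. The function name `clean`, the range limits, the flat-run threshold, and the sample records are hypothetical; real rules would come from the equipment's own specifications.

```python
from datetime import datetime, timezone


def clean(records, low, high, max_flat_run):
    """Split one sensor's records into kept readings and rejected ones.

    records: list of (iso_timestamp_with_offset, value), in arrival order.
    Returns (kept, rejected); each rejected entry carries a reason string.
    """
    kept, rejected = [], []
    run_value, run_length = None, 0
    for raw_ts, value in records:
        # Normalize to UTC so data from different subsystems shares a time base.
        ts = datetime.fromisoformat(raw_ts).astimezone(timezone.utc)
        # Track how many consecutive readings carry exactly the same value.
        if value == run_value:
            run_length += 1
        else:
            run_value, run_length = value, 1
        if not low <= value <= high:
            rejected.append((ts, value, "out_of_range"))
        elif run_length > max_flat_run:
            rejected.append((ts, value, "flatline"))
        else:
            kept.append((ts, value))
    return kept, rejected


# Hypothetical readings; the first one is reported in a UTC+8 local time.
records = [
    ("2019-09-05T08:00:00+08:00", 41.2),
    ("2019-09-05T00:00:01+00:00", 41.5),
    ("2019-09-05T00:00:02+00:00", 999.0),
    ("2019-09-05T00:00:03+00:00", 41.7),
    ("2019-09-05T00:00:04+00:00", 41.7),
    ("2019-09-05T00:00:05+00:00", 41.7),
    ("2019-09-05T00:00:06+00:00", 41.7),
]
kept, rejected = clean(records, low=0.0, high=150.0, max_flat_run=3)
print(len(kept), [reason for _, _, reason in rejected])
```

The sketch assumes every timestamp carries an explicit UTC offset. Timestamps without one would be interpreted in the local time zone of the machine running the code, which is itself a typical source of the inconsistencies described above.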

Different methods of data analysis

At the mention of big data analysis, many people naturally think of clustering, classifying, and mining massive data to achieve precision marketing and user profiling.

However, data from Internet or business systems comes with some significant assumptions: the data volume is large, the data can be clearly labeled, the scenarios are standardized, and the accuracy demanded of the analysis is relatively low. Through classification and mining, common features can be found across samples, and the results trained on individuals with similar attributes can be used to infer the characteristics of other individuals with the same or similar attributes.

1. Challenges of industrial data analysis

In industrial data analysis, however, these basic assumptions do not hold, and the analysis is more challenging:

Small samples: Anomalies in industry are often very few, or their probability of occurring on a single device is very low. As a result, conventional big data and machine learning approaches that rely on the features of collected abnormal data cannot train a stable model.

Over-fitting: With a large number of correlated factors, a model trained by machine learning on a specific data set may appear to fit the features perfectly even after validation on a large amount of test data. In a real environment, however, the variability of the data and operating conditions often makes it hard to get stable judgments over the long term; the model turns out to be over-fitted.

Hard to label clearly: Industrial data is difficult to label accurately. Even when some features can be extracted, they are often closely tied to the operating condition or operating mode (for example, the vibration amplitude measured by a vibration sensor is completely different when the equipment runs under light load and under heavy load). If abnormal features cannot be labeled by operating condition, it is hard to filter and analyze the data efficiently.

Fragmented scenarios: Because scenarios are fragmented, it is hard to have a common model. Even for somewhat similar fault models of motors and pumps, and for general methods such as vibration analysis and SPC, it is difficult to guarantee uniform and stable operation across different types of equipment, or even across different individual units of the same type.

Because of these challenges, industrial big data analysis cannot simply copy Internet big data analysis; it must be fully integrated with the working mechanisms of the industry to achieve complex modeling and decision-making.
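The labeling challenge, where the same vibration amplitude is normal under heavy load but abnormal under light load, can be illustrated with a small sketch that keeps a separate baseline for each operating mode. The mode names, the sample amplitudes, and the simple three-sigma rule are all assumptions chosen for illustration; they stand in for whatever condition labels and detection method a real system would use.

```python
from collections import defaultdict
from statistics import mean, stdev


def fit_baselines(history):
    """Learn (mean, standard deviation) of an indicator for each operating mode."""
    by_mode = defaultdict(list)
    for mode, value in history:
        by_mode[mode].append(value)
    return {mode: (mean(vals), stdev(vals)) for mode, vals in by_mode.items()}


def is_abnormal(baselines, mode, value, z_limit=3.0):
    """Flag a reading that lies more than z_limit deviations from its mode's mean."""
    mu, sigma = baselines[mode]
    return abs(value - mu) > z_limit * sigma


# Hypothetical vibration amplitudes recorded during known-good operation.
history = (
    [("light", v) for v in (1.0, 1.1, 0.9, 1.0, 1.05, 0.95)]
    + [("heavy", v) for v in (4.0, 4.2, 3.8, 4.1, 3.9, 4.0)]
)
baselines = fit_baselines(history)

print(is_abnormal(baselines, "heavy", 4.0))  # normal for a heavy load
print(is_abnormal(baselines, "light", 4.0))  # same amplitude, abnormal for a light load
```

Without the mode label, a single global threshold would either miss the light-load anomaly or raise constant false alarms under heavy load, which is the filtering problem described above.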

2. Categories of industrial data analysis

In a general sense, industrial IoT data analysis can be divided into the following four categories:

Descriptive analysis: statistics and display of the data collected by the IoT; this part is based mainly on statistical analysis.

Diagnostic analysis: combining industrial mechanisms to diagnose the cause of an abnormality; this part requires many data mining techniques, including correlation analysis and event sequence analysis.

Predictive analysis: predicting trends from the patterns in long-term historical data; this part requires techniques such as machine learning, neural networks, and trend forecasting.

Prescriptive analysis: combining the results of multi-dimensional data analysis with a knowledge base and machine learning to offer a basis for various possible decisions and provide intelligent decision support.

Within each category, the analysis must be carried out at two levels:

Mechanism analysis: professional analysis based on the physical or chemical principles of the industrial equipment, its control, and its processes, and on their design principles; this part necessarily rests on domain expertise.

Data-driven analysis: for the many phenomena in industry that cannot be measured or explained, features can be extracted from the data and abnormal points found in vast amounts of data, using machine learning to make up for gaps in expertise.

As you can see, the foundation of industrial data analysis is the industrial mechanism, that is, an understanding of professional industry knowledge, rather than data analysis methods and capabilities. Without adequate knowledge of mechanisms and industrial expertise, blindly applying big data and artificial intelligence tools to industrial data will be counterproductive.

Entering industrial application scenarios at three levels

Playing a significant role

We all know the application scenarios of Internet big data, which mainly include precision marketing, user profiling, and decision support based on massive structured data. To which scenarios, then, can industrial IoT data analysis be applied?

We believe that industrial IoT big data analysis can play a huge role at three levels of industry.

Equipment level: Industrial enterprises can read the real-time parameters of smart sensors or of the control systems of industrial products to build visual remote monitoring. Based on the historical data collected, they can build a hierarchical health index system covering components, subsystems, and the whole device, and use artificial intelligence to forecast trends. Based on the forecast results, they can optimize maintenance strategies and spare-parts management strategies, reducing and avoiding the losses customers suffer from unplanned downtime.

For example, the predictive maintenance and fault diagnosis system of an oil drilling machinery manufacturer not only collects, in real time, data on the key indicators of critical subsystems on different rigs, such as generators, mud pumps, winches, and top drives, but can also evaluate the performance of key components from the trend of the historical data and adjust and optimize the maintenance strategy according to the predicted component performance. It can also evaluate and optimize drilling efficiency by analyzing the real-time status of the rig, effectively improving the input-output ratio of drilling.
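As a much simplified stand-in for the trend forecasting described at the equipment level, the sketch below fits a straight line to a component's health index and extrapolates when it would reach a maintenance threshold. A plain least-squares line is used here in place of the machine learning models such systems would apply; the health index values, the threshold of 70, and the function names are hypothetical.

```python
def linear_trend(points):
    """Return the least-squares (slope, intercept) for a list of (t, y) points."""
    n = len(points)
    mean_t = sum(t for t, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((t - mean_t) ** 2 for t, _ in points)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_t


def time_to_threshold(points, threshold):
    """Extrapolate the trend and return the time at which y reaches threshold.

    Returns None when the index is not declining, since it then never
    reaches a lower threshold under a linear trend.
    """
    slope, intercept = linear_trend(points)
    if slope >= 0:
        return None
    return (threshold - intercept) / slope


# Hypothetical weekly health index of one component (100 = as new).
health = [(0, 100.0), (1, 98.0), (2, 96.0), (3, 94.0), (4, 92.0)]
week = time_to_threshold(health, threshold=70.0)
print(week)
```

An estimate like this is what lets maintenance and spare-parts ordering be scheduled ahead of the predicted week rather than after an unplanned stop.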

Process level: Industrial enterprises can digitally integrate the various production factors of each stage, such as raw materials, equipment, process recipes, and process requirements, into a closely coordinated production process, and, according to established rules, automatically complete the combinations of operations required under different conditions, automating the production process. The various data are recorded at the same time during production, providing the basis for later analysis and optimization.

By collecting real-time operating data from the various production equipment on the line, all production processes can be monitored visually. Monitoring strategies for critical equipment parameters and inspection indicators are established and adjusted through experience or machine learning, abnormal situations that violate the strategy are handled promptly, and the production process is stabilized and continuously optimized.

For example, the in-line quality control system built for an electronic glass production line acquires the full data generated by the cold-end and hot-end equipment, uses machine learning to derive process specifications for the key indicators of the optimal production process and sets the corresponding SPC monitoring and alarm strategies, and, across tens of thousands of data collection points, performs correlation analysis and diagnostic analysis of specific quality anomalies.
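An SPC alarm strategy of the kind mentioned here can be sketched, in its simplest form, as control limits set at the baseline mean plus or minus three standard deviations. This is a simplified variant of a control chart, not the system described above; the thickness values and function names are made up for illustration.

```python
from statistics import mean, stdev


def control_limits(baseline, k=3.0):
    """Return (lower limit, center line, upper limit) from in-control baseline data."""
    center = mean(baseline)
    spread = k * stdev(baseline)
    return center - spread, center, center + spread


def spc_alarms(values, lcl, ucl):
    """Return the indices of values that fall outside the control limits."""
    return [i for i, v in enumerate(values) if v < lcl or v > ucl]


# Hypothetical glass thickness measurements in millimetres.
baseline_thickness = [0.700, 0.702, 0.698, 0.701, 0.699, 0.700, 0.703, 0.697]
lcl, center, ucl = control_limits(baseline_thickness)

new_measurements = [0.701, 0.699, 0.712, 0.700]
alarms = spc_alarms(new_measurements, lcl, ucl)
print(alarms)  # index of the out-of-control point
```

In a line with tens of thousands of collection points, limits like these would be maintained per indicator, which is why deriving them from data rather than setting each by hand matters.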

Business level: Industrial enterprises can combine the OT data generated at the process level with the data produced or supplied by the various IT business systems to build unified data standards. On that basis, through calculation and analysis, they can produce accurate operational-level analysis and support production safety, business efficiency, and decision-making, then gradually extend outward to provide an open data ecosystem and so become more competitive.

For example, an intelligent work-safety management and control solution provided to a provincial energy group extracts real-time data from dozens of different types of real-time production databases, combines it with business data extracted from third-party business systems, builds unified multi-dimensional data standards, and integrates IT and OT data on the basis of those standards. A range of industrial applications has been developed on top of it, including production and operation monitoring, safety management, environmental management, quality management, energy management, and business analysis.

At this point many people cannot help asking: we already had plenty of data analysis, including BI and large-screen displays with many analysis reports, so why add data analysis based on the industrial IoT?

As we all know, industrial data analysis at this stage, including the displays and reports just mentioned, is based on business systems; much of its data is manually reported, or calculated from reported data. The data of the control systems and the corresponding analysis results remain outside conventional data analysis systems. Yet to reflect the true situation of an enterprise's equipment, production, and management, the data and the analysis must come from the control systems; otherwise the picture is seriously distorted. At the same time, much OT analysis cannot produce accurate results without data from the related IT systems (for example, repair and maintenance records must be combined with production data to do quality analysis).
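The quality-analysis example in the previous paragraph, combining maintenance records from an IT system with production data from OT, amounts to attaching the most recent maintenance event to each production reading. A minimal sketch follows; the hour-based timestamps, the defect rates, and the function name are hypothetical.

```python
from bisect import bisect_right


def hours_since_maintenance(maintenance_hours, reading_hour):
    """Return hours elapsed since the latest maintenance at or before the reading.

    `maintenance_hours` must be sorted ascending. Returns None when no
    maintenance has happened yet at the time of the reading.
    """
    i = bisect_right(maintenance_hours, reading_hour)
    if i == 0:
        return None
    return reading_hour - maintenance_hours[i - 1]


# Hypothetical IT-side records: hours at which maintenance was completed.
maintenance_hours = [10, 50, 90]
# Hypothetical OT-side readings: (hour, defect rate in percent).
defect_readings = [(5, 0.8), (12, 0.1), (49, 0.6), (55, 0.2)]

joined = [
    (hour, rate, hours_since_maintenance(maintenance_hours, hour))
    for hour, rate in defect_readings
]
for row in joined:
    print(row)
```

With the two sources joined, a question such as "does the defect rate climb as time since maintenance grows?" becomes answerable, which neither system could answer on its own.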

Figure: Platform architecture for industrial IT/OT data fusion

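An efficiently run industrial enterprise, as we understand it, must integrate OT and IT data into one large platform, define strict data standards (standards for assets, processes, workflows, and organization), and, through different kinds of specialized data analysis, continuously develop new forms of applications (as shown in the figure above). Only then can it meet the enterprise's need for comprehensive, precise, and efficient operation. (In my next article I will discuss the IT/OT convergence architecture for industrial enterprises; consider this an advance notice.)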

Choosing the right industrial IoT platform

Will decide an enterprise's future

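To sum up, the IoT, whether general-purpose IoT or industrial IoT, cannot support an enterprise's future development strategy unless it is combined with specialized, fine-grained data analysis. Choosing a suitable industrial IoT platform will greatly accelerate an enterprise's digitalization and move it quickly along the road to intelligence.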


Origin blog.csdn.net/u010199413/article/details/100564913