Six things you need to know in the era of big data

Why will big data change the enterprise? Because big data is a new way of thinking. In the past, we always had to consider the sample space, granularity, selection methods, and so on; with the rise of big data, all of these problems have become things of the past, and issues that used to be very complex are now increasingly simple. However, because of big data's seemingly marvelous effects, more and more users have begun to deify it, and some misunderstandings have grown up around its use.

There is no doubt that big data is a comprehensive, systematic undertaking, and in this field Informatica is recognized as a pioneer. As a pioneer, Informatica has its own distinctive insights into big data.

As a starting point, here are six things you need to know about big data.

1. Big-data thinking has arrived, but success is still far away

Big data really only took off last year. After roughly two years of experimentation, the ideas have accumulated, but success is still far off. Some much-publicized foreign big data cases are nothing more than business intelligence (BI) or data warehouse (DW) projects given a face-lift: old wine in new bottles. Just as it took nearly 20 years of data warehouse construction for companies to recognize its true value, big data cannot be expected to succeed quickly; it needs time to settle.

The development of big data can be described as a wave: we are still at the first peak, a trough will follow, and the wave will rise again, round after round. During this period you can find many real cases, and whether they succeed or fail they offer inspiration. A failed attempt is not necessarily a complete failure. In data warehouse construction a few years ago, many reports claimed that 80% of projects failed, but careful analysis showed that most had simply fallen short of their expected value during development. Where others have walked before, those who follow can step on fewer mines.

2. Real big-data thinking: stop insisting on perfectly accurate data

Previously, because the amount of available data was small, we had to strive for the most accurate records we could get, so that leadership had reliable KPIs for reference; sampling accuracy held an important position in the process. This devotion to accuracy is a product of an era of information scarcity. In the era of big data, data collection is no longer the bottleneck: collecting the full volume of data has become a reality. But the emergence of huge amounts of data also brings messiness and inaccurate results, and if we remain obsessed with accuracy, we will not be able to respond to this new era.

Big data typically speaks in probabilities, whereas in the past data had to be cleaned before processing in order to reduce the erroneous portion. Compared with striving to avoid errors, tolerating errors can actually give us more information. Accepting messy data and imprecise results is the correct attitude for embracing big data: even results that are only roughly accurate can drive business growth of dozens of times. This is real big-data thinking, and in the future we should become accustomed to it.
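A toy illustration of why messy full-volume data can still be trusted in aggregate (all numbers here are invented for the sketch): even after corrupting 10% of the records with heavy noise, the overall estimate barely moves, because the errors wash out probabilistically.

```python
import random

random.seed(42)

# Simulate a "full" dataset of 100,000 purchase amounts around $50.
true_mean = 50.0
data = [random.gauss(true_mean, 10) for _ in range(100_000)]

# Corrupt 10% of the records to mimic the noise that comes with
# collecting everything instead of a carefully cleaned sample.
for i in random.sample(range(len(data)), 10_000):
    data[i] = data[i] + random.gauss(0, 30)  # bad sensor / entry error

# The aggregate remains close to the truth despite the mess.
estimate = sum(data) / len(data)
print(round(estimate, 2))
```

The point is not that errors never matter, but that at sufficient scale, aggregate questions can tolerate individual-record noise that would ruin a small, carefully curated sample.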

3. Big data is not a purely technical issue

Big data is not a purely technical issue; it involves a great deal of management and business content. Buying a set of data mining tools and setting up a Hadoop environment does not mean you are "doing big data." In addition to investment in equipment and technology, companies also need to transform their organizational structure, staff awareness, management, and corporate culture. There is a lot of preparatory work for big data, and it amounts to a comprehensive change in thinking. We are all crossing the river by feeling for the stones: take a step, think, then take another step and think again, until we finally land successfully.

During this process, people's thinking has to keep pace with the development of big data technology, and some ideas from the past must be corrected and changed. Of course, this will not take 20 years as data warehousing did; the time for big data may well be cut in half. Data warehousing started from scratch, but the era of big data starts from a better position: people have accumulated a great deal of experience, technology, and lessons from building data warehouses, and even the effective management methods developed there can serve as good references.

4. Big data technology solves the unstructured data problem? Not entirely

Emerging big data technology provides a very effective means for people to analyze and process unstructured data at very low cost. But unstructured data has one characteristic: its value density is still very low, far below that of structured data. Out of 100 GB of unstructured data, perhaps only 1 GB is ultimately useful. This shows that unstructured data is a great complement to the completeness of the data, but it does not mean that big data equals unstructured data; the ultimate purpose is still to uncover the value in the data. On the other hand, the traditional data warehouse can already handle 90% of conventional structured-data use cases, and it is against this backdrop that big data focuses on processing the unstructured portion.

Currently, large amounts of unstructured data are being generated: machine logs, sensor data, and social media data all exist in unstructured form, and conventional approaches lack the capability to process them. By analogy with the barrel effect, the first task is to raise this short plank; once it is flush with the efficiency and capacity of structured data processing, the focus shifts to how to use the data for deeper research. It must also be recognized that although big data technology can handle semi-structured and unstructured data, these data are ultimately converted into structured form for analysis. Unstructured input such as video may enter an algorithm, but within seconds it is turned into structured data, and the final results are displayed in tabular or otherwise structured form.
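The "unstructured in, structured out" pattern described above can be sketched with a machine-log example. The log format and field names below are invented for illustration; real pipelines would use a proper parser or schema registry.

```python
import re

# Hypothetical raw machine log -- free-form text as it might arrive
# from a server (this format is made up for the example).
raw_logs = """\
2014-06-01 12:00:03 INFO  user=alice action=login
2014-06-01 12:00:07 WARN  user=bob action=login failed
2014-06-01 12:01:45 INFO  user=alice action=purchase amount=19.99
"""

LINE = re.compile(r"^(?P<date>\S+) (?P<time>\S+) (?P<level>\w+)\s+(?P<rest>.*)$")

def to_structured(text):
    """Turn free-form log lines into structured records (dicts),
    the tabular shape that downstream analysis actually consumes."""
    records = []
    for line in text.strip().splitlines():
        m = LINE.match(line)
        if not m:
            continue  # skip lines that do not fit the schema
        rec = {"date": m["date"], "time": m["time"], "level": m["level"]}
        # key=value pairs become columns; leftover free text is dropped,
        # reflecting the low value density of the raw input.
        for k, v in re.findall(r"(\w+)=(\S+)", m["rest"]):
            rec[k] = v
        records.append(rec)
    return records

rows = to_structured(raw_logs)
print(rows[0]["user"], rows[2]["amount"])
```

Note how little of each raw line survives into the structured record: most of the parsing work is discarding, which is exactly the low-value-density point made above.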

5. The necessary prerequisites for big data applications

Faced with the clutter of big data, without good data quality and without a sound data management strategy, the return on investment in business applications will shrink even as those applications grow and spread within the enterprise. On the road to big data, 90% of companies cannot achieve flashy, fireworks-style effects; instead, they all have to get down to solving data integration, data quality, and master data management, and these are precisely Informatica's core competencies.

Today, more and more enterprises are moving from the rough data management of the past toward refinement: paying more attention to data quality and master data management, focusing on building a complete view of their data, and so on. Informatica currently offers solutions across 11 areas and more than 30 sub-domains. Its data integration platform provides the full functionality needed to turn data into trusted, actionable, and reliable information assets: integrating fragmented data anywhere, at any time; governing enterprise data in the cloud; transferring data at high speed; sharing data with partners; finding and resolving data quality problems; giving you the initiative in working with data; and creating a reliable view of the most important data assets. These capabilities combine seamlessly with operations, and by using the hardware infrastructure effectively they reduce total cost of ownership and enable more refined data management.
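To make "finding and resolving data quality problems" concrete, here is a minimal, hand-rolled sketch of the kind of rules a data quality platform automates at scale. The sample rows and the three rules (missing value, format check, duplicate key) are invented for illustration and are not Informatica's actual product behavior.

```python
import re

# Hypothetical customer records with typical quality defects.
customers = [
    {"id": 1, "name": "Alice", "email": "alice@example.com"},
    {"id": 2, "name": "",      "email": "bob@example"},       # no name, bad email
    {"id": 1, "name": "Alice", "email": "alice@example.com"}, # duplicate id
]

EMAIL = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

def profile(rows):
    """Report simple quality problems -- missing values, bad formats,
    duplicate keys -- the groundwork that must precede any analytics."""
    issues, seen = [], set()
    for row in rows:
        if not row["name"]:
            issues.append((row["id"], "missing name"))
        if not EMAIL.match(row["email"]):
            issues.append((row["id"], "invalid email"))
        if row["id"] in seen:
            issues.append((row["id"], "duplicate id"))
        seen.add(row["id"])
    return issues

for rid, problem in profile(customers):
    print(rid, problem)
```

In practice such rules number in the hundreds and run continuously against incoming data, which is why the article treats data quality as a prerequisite rather than an afterthought.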

6. In the era of big data, privacy comes first

In the past two years, governments have placed great emphasis on information security, and companies are very concerned about data security. Sensitive personal, financial, and health information is regulated by a variety of industry and government data privacy regulations; a company that cannot maintain data privacy faces serious financial and legal penalties, as well as considerable loss of customer and market confidence. Against this backdrop, data masking technology emerged, and over the last two years it has been adopted by more and more business users. In the areas of data access and use, Informatica applies dynamic or static data masking to protect privacy: the data's values are changed while its original characteristics are preserved, protecting sensitive data from unauthorized access while still allowing it to be processed.
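The idea of "changing values while preserving characteristics" can be sketched in a few lines. The masking rules below (keep the last two digits, deterministic pseudonyms) are illustrative choices, not Informatica's actual algorithms.

```python
import hashlib

def mask_phone(phone: str) -> str:
    """Statically mask a phone number: keep the separators and the
    last two digits so the value still 'looks right' to downstream
    systems, while hiding the rest."""
    digits = [c for c in phone if c.isdigit()]
    keep = set(range(len(digits) - 2, len(digits)))  # last two digits
    out, i = [], 0
    for c in phone:
        if c.isdigit():
            out.append(c if i in keep else "*")
            i += 1
        else:
            out.append(c)  # preserve formatting characters
    return "".join(out)

def pseudonymize(name: str, salt: str = "demo-salt") -> str:
    """Deterministic pseudonym: the same input always yields the same
    token, so joins across masked tables still work."""
    return "user_" + hashlib.sha256((salt + name).encode()).hexdigest()[:8]

print(mask_phone("138-1234-5678"))          # format survives masking
print(pseudonymize("alice") == pseudonymize("alice"))
```

Determinism is the key design choice here: masked data remains joinable and testable, which is what lets masked datasets stand in for real ones in development and analytics.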

In 2014, Informatica's data security solutions met strong market demand and became a fast-growing hot spot. Another area of strong demand is data archiving. After years of development, enterprises have accumulated large amounts of historical data that they want to archive; Informatica provides new computing, storage, and related technologies that can archive, retain, and analyze historical data, managing the full life cycle of data.

Informatica's IDP concept

What is most lacking in the era of big data? The ability to effectively control and analyze data — and this refers not only to high-end data analysis experts, but also to business staff who can access and use the value of enterprise data more effectively. Informatica's IDP (Intelligent Data Platform) is an intelligent data integration platform that lets business users become the true beneficiaries of data. IDP speaks directly to the business side, connecting people, places, and things in a more intelligent way, so that business users can obtain the data they want by themselves, according to their needs.

In terms of implementation, IDP is not an IT tool but a platform, encompassing self-service data access, data virtualization, and more. These techniques present the wide variety of underlying data to business end users, giving them the freedom to choose which data to use, browse, and analyze, and even to participate in operating on data they could not touch before. IDP has not yet landed as a product, but judging from the concept, it is a complete solution portfolio combining Informatica's traditional strengths, superior products, and intelligent products.


Origin blog.csdn.net/chengxvsyu/article/details/91416772