Introduction to big data

1. Big data

Big data refers to collections of data that cannot be captured, managed, and processed by conventional software tools within an acceptable time frame. It is a massive, fast-growing, and diverse information asset that requires new processing models in order to deliver stronger decision-making, insight, and process-optimization capabilities.

In "The Age of Big Data" [2] by Viktor Mayer-Schönberger and Kenneth Cukier, big data refers to analyzing and processing all of the data rather than taking the shortcut of random analysis methods (sample surveys). <span style="color: #ff0000;">The 5V characteristics of big data (proposed by IBM): Volume (large amount), Velocity (high speed), Variety (diversity), Value (value), Veracity (authenticity). In short: the volume is large and diverse, the value is real, and the speed is fast.</span>
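The contrast between sampling and using all of the data can be illustrated with a small sketch. It assumes synthetic data and a hypothetical helper, `full_and_sampled_mean`; neither comes from the book. It only shows that a sample gives an estimate, while processing every record gives the exact figure.

```python
import random
import statistics


def full_and_sampled_mean(values, sample_size, seed=0):
    """Return (mean over all values, mean over a random sample of them)."""
    rng = random.Random(seed)
    sample = rng.sample(values, sample_size)
    return statistics.fmean(values), statistics.fmean(sample)


# Synthetic, hypothetical data: 100,000 order amounts between 1 and 500.
data_rng = random.Random(42)
orders = [data_rng.uniform(1.0, 500.0) for _ in range(100_000)]

full_mean, sample_mean = full_and_sampled_mean(orders, sample_size=1_000)
print(f"all data: {full_mean:.2f}, sample of 1,000: {sample_mean:.2f}")
```

The two numbers are usually close but not identical. The difference is the sampling error that analyzing the full data set avoids, at the cost of processing every record.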


 
 

2. Big data features

Volume: The size of the data determines its value and the information it potentially contains.

Variety: The diversity of data types.

Velocity: The speed at which data is obtained.

Variability: Changeability in the data hinders processing and effective management.

Veracity: The quality of the data.

Complexity: The data is huge in volume and comes from multiple sources.

Value: Using big data sensibly creates high value at low cost.

 

 

3. Big data trends

<span style="color: #ff0000;">Trend 1: Resourceization of data</span>

Resourceization means that big data has become an important strategic resource for enterprises and society, and a new focus of competition. Enterprises must therefore formulate big data marketing strategies in advance to seize market opportunities.

Trend 2: Deep integration with cloud computing

Big data is inseparable from cloud computing. Cloud computing provides elastic, scalable infrastructure for big data and is one of the platforms on which big data is generated. Since 2013, big data technology has been closely integrated with cloud computing technology, and the two are expected to grow closer still. In addition, emerging forms of computing such as the Internet of Things and the mobile Internet will also drive the big data revolution and give big data marketing greater influence.

Trend 3: Breakthroughs in scientific theories

Like computers and the Internet before it, big data may well usher in a new technological revolution as it develops rapidly. The accompanying rise of related technologies such as data mining, machine learning, and artificial intelligence may change many algorithms and basic theories in the data world and lead to breakthroughs in science and technology.

Trend 4: The establishment of data science and data alliances

In the future, data science will become a specialized discipline recognized by more and more people. Colleges and universities will establish dedicated data science majors, which will in turn create a range of new jobs. At the same time, cross-domain data sharing platforms will be built on top of basic data platforms. Data sharing will then extend to the enterprise level and become a core part of future industry.

Trend 5: Data breaches are rampant

The growth rate of data breaches may reach 100% over the next few years unless data is secured at its source. It is fair to say that every Fortune 500 company will face data attacks in the future, whether or not it is well secured. All businesses, large and small, need to revisit today's definition of security. More than 50% of Fortune 500 companies will have a chief information security officer (CISO). Businesses need to look at protecting their own and their customers' data from a new perspective: all data needs to be secured from the moment it is created, not only at the final storage stage. Merely strengthening security at the storage stage has proven ineffective.

Trend 6: Data management becomes the core competitiveness

Data management has become a core competency that directly affects financial performance. Once the idea that "data assets are the core assets of enterprises" takes hold, enterprises define data management more clearly, treat it as a core competency, and make the sustainable development, strategic planning, and use of data assets the core of enterprise data management. The efficiency of data asset management is significantly and positively correlated with the growth rates of main business revenue and sales revenue. In addition, for enterprises with Internet thinking, data assets account for 36.8% of competitiveness, so how well data assets are managed will directly affect the financial performance of the business.

Trend 7: Data quality is the key to BI (business intelligence) success

Companies that adopt self-service business intelligence tools for big data processing will stand out. One challenge is that many data sources bring in a lot of low-quality data. To succeed, businesses need to understand the gap between raw data and data analytics so that they can eliminate low-quality data and make better decisions through BI.
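As a minimal sketch of what eliminating low-quality data before BI analysis could look like, the following assumes records arrive as dictionaries and treats missing values and duplicates as the only quality problems. The function name `split_by_quality` and the field names are hypothetical and are not taken from any particular BI tool.

```python
def split_by_quality(records, required_fields):
    """Split records into (clean, rejected).

    A record is rejected if any required field is missing, None, or an
    empty string, or if it repeats the required-field values of an
    earlier clean record.
    """
    clean, rejected, seen = [], [], set()
    for record in records:
        values = tuple(record.get(field) for field in required_fields)
        if any(v is None or v == "" for v in values) or values in seen:
            rejected.append(record)
        else:
            seen.add(values)
            clean.append(record)
    return clean, rejected


# Hypothetical raw input with one incomplete and one duplicate record.
raw = [
    {"customer_id": 1, "amount": 20.0},
    {"customer_id": 2, "amount": None},  # missing value
    {"customer_id": 1, "amount": 20.0},  # duplicate of the first record
    {"customer_id": 3, "amount": 7.5},
]
clean, rejected = split_by_quality(raw, ["customer_id", "amount"])
print(len(clean), "clean,", len(rejected), "rejected")
```

Keeping the rejected records rather than silently dropping them makes it possible to measure how much low-quality data each source contributes.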

Trend 8: The data ecosystem becomes more complex

The world of big data is not a single, huge computer network but an ecosystem made up of many active components and participants: terminal equipment providers, infrastructure providers, network service providers, network access service providers, data service enablers, data service providers, touchpoint services, data service retailers, and others. The basic shape of this data ecosystem has already formed. Its further development will tend toward finer subdivision of roles within the system (market segmentation), adjustment of system mechanisms (business model innovation), and adjustment of system structure (changes in the competitive environment), so the ecosystem will become progressively more complex.

 

 

 

 

4. Typical Cases of Big Data

1. Macy's real-time pricing mechanism. Based on demand and inventory, the company's SAS-based system makes real-time price adjustments for up to 73 million items.

2. Tipp24 AG's betting and forecasting platform for the European gaming industry. The company uses KXEN software to analyze billions of transactions and customer characteristics, and then uses predictive models to run dynamic marketing campaigns targeted at specific users. This reduced the time needed to build predictive models by 90%. SAP is trying to acquire KXEN.

3. Walmart search. The retail giant designed its newest search engine, Polaris, for its website Walmart.com, using semantic data for text analysis, machine learning, and synonym mining, among other things. According to Walmart, semantic search has increased online purchase completion rates by 10 to 15 percent. "For Walmart, that means billions of dollars," Laney said.

4. Video analysis in the fast food industry. One company uses video to analyze the length of the waiting queue and then automatically changes what is displayed on the electronic menu. If the queue is long, it shows foods that can be served quickly; if the queue is short, it shows items that are more profitable but take longer to prepare.

5. Morton's steakhouse brand recognition. When a customer jokingly tweeted a request for the Chicago-based steakhouse chain to deliver an order to Newark Airport near New York (where he would be arriving after a day's work), Morton's put on a social media show of its own. First, it analyzed the Twitter data and found that the customer was both a frequent visitor and a frequent Twitter user. Based on the customer's previous orders, it worked out which flight he was on, and then sent a waiter in a tuxedo to serve him dinner.

6. PredPol Inc. Working with police in Los Angeles and Santa Cruz and a team of researchers, PredPol uses a variant of an earthquake prediction algorithm together with crime data to predict the likelihood of crime to within 500 square feet. In the Los Angeles districts where the algorithm is used, burglaries and violent crimes dropped by 33% and 21% respectively.

7. Tesco PLC and operational efficiency. The supermarket chain has collected data on 7 million refrigerators in its data warehouse. Analyzing this data enables more comprehensive monitoring and proactive maintenance, reducing overall energy consumption.

8. American Express (AmEx) and business intelligence. In the past, AmEx could only produce after-the-fact reports and lagging forecasts. "Traditional BI can no longer meet the needs of business development," Laney believes. As a result, AmEx began building models that can genuinely predict loyalty, using 115 variables to analyze historical transaction data and make predictions. The company said it had been able to identify 24 percent of the customers in Australia who would churn within the next four months.

9. Los Angeles traffic. Anyone who has driven in Los Angeles has experienced its nightmarish traffic jams. The government has set up toll express lanes on I-10 and I-110 and uses big data to manage how drivers use them and keep traffic flowing. Xerox is involved in this project; its anti-congestion approach includes ExpressLanes and dynamic pricing that rises with demand to keep the lanes orderly. Natesh Manikoth, chief technology officer at Xerox, said that a driver who pays to use the HOT (high-occupancy toll) lanes must be able to maintain a speed of around 45 miles per hour. If traffic starts to become congested, the price for private cars rises to discourage them from entering, leaving the lanes to high-occupancy vehicles such as buses and coaches.

Xerox has another program in Los Angeles called ExpressPark, whose goal is to let people know, before they leave the house, where to find parking and how much it will cost. This requires not only setting prices but also getting data to users in real time. For example, a user should be told where parking is available 40 minutes in advance.
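The congestion-based toll adjustment described for the express lanes can be sketched as a simple feedback rule. This is an illustration under stated assumptions, not Xerox's actual algorithm: only the 45 mph target comes from the text, while the function name `adjust_toll`, the step size, and the price bounds are hypothetical.

```python
def adjust_toll(current_toll, avg_speed_mph, target_speed_mph=45.0,
                step=0.25, min_toll=0.25, max_toll=15.0):
    """Return the next toll for private cars.

    Raise the toll by `step` when lane traffic is slower than the target
    speed, lower it by `step` otherwise, and clamp the result to
    [min_toll, max_toll]. All defaults except the 45 mph target are
    hypothetical values chosen for illustration.
    """
    if avg_speed_mph < target_speed_mph:
        new_toll = current_toll + step
    else:
        new_toll = current_toll - step
    return max(min_toll, min(max_toll, new_toll))


# Hypothetical sequence of measured average lane speeds.
toll = 1.00
for speed in [55, 50, 44, 40, 38, 47]:
    toll = adjust_toll(toll, speed)
    print(f"speed {speed} mph -> toll ${toll:.2f}")
```

A real system would presumably also take demand forecasts and vehicle occupancy into account; the sketch only captures the idea that the price rises as the lane slows down.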

</div>

 
