Big data is said to offer good job prospects, so why is it so hard to find a job in data analysis?

Big data development and data analysis are two different career directions.

What does big data development do?

There are two types of big data development: writing Hadoop and Spark applications, and developing the big data processing systems themselves. A big data development engineer is mainly responsible for developing and maintaining the company's big data platform; the architecture design and product development of related tool platforms; research and development in areas such as network log analysis, real-time and stream computing, and data visualization; and subject-area modeling for the network security business.

Skills required for big data development:

Languages currently used to develop big data applications include Java, Python, Scala, and R. You need to be familiar with the principles and usage of the Hadoop, HBase, Hive, Spark, Flink, ES, Presto, Flume, and Kafka ecosystems, and to have mastered the various workflows of data development and data mining.

What does a big data analyst do?

Big data analyst is a role that is particularly valued in the era of big data. Analysts with both professional skills and industry experience are especially sought after, and companies compete to hire them. As the big data industry develops further, demand for this talent has grown, and so has the training of big data analysts.
As companies pay more and more attention to the value of data, the daily work of a big data analyst can be summarized as mining valuable information from massive data. Big data analysis involves data acquisition, data access, data preprocessing, data modeling and analysis, data visualization, and other stages.

Skills a Big Data Analyst should have:

Be familiar with Excel and skilled with its charts and functions, with VBA programming skills preferred; be familiar with the MySQL database and proficient in SQL, in particular DML (data manipulation language); understand basic data analysis methods, including descriptive analysis, regression analysis, and ANOVA; have relevant data visualization experience, with experience developing automated reports preferred.
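
To make the SQL DML skill above more concrete, here is a minimal sketch of the basic data manipulation statements. It is only an illustration under some assumptions: it uses Python's built-in sqlite3 module as a stand-in for MySQL so that it runs without a database server, and the orders table and its rows are made up for the example.

```python
import sqlite3

# In-memory SQLite database, used here as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical table, invented for illustration only.
cur.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, city TEXT, amount REAL)")

# INSERT: add rows.
cur.executemany(
    "INSERT INTO orders (id, city, amount) VALUES (?, ?, ?)",
    [
        (1, "Beijing", 120.0),
        (2, "Shanghai", 80.0),
        (3, "Beijing", 40.0),
        (4, "Shenzhen", 60.0),
    ],
)

# UPDATE: correct the amount of one order.
cur.execute("UPDATE orders SET amount = ? WHERE id = ?", (100.0, 2))

# DELETE: remove one order.
cur.execute("DELETE FROM orders WHERE id = ?", (4,))
conn.commit()

# Query the result: number of orders and total amount per city.
rows = cur.execute(
    "SELECT city, COUNT(*), SUM(amount) FROM orders GROUP BY city ORDER BY city"
).fetchall()
print(rows)

conn.close()
```

The same INSERT, UPDATE, DELETE, and SELECT statements are standard SQL, so they should carry over to MySQL with little change; only the connection code and the placeholder style would differ.
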
Data analysts, as the name implies, are people who specialize in analyzing data. The data they analyze is mainly structured data, although in recent years more and more text data has been analyzed as well.

More generally speaking, a data analyst is actually a translator, a person who translates data into conclusions that the other party can understand.

Data organized into rows and columns is structured data, and it is also the kind of data that we analyze and use the most.
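
As a small illustration of structured data, the sketch below stores a table as a list of rows and runs a descriptive analysis and a simple linear regression on two of its columns. The column names and values are hypothetical and exist only for this example; a real project would more likely use Excel, SQL, or a data analysis library.

```python
from statistics import mean, median, stdev

# Hypothetical structured data: each dict is a row, each key is a column.
rows = [
    {"user_id": 1, "age": 23, "monthly_spend": 120.0},
    {"user_id": 2, "age": 31, "monthly_spend": 200.0},
    {"user_id": 3, "age": 27, "monthly_spend": 160.0},
    {"user_id": 4, "age": 35, "monthly_spend": 240.0},
]


def column(name):
    """Return all values of one column as a list."""
    return [row[name] for row in rows]


ages = column("age")
spend = column("monthly_spend")

# Descriptive analysis of one column.
summary = {"mean": mean(spend), "median": median(spend), "stdev": stdev(spend)}

# Simple linear regression (least squares): monthly_spend = slope * age + intercept.
mean_age, mean_spend = mean(ages), mean(spend)
slope = sum((a - mean_age) * (s - mean_spend) for a, s in zip(ages, spend)) / sum(
    (a - mean_age) ** 2 for a in ages
)
intercept = mean_spend - slope * mean_age

print(summary)
print(slope, intercept)
```
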


Data analysts differ somewhat from industry to industry. Some roles are R&D positions, such as data mining engineer, machine learning engineer, and data engineer; others are business-oriented positions, such as operations analysis expert, user research engineer, and business analyst.
All of these positions involve analyzing data to solve problems, but the emphasis differs slightly across the overall workflow.

So, let's take a look at the workflow of data analysis now:
Step 1: Data analysis usually starts from a clear question. For example, Internet companies often face a question like [the app's number of daily active users rose or fell significantly last week], and the answer has to be found through data analysis.

Of course, in some fields the data gets analyzed before there is any clear question. At universities and other research institutes, for example, a supervisor may simply hand you a batch of data and ask you to dig through it and see what conclusions can be drawn. This kind of data analysis is essentially different from the data analysis usually meant in companies. I can elaborate on this later.

Step 2: Ask whether the problem can be broken down into several sub-problems. A large, complex problem is usually hard to solve with a single data analysis method; it needs to be refined into several small problems, each of which can be solved with a simple method. The breakdown also tells us which data to collect, which analysis methods to use, and what kinds of charts to make for each small problem. This step is critical in the data analysis process: it tests our research design skills and is an important basis for judging an analyst's level.

Step 3: Collect the corresponding data for each sub-problem.

Step 4: Choose an appropriate analysis method for the collected data and reach a conclusion for each sub-problem in turn.

Step 5: Combine the sub-conclusions into a complete conclusion.

Step 6: Evaluate whether the conclusion reasonably answers the original question. This step is as important as Step 2.
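
The six steps can be sketched in a few lines of Python. This is only a toy illustration: the question (a drop in daily active users), the choice to split it by acquisition channel, and all of the numbers are hypothetical.

```python
# Step 1: the question. Why did daily active users (DAU) fall last week?

# Step 2: break it down. One sub-question per channel: how much did it change?
# Step 3: collect data for each sub-question (hypothetical average DAU).
dau = {
    "organic": {"before": 50_000, "last": 49_500},
    "paid_ads": {"before": 30_000, "last": 21_000},
    "push": {"before": 20_000, "last": 19_500},
}

# Step 4: apply a simple method to each sub-question (absolute change).
changes = {channel: v["last"] - v["before"] for channel, v in dau.items()}

# Step 5: combine the sub-conclusions into one conclusion.
total_change = sum(changes.values())
shares = {channel: change / total_change for channel, change in changes.items()}
main_driver = max(shares, key=shares.get)

# Step 6: check that the sub-conclusions really account for the original change.
total_before = sum(v["before"] for v in dau.values())
total_last = sum(v["last"] for v in dau.values())
assert total_last - total_before == total_change

print(f"Total change: {total_change}; main driver: {main_driver} "
      f"({shares[main_driver]:.0%} of the drop)")
```

In this made-up case the breakdown points at one channel as the main cause, which tells the analyst where to collect more detailed data next.
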

Having covered the process of data analysis, let's return to what data analysts actually do. At present, about 70% of a data analyst's workload at Internet companies goes to collecting, organizing, and preprocessing data. This follows from the nature of data analysis: the data must be collected and cleaned before any analysis can be done, and collecting and cleaning data are the most tiring tasks in data analysis.
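
To give a feel for what this cleaning work looks like, here is a minimal sketch in plain Python. The records, field names, and cleaning rules (drop exact duplicates, drop rows whose amount cannot be parsed, normalize city names) are assumptions made up for the example; real pipelines have many more rules.

```python
# Hypothetical raw records, as they might arrive from a log or an export.
raw_records = [
    {"user_id": "1001", "city": " Beijing ", "amount": "120.5"},
    {"user_id": "1001", "city": " Beijing ", "amount": "120.5"},  # exact duplicate
    {"user_id": "1002", "city": "shanghai", "amount": ""},        # missing amount
    {"user_id": "1003", "city": "Shenzhen", "amount": "n/a"},     # unparseable amount
    {"user_id": "1004", "city": "SHANGHAI", "amount": "80"},
]


def clean(records):
    """Drop duplicates and unparseable rows, and normalize types and city names."""
    seen = set()
    cleaned = []
    for rec in records:
        key = tuple(sorted(rec.items()))
        if key in seen:  # skip exact duplicates
            continue
        seen.add(key)
        try:
            amount = float(rec["amount"])
        except ValueError:  # skip rows with a missing or invalid amount
            continue
        cleaned.append(
            {
                "user_id": int(rec["user_id"]),
                "city": rec["city"].strip().title(),
                "amount": amount,
            }
        )
    return cleaned


cleaned = clean(raw_records)
print(cleaned)
```

Only two of the five raw rows survive here, which is a fair picture of why this stage takes so much of an analyst's time.
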

The remaining 30% of the work includes designing metrics, analyzing data with tools (Excel, Tableau, SPSS, R, SAS, Python, EViews, Stata, etc.), writing reports, and attending meetings.

However, many people who are new to data analysis feel bored, disappointed, and overwhelmed when faced with the repetitive work of processing large amounts of data all day long, and some even think about changing careers before they have touched the remaining 30% of the work.

In fact, all of this is part of a data analyst's job. Only with solid work in the early stages can the later analysis be done well.

If you feel that it is hard to find a job in data analysis, you might as well try big data development.

According to recruitment data released by Boss Zhipin, big data ranked second in demand growth this spring.

Liepin released the five fields with the fastest year-on-year growth in new jobs since 2019. The top five are: artificial intelligence, manufacturing, big data, medical care, and energy and environmental protection.

The "2020 White Paper on the Development of China's Big Data Industry" shows that in 2019, the scale of China's big data industry reached 539.7 billion yuan, a year-on-year increase of 23.1%, and then grew steadily. It is expected to exceed one trillion yuan by 2022.

According to statistics from LinkedIn, CCID Think Tank, Lagou.com, and other institutions, the overall shortage of data talent in the big data era keeps widening. Over the past three years, the gap has grown by 500,000 people per year. It is estimated that in 2022, once college graduates majoring in big data enter the job market in large numbers, the growth of the overall gap will slow, but the gap will still exist for a long time.

Jobs are available, but applicants often run into problems because of their academic qualifications or work experience. So what is the actual situation of developers currently working in big data? Let's look at the following points:

1. Academic level

In terms of education, China's big data talent falls into four categories: master's degree and above, bachelor's degree, junior college, and below junior college. Bachelor's degree holders are the largest group, accounting for as much as 65.45%, followed by those with a master's degree and above; those with a junior college degree or below make up only a small share. As an emerging industry, big data evidently has relatively high educational requirements.

2. Professional source

In terms of academic background, China's big data talent comes mainly from four categories of majors: mathematics and science, economics and management, computer science, and other majors. Computer science accounts for the highest proportion, followed by mathematics and science.

3. Channel source

The channel sources of big data talents are divided into four categories, namely school recruitment, social recruitment, internal training and recommendation, and training institution recruitment. See the figure below for the number and proportion of the sources of big data talents in enterprises.

Among them, social recruitment accounts for the largest share, more than school recruitment, internal training and promotion, and training institution recruitment combined. This heavy reliance on social recruitment suggests that school education is out of step with what employers need, and that internal and institutional training cannot yet meet job requirements.

4. Salary level distribution

At present, the salary of big data talents is at a relatively high level. Salaries below 10,000 yuan accounted for 34.6% of the total; 10,000 to 20,000 yuan accounted for 35.64%; and above 20,000 yuan accounted for 29.77%.

5. Type and number of posts

At present, the big data positions provided by enterprises can be divided into the following categories according to the job content requirements:

① Entry-level analysis category, including business data analysts, etc.

② Mining algorithms, including data mining engineers, machine learning engineers, deep learning engineers, algorithm engineers, AI engineers, data scientists, etc.

③ Development and maintenance, including big data development engineers, big data architecture engineers, big data operation and maintenance engineers, data visualization engineers, data acquisition engineers, database administrators, etc.

④ Product operation category, including data operation manager, data product manager, data project manager, big data sales, etc. The number and proportion of the four types of posts are shown in the figure below.

Demand for big data talent is increasing, and the state is also creating related positions, whose number has grown year by year since 2018.

Students applying to university, and their parents, are also very interested in big data and artificial intelligence. Big data has ranked in the top 5 for three consecutive years, and a bachelor's degree is all that is required.

For the foreseeable future, this is truly a sunrise industry, and the talent gap is already large.

With high salaries and a large talent gap, it is naturally a top choice for working professionals!

Any learning process needs a sound, well-planned route so that we can reach our learning goals in an orderly way. The material required to learn Python + big data is complex and difficult, so we have compiled a comprehensive Python + big data learning roadmap to help you clarify your thinking and overcome the difficulties!

Detailed introduction to Python+big data learning roadmap

Phase 1: Getting started with big data development

Pre-study guide: Start with traditional relational databases, master data migration tools, BI data visualization tools, and SQL, and lay a solid foundation for subsequent learning.

1. Big data development foundations: MySQL 8.0 from beginner to proficient

MySQL is a foundational course for all of IT, and SQL stays with you throughout an IT career. As the saying goes, if you write SQL well, finding a job is easy. This course covers MySQL 8.0 from zero to an advanced level; after completing it, you will have the SQL skills required for basic development.

2022 latest MySQL knowledge intensive lecture + mysql practical case _ a complete set of tutorials from zero-based mysql database entry to advanced

Phase 2: Core foundations of big data

Pre-study guide: learn Linux, Hadoop, Hive, and master the basic technology of big data.

2022 Big Data Hadoop Introductory Tutorial
Offline processing with Hadoop is the core and cornerstone of the big data ecosystem, the entry point to big data development as a whole, and the course that lays a solid foundation for Spark and Flink later on. After mastering the three parts of the course (Linux, Hadoop, and Hive), you can independently build visual reports for offline data analysis on top of a data warehouse.

2022 latest big data Hadoop introductory video tutorial, the most suitable big data Hadoop tutorial for zero-based self-study

Phase 3: Data warehouse technology at the hundred-billion-record scale

Pre-study guide: The course at this stage is driven by real projects, learning offline data warehouse technology.

Data offline data warehouse, enterprise-level online education project practice (complete process of Hive data warehouse project)
This course builds a group-wide data warehouse that unifies the group's data center and centralizes the storage and processing of scattered business data. It covers the complete project process, from requirements research and design through version control, development, testing, and launch. It mines and analyzes massive user behavior data, builds customized multi-dimensional data sets, and forms data marts for use in different scenarios and subject areas.

Big Data Project Practical Tutorial_Big Data Enterprise Offline Data Warehouse, Online Education Project Practical (Complete Process of Hive Data Warehouse Project)

Phase 4: PB-scale in-memory computing

Pre-study guide: Spark's official homepage now lists Python as its first language, and the 3.2 release highlighted its built-in pandas support. This phase covers Spark-related content.

1. Python from beginner to mastery (19 days)

A basic Python course that starts with setting up the environment, moves on to conditional statements and the basic data types, then covers functions and file operations, introduces object-oriented programming, and finally uses a case study to lead students into the world of Python programming.

A full set of Python tutorials_Python basics video tutorials, essential tutorials for self-study Python for zero-basic beginners

2. Advanced Python programming: from zero to building a website

After completing this course, you will master advanced Python syntax, multi-tasking programming, and network programming.

Python Advanced Grammar Advanced Tutorial_Python multitasking and network programming, a complete set of tutorials for building a website from scratch

3. Spark 3.2 from basics to proficiency

Spark is the star product of the big data ecosystem. It is a high-performance, distributed, in-memory iterative computing framework that can handle massive amounts of data. This course teaches Spark 3.2 using Python. It focuses on combining theory with practice and is efficient, fast-paced, and easy to follow, so that beginners can pick it up quickly and experienced engineers can also gain something.

Spark full set of video tutorials, big data spark3.2 from basic to proficient, the first set of spark tutorials based on Python language in the whole network

4. Big data Hive+Spark offline data warehouse: hands-on industrial project

Using a big data technology architecture, this project addresses data storage, analysis, visualization, and personalized recommendation for manufacturing in the industrial Internet of Things. The one-stop manufacturing project mainly uses layered Hive data warehouse storage for the data behind the various business metrics and Spark SQL for data analysis. The core business involves operators, call centers, work orders, gas stations, and warehousing materials.

For the first time, the entire network disclosed the actual combat of big data Spark offline data warehouse industrial projects, and Hive+Spark built an enterprise-level big data platform


Origin blog.csdn.net/weixin_51689029/article/details/128037790#comments_25907212