The most complete big data learning roadmap (2023 self-study edition)

With the rapid growth of the information industry and the steady rollout of big data applications, demand for skilled people in the field is growing year by year. Big data is now one of the most promising and best-paid sectors, and roles such as big data analysis engineer and big data development engineer are in short supply, which keeps pushing salaries up.

Many people want to move into big data development but do not know where to start. The following is a complete big data learning route to help you get started quickly.

The first stage

This stage covers Java SE, MySQL, and JDBC. You learn core Java language concepts such as characters, control flow, object-oriented programming, processes and threads, enumerations, and reflection; how to install, uninstall, and operate a MySQL database; how JDBC is implemented; and the basics of Linux. It is the foundation stage for big data.

The second stage

This stage introduces distributed systems theory: the CAP theorem, data distribution schemes, consistency, two-phase commit (2PC) and three-phase commit (3PC), and big data integration architecture. Topics include consistency, availability, partition tolerance, data distribution, the 2PC and 3PC workflows, hash-based partitioning, and consistent hashing.
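To make the contrast between plain hash partitioning and consistent hashing concrete, here is a minimal sketch in Python. The class name `ConsistentHashRing`, the node names, the choice of MD5, and the default of 100 virtual nodes are all assumptions made for illustration; they do not come from any particular system.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a position on the ring (MD5 is an arbitrary stable choice)."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class ConsistentHashRing:
    """Toy consistent-hash ring with virtual nodes (illustrative only)."""

    def __init__(self, nodes=(), replicas=100):
        self.replicas = replicas  # virtual nodes per physical node
        self._points = []         # sorted positions on the ring
        self._owner = {}          # position -> physical node
        for node in nodes:
            self.add_node(node)

    def add_node(self, node: str) -> None:
        for i in range(self.replicas):
            point = _hash(f"{node}#{i}")
            bisect.insort(self._points, point)
            self._owner[point] = node

    def remove_node(self, node: str) -> None:
        for i in range(self.replicas):
            point = _hash(f"{node}#{i}")
            self._points.remove(point)
            del self._owner[point]

    def get_node(self, key: str) -> str:
        """Return the owner of the first ring position clockwise from the key."""
        if not self._points:
            raise ValueError("ring is empty")
        idx = bisect.bisect_right(self._points, _hash(key)) % len(self._points)
        return self._owner[self._points[idx]]


def compare_remapping(num_keys=1000):
    """Count keys that change owner when a fourth node joins,
    for the ring and for naive `hash % node_count` placement."""
    keys = [f"key-{i}" for i in range(num_keys)]
    nodes = ["node-a", "node-b", "node-c"]

    ring = ConsistentHashRing(nodes)
    before_ring = {k: ring.get_node(k) for k in keys}
    before_mod = {k: nodes[_hash(k) % len(nodes)] for k in keys}

    ring.add_node("node-d")
    grown = nodes + ["node-d"]
    moved_ring = sum(1 for k in keys if ring.get_node(k) != before_ring[k])
    moved_mod = sum(1 for k in keys if grown[_hash(k) % len(grown)] != before_mod[k])
    return moved_ring, moved_mod
```

Calling `compare_remapping()` returns two counts: keys moved on the ring and keys moved under modulo placement. On the ring, a key only ever moves to the newly added node, which is why the first count is expected to be much smaller than the second.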

The third stage

This stage covers data storage and computing for offline scenarios: the coordination service ZooKeeper (1T), data storage with HDFS (2T) and Alluxio (1T), data collection with Flume and Logstash, data synchronization with Sqoop (0.5T), DataX (0.5T), and the MySQL binlog (1T), the MapReduce and DAG computation models (1T), Hive (5T), Impala (1T), and task scheduling with Azkaban and Airflow.
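The MapReduce computation model mentioned above can be sketched in a few lines of plain Python. This is a single-process illustration of the map, shuffle, and reduce phases using the classic word-count task; the function names are assumptions for this sketch, and a real framework would distribute each phase across machines.

```python
from collections import defaultdict
from itertools import chain


def map_phase(line):
    """Mapper: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by key, as the framework would."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reducer: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run map -> shuffle -> reduce over an iterable of text lines."""
    mapped = chain.from_iterable(map_phase(line) for line in lines)
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())
```

For example, `word_count(["big data big", "data lake"])` returns `{"big": 2, "data": 2, "lake": 1}`. Mappers and reducers share no state, so each phase can run in parallel on separate partitions of the data.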

The fourth stage

This stage covers building a data warehouse: the historical background of data warehousing, a technical analysis of the architecture of the offline data warehouse project Banwo (5T), deploying and installing Kylin (3.5T) for multidimensional data model processing, and upgrading the Banwo project by adding Kylin for multidimensional analysis.
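Multidimensional analysis tools of this kind are usually described as precomputing aggregates for combinations of dimensions, so that queries read stored totals instead of scanning raw rows. The sketch below illustrates that idea in plain Python; it is not Kylin's implementation, and the function name `build_cube` and the sample column names are assumptions for illustration.

```python
from collections import defaultdict
from itertools import combinations


def build_cube(rows, dimensions, measure):
    """Pre-aggregate `measure` for every subset of `dimensions`.

    Returns {dimension_tuple: {value_tuple: total}}; each dimension subset
    plays the role of one precomputed aggregate table.
    """
    cube = {}
    for size in range(len(dimensions) + 1):
        for dims in combinations(dimensions, size):
            totals = defaultdict(float)
            for row in rows:
                key = tuple(row[d] for d in dims)
                totals[key] += row[measure]
            cube[dims] = dict(totals)
    return cube


# Hypothetical sample data for illustration only.
SAMPLE_ROWS = [
    {"city": "Beijing", "brand": "A", "sales": 10},
    {"city": "Beijing", "brand": "B", "sales": 5},
    {"city": "Shanghai", "brand": "A", "sales": 7},
]
```

With `cube = build_cube(SAMPLE_ROWS, ["city", "brand"], "sales")`, the lookup `cube[("city",)][("Beijing",)]` gives `15.0` without touching the raw rows again. The number of dimension subsets doubles with each added dimension, which is the storage cost paid for fast queries.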

The fifth stage

This stage covers distributed computing engines: computing engine concepts, the Scala language, Spark, and data storage with HBase, Redis, and Kudu. A P2P platform project is used to practice reading from and writing to multiple data sources with Spark.

The sixth stage

This stage covers data storage and computing for real-time scenarios: Kafka as the data channel, Druid as the real-time data warehouse, and stream processing with Flink and Spark Streaming. A project built on traffic data is used to consolidate these topics.
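A core idea in stream processing is grouping events into time windows. The sketch below shows fixed-size, non-overlapping (tumbling) windows in plain Python. It is a simplified illustration, not the Flink or Spark Streaming API: the function name and event layout are assumptions, and it ignores late or out-of-order data handling, state management, and fault tolerance.

```python
from collections import defaultdict


def tumbling_window_counts(events, window_seconds):
    """Count events per key in fixed, non-overlapping event-time windows.

    `events` is an iterable of (timestamp_seconds, key) pairs.
    Returns {(window_start, key): count}.
    """
    counts = defaultdict(int)
    for timestamp, key in events:
        # Every timestamp belongs to exactly one window.
        window_start = (timestamp // window_seconds) * window_seconds
        counts[(window_start, key)] += 1
    return dict(counts)
```

For example, with 10-second windows, the events `[(0, "a"), (3, "a"), (9, "b"), (10, "a")]` produce `{(0, "a"): 2, (0, "b"): 1, (10, "a"): 1}`.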

The seventh stage

This stage covers data search with Elasticsearch (ES): full-text search techniques, installing and operating ES, indexes, creating indexes, adding, deleting, and modifying data, indexing, mappings, and filtering.
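Full-text search is generally built on an inverted index, which maps each term to the documents that contain it. The following minimal Python sketch shows indexing, deleting, and searching. It is not the Elasticsearch API; the class name, the simple tokenizer, and the AND-only query semantics are assumptions chosen to keep the example short.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


class InvertedIndex:
    """Toy inverted index: term -> set of document ids."""

    def __init__(self):
        self._postings = defaultdict(set)
        self._docs = {}

    def index(self, doc_id, text):
        """Add a document; indexing an existing id replaces the old version."""
        self.delete(doc_id)
        self._docs[doc_id] = text
        for term in tokenize(text):
            self._postings[term].add(doc_id)

    def delete(self, doc_id):
        """Remove a document and its postings; unknown ids are ignored."""
        text = self._docs.pop(doc_id, None)
        if text is None:
            return
        for term in tokenize(text):
            self._postings[term].discard(doc_id)

    def search(self, query):
        """Return ids of documents that contain every term in the query."""
        terms = tokenize(query)
        if not terms:
            return set()
        result = set(self._postings.get(terms[0], set()))
        for term in terms[1:]:
            result &= self._postings.get(term, set())
        return result
```

A search only touches the posting sets for the query terms, so its cost depends on how many documents contain those terms rather than on the total number of documents.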

The eighth stage

This stage covers data governance: data standards, data classification, data modeling, graph storage and querying, metadata, data lineage and data quality, Hive Hook, and Spark Listener.
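Data lineage is naturally modeled as a graph: each table points to the tables it was built from. However the edges are captured (the stage above mentions Hive Hook and Spark Listener for this), querying lineage comes down to graph traversal. The sketch below is a minimal in-memory illustration; the class name and the table names in the usage note are hypothetical.

```python
from collections import defaultdict, deque


class LineageGraph:
    """Toy lineage store: table -> set of tables it reads from."""

    def __init__(self):
        self._upstream = defaultdict(set)

    def record(self, target, sources):
        """Record that `target` was produced from the given source tables."""
        self._upstream[target].update(sources)

    def upstream_of(self, table):
        """Return all direct and transitive sources of `table` (breadth-first)."""
        seen = set()
        queue = deque([table])
        while queue:
            current = queue.popleft()
            for parent in self._upstream.get(current, ()):
                if parent not in seen:
                    seen.add(parent)
                    queue.append(parent)
        return seen
```

For instance, after `record("dw.orders_daily", ["ods.orders", "ods.users"])` and `record("ads.report", ["dw.orders_daily"])`, calling `upstream_of("ads.report")` returns all three upstream tables. That is the set to inspect when a report shows a data quality problem.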

The ninth stage

This stage covers BI systems, focusing on two tools, Superset and Grafana: a basic introduction, installation, creating data sources, working with tables, and data exploration and analysis.

The tenth stage

This stage covers data mining: the mathematical foundations of machine learning, the Spark MLlib machine learning library, the Python scikit-learn machine learning library, and projects that combine machine learning with big data.
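As a small taste of the mathematics behind machine learning, the sketch below fits a straight line to data with ordinary least squares, using only the Python standard library. It is an illustrative example chosen for this roadmap, not part of the course material; libraries such as scikit-learn and Spark MLlib provide production implementations of this and many other algorithms.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares.

    slope = sum((x - mean_x) * (y - mean_y)) / sum((x - mean_x) ** 2)
    intercept = mean_y - slope * mean_x
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, x):
    """Evaluate the fitted line at x."""
    return slope * x + intercept
```

For points that lie exactly on y = 2x + 1, such as `xs = [0, 1, 2, 3]` and `ys = [1, 3, 5, 7]`, `fit_line` recovers a slope of 2 and an intercept of 1.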

The era of big data has arrived, and it is building into a huge wave. If you want to catch that wave, start as soon as possible.

Digression

In this era of fast-growing technology, programming is, for many people, a ticket to a world of endless possibilities. Among programming languages, Python is the headline star. With its concise, readable syntax and powerful capabilities, it has become one of the most popular programming languages in the world.


Python's rapid rise has been a great benefit to the whole industry. As the saying goes, fame attracts controversy, and its popularity has drawn plenty of criticism, but that has not slowed its momentum.

Will Python stay relevant over the next decade? Today, we're going to look at the facts and dispel some misconceptions.

If you are interested in Python and want to earn a higher salary by learning it, the following set of Python learning materials should be useful to you.

The materials include a Python installation package and activation code, plus tutorials on Python web development, Python crawlers, Python data analysis, artificial intelligence, machine learning, and more. Even complete beginners can follow them. Work through the tutorials to learn Python systematically from scratch.

1. Python learning routes for every direction

These routes organize the commonly used Python technologies into a summary of knowledge points for each field. You can use the listed knowledge points to find matching learning resources, so that what you learn is more complete.
2. Python learning software

To do good work, you must first sharpen your tools. The commonly used development software for learning Python is all here.
3. Python introductory videos

There are also many learning videos suitable for complete beginners. With these videos, you can easily get started with Python.

4. Python exercises

Each video lesson comes with matching practice questions, so you can test what you have learned.

5. Python hands-on case studies

Theory alone is not enough. You have to type out the code yourself before you can apply what you have learned in practice. Working through some hands-on case studies helps here, and these are included in the materials too.

6. Python interview materials

Once you have learned Python, you can go out and look for a job with your new skills. The interview questions below all come from top-tier Internet companies such as Alibaba, Tencent, and ByteDance, and some senior Alibaba engineers have provided authoritative answers. After working through this set of interview materials, I believe everyone can find a satisfying job.
7. How to get the materials

The full set of Python learning materials described above has been uploaded to the official CSDN website. If you need it, scan the official CSDN-certified QR code below with WeChat to get it for free.

Origin blog.csdn.net/pythonhy/article/details/132209767