July Online: Spark Big Data Hands-On Course

Course Outline


Stage 1: Introduction to Big Data and Spark

Lesson 1: Big Data Overview

Knowledge Point 1: The history of big data technology

Knowledge Point 2: Big data applications and future prospects

Knowledge Point 3: Introduction to the Hadoop ecosystem

Knowledge Point 4: Evolution and development of the Hadoop framework

Knowledge Point 5: Principles of HDFS, the big data storage system

Knowledge Point 6: Principles of MapReduce

Knowledge Point 7: Principles of YARN, the distributed resource manager

Hands-on project: developing a MapReduce job on YARN

Lesson 2: Overview of the Spark technology stack and its development

Knowledge Point 1: Spark past and present

Knowledge Point 2: Overview of the Spark 1.x technology stack

Knowledge Point 3: Overview of the Spark 2.4 technology stack

Knowledge Point 4: Spark 3.0 and future outlook

Knowledge Point 5: Spark applications at large companies

Hands-on project: running a Spark program

Lesson 3: Introduction to Spark API application development

Knowledge Point 1: Spark core concepts explained

Knowledge Point 2: RDD partitions and dependencies

Knowledge Point 3: RDD transformation APIs explained

Knowledge Point 4: RDD action APIs explained

Hands-on project: log data analysis with Spark RDDs

Stage 2: Spark internals and application tuning

Lesson 4: Spark principles and run modes

Knowledge Point 1: Spark run modes

Knowledge Point 2: The Spark execution process explained

Knowledge Point 3: RDD internals in detail

Knowledge Point 4: Spark broadcast variables and accumulators explained

Hands-on project: using broadcast variables to encode user information for a recommendation system

Lesson 5: Spark cluster application analysis and optimization

Knowledge Point 1: The Spark web UI explained

Knowledge Point 2: Spark application monitoring and analysis

Knowledge Point 3: Principles of the Spark History Server

Knowledge Point 4: Spark metrics monitoring

Hands-on project: building and deploying the Spark History Server

Hands-on project: troubleshooting and optimization, starting from monitoring and logs

Lesson 6: Spark Core in depth

Knowledge Point 1: The three Spark shuffle modes in detail

Knowledge Point 2: Spark memory management analysis

Knowledge Point 3: Spark resource management and application

Knowledge Point 4: Spark RDD storage management

Hands-on project: refactoring and optimizing an existing Spark application

Lesson 7: Spark performance tuning

Knowledge Point 1: Spark development tuning

Knowledge Point 2: Spark resource tuning

Knowledge Point 3: Spark data skew tuning

Knowledge Point 4: Spark memory management tuning

Hands-on project: a Spark shuffle tuning code case study

Stage 3: Spark ad hoc queries and stream computing

Lesson 8: Spark SQL explained

Knowledge Point 1: History of Spark SQL development

Knowledge Point 2: Spark SQL 1.x and 2.x

Knowledge Point 3: How Spark SQL works

Knowledge Point 4: Spark SQL logical plans explained

Knowledge Point 5: Spark SQL physical plans explained

Knowledge Point 6: Dataset and DataFrame explained

Knowledge Point 7: Developing and registering Spark SQL user-defined functions (UDFs)

Knowledge Point 8: Spark Thrift Server explained

Hands-on project: King of Glory hero analysis based on Spark SQL 2.4.0

Lesson 9: Introduction to stream computing and Spark Streaming

Knowledge Point 1: Comprehensive comparison of Spark Streaming, Storm, Flink, and Structured Streaming

Knowledge Point 2: Message queues Kafka and RocketMQ: analysis and practice

Knowledge Point 3: How Spark Streaming works

Knowledge Point 4: DStream, the high-level abstraction of Spark Streaming

Knowledge Point 5: Introduction to how Structured Streaming works

Hands-on project: writing code to read real-time log data and compute statistics

Lesson 10: Real-time computing platform (design and practice)

Knowledge Point 1: Introduction to real-time big data architecture (Kudu, Druid, Couchbase)

Knowledge Point 2: Real-time computing platform architecture design and technology selection

Knowledge Point 3: Real-time computing practice and key challenges; analysis of performance bottlenecks under high QPS

Hands-on project: real-time log statistics platform

Stage 4: Advanced Spark applications: graph computing and machine learning

Lesson 11: Spark graph computing and MLlib explained

Knowledge Point 1: Introduction to the property graph

Knowledge Point 2: Introducing and creating edges, vertices, and triplets

Knowledge Point 3: Property operations on graphs

Knowledge Point 4: Introduction to graph algorithms

Knowledge Point 5: Introduction to Spark MLlib

Hands-on project: graph tuning

Lesson 12: Recommendation systems in practice

Knowledge Point 1: Recommendation system scenarios and why recommendation systems are needed

Knowledge Point 2: The recommendation system workflow

Knowledge Point 3: Collaborative filtering recommendation algorithms

Knowledge Point 4: Introduction to YouTube recommendation

Hands-on project: collaborative filtering recommendation based on Spark MLlib



Reproduced from: https://www.jianshu.com/p/a54d32cf2d90
