Advanced Spark: Hands-On Offline and Real-Time Big Data Projects (Full Version)

Chapter 1 Course Introduction & Study Guide
This chapter introduces what the course covers and how to study it.

Chapter 2 Getting Started with Redis
Redis is one of the most popular in-memory databases. Because it reads and writes data in memory, it greatly improves read/write speed. This chapter starts with Redis's features and application scenarios, moves on to basic Redis commands and hands-on work with the commonly used Redis data types, and ends with operating Redis through the Java API, laying a solid foundation for the real-time processing project later in the course ...

Chapter 3 Getting Started with HBase
HBase is a distributed, column-oriented open source database. The technology comes from the Google paper by Fay Chang et al., "Bigtable: A Distributed Storage System for Structured Data." This chapter covers what HBase is and what its characteristics are, how to deploy an HBase environment, the HBase data model, and HBase operations (command line & API), laying a solid foundation for data storage and querying in the offline processing project later in the course. ...

Chapter 4 Offline Project in Practice V1
This chapter walks through a hands-on offline project that integrates Spark and HBase. It covers integrating multiple frameworks, using Spark for ETL processing and writing the results to HBase along with the many parameters this involves, HBase Rowkey design, and initial performance tuning, and finally uses Spark together with HBase to run statistical analysis on the data. This chapter is the core of Spark-based offline processing, so be sure to master it. ...

Chapter 5 Offline Project Optimization
This chapter further optimizes the features built in the previous chapter: starting from how those requirements were implemented, it tunes them so that they run more efficiently in production. This chapter is key to raising your overall skill level, so be sure to master it.

Chapter 6 Real-Time Project in Practice
This chapter walks through a hands-on real-time project that integrates Spark and Redis. It starts with connecting Spark Streaming to Kafka, then covers how to implement the functional requirements, how to refactor the code for better efficiency, how to choose Redis data types in a real project, and how to write the data processed by Spark Streaming into Redis. ...

Chapter 7 A First Look at Alluxio
Alluxio is a memory-centric virtual distributed storage system that unifies data access and serves as a bridge between computing frameworks and underlying storage systems. An application only needs to connect to Alluxio to access data stored in any underlying storage system. This chapter starts with the benefits Alluxio brings, moves on to hands-on integration of Alluxio with Hadoop and Spark, and shares some Alluxio use cases at large companies. ...

Chapter 8 Spark Optimization
This chapter starts from best practices for running Spark in production and shares common Spark optimization strategies.
