DOE Education Project Video Outline Document

There is little risk of plagiarism in this project: only the requirements can be copied wholesale, and only part of the code can be reused, because the code differs between course cohorts. Here is what the project covers.

Comprehensive Project-Module 1-Data Warehouse-day01

01. Project development prerequisites-git version control-Gitee-commit, pull, and branch operations.wmv

02. Project background introduction.wmv

03. Project background introduction (2).wmv

04. Project Module 1-Introduction to Data Warehouse-Dictionary Data Construction Requirements.wmv

05. Clarify the concepts of database and data warehouse.wmv

06. Project development project skeleton construction and testing.wmv

07. Project development-geographic location dictionary construction-geohash coding principle and toolkit.wmv

08. Project development-geographic location dictionary construction-code implementation (1).wmv

09. Project development-business district dictionary construction-code implementation.wmv

10. Project development-company internal data-detailed analysis of traffic log.wmv

11. Project development-internal data preprocessing-requirements description.wmv

12. Project development-internal data preprocessing-code skeleton writing.wmv

13. Introduction to AutoNavi Geolocation Service API.wmv

14. AutoNavi (Gaode) geolocation service API-writing a demo.wmv
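
As an illustration of the geohash coding principle named in video 07, here is a minimal sketch of the standard geohash encoding: latitude and longitude ranges are bisected alternately, and every five bits become one base-32 character. The function name and default precision are assumptions made for this example, not the toolkit used in the course.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # geohash alphabet (no a, i, l, o)


def geohash_encode(lat, lon, precision=6):
    """Encode a latitude/longitude pair as a geohash string (hypothetical helper)."""
    lat_range = [-90.0, 90.0]
    lon_range = [-180.0, 180.0]
    chars = []
    bits = 0
    bit_count = 0
    use_lon = True  # bits alternate between longitude and latitude, longitude first
    while len(chars) < precision:
        rng, value = (lon_range, lon) if use_lon else (lat_range, lat)
        mid = (rng[0] + rng[1]) / 2
        if value >= mid:
            bits = (bits << 1) | 1  # upper half of the range: emit 1
            rng[0] = mid
        else:
            bits = bits << 1        # lower half of the range: emit 0
            rng[1] = mid
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:          # five bits make one base-32 character
            chars.append(BASE32[bits])
            bits = 0
            bit_count = 0
    return "".join(chars)
```

Points that are close together usually share a long prefix, which is what makes the code usable as a dictionary key for a geographic location lookup.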

Comprehensive Project-Module 1-Data Warehouse-day02

01. Implementation of internal traffic log preprocessing code (1).wmv

02. Implementation of internal traffic log preprocessing code (2).wmv

03. Implementation of internal traffic log preprocessing code (3).wmv

04. Customize the schema to let spark automatically parse the json data into dataframe.wmv

05. Explanation of data warehouse modeling thinking-business-demand-theme-layering.wmv

06. Data warehouse ods layer modeling-table building-data loading-detection.wmv

Comprehensive Project-Module 1-Data Warehouse-day03 

01. Review of key SQL syntax.wmv

02. Traffic analysis-dwd_traffic_log table processing generation.wmv

03. Traffic analysis-dwd_traffice_agg_session session-level table.wmv

04. Traffic analysis-traffic profile dimension report-ads_traffic_summary_cube.wmv

05. User analysis-modeling design-detailed process.wmv

Extra: How to copy Tao Ge's CDH virtual machine cluster and configure its network.wmv

Extra: Supplement two small hive skills-multiple insertion-dynamic partition.wmv

Comprehensive Project-Module 1-Data Warehouse-day04

01. OLAP data cube multi-dimensional analysis--hive higher-order aggregation--grouping sets--cube.wmv

02. OLAP data cube multi-dimensional analysis--hive higher-order aggregation--grouping__id--rollup.wmv

03. User analysis-daily new dws_user_dnu-daily active dws_user_dau-history dws_user_hisu-table development.wmv

04. User analysis-multi-dimensional report on daily new users-ads_user_dnu_cube.wmv

05. User analysis-daily new and daily active users with added time dimensions (week-month-quarter)-automated shell script development.wmv

06. Review of the ETL process so far-automated script development.wmv
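
To illustrate what a data cube aggregation computes, the sketch below counts rows for every subset of the grouping dimensions, which is the idea behind cube-style reports such as ads_user_dnu_cube. It is plain Python rather than Hive SQL, and the function name and column names are assumptions for this example.

```python
from collections import defaultdict
from itertools import combinations


def cube_counts(rows, dims):
    """Count rows for every subset of dims, in the spirit of a cube aggregation.

    Dimensions that are rolled up in a given grouping appear as None in the key.
    """
    result = defaultdict(int)
    for size in range(len(dims) + 1):
        for subset in combinations(dims, size):
            for row in rows:
                key = tuple(row[d] if d in subset else None for d in dims)
                result[key] += 1
    return dict(result)


# Hypothetical sample data: three new users with two dimensions.
rows = [
    {"province": "A", "device": "ios"},
    {"province": "A", "device": "android"},
    {"province": "B", "device": "ios"},
]
cube = cube_counts(rows, ["province", "device"])
```

With two dimensions there are four groupings (none, province only, device only, both). Using None for a rolled-up dimension becomes ambiguous when the data itself contains nulls, which is the problem a grouping identifier such as grouping__id is meant to resolve.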

Comprehensive Project-Module 1-Data Warehouse-day05

01. Erratum: historical user record table-full join-forgotten join condition.wmv

02. Script development for all tasks as of today (2).wmv

03. Script general scheduling development.wmv

04. User analysis-retention analysis-modeling design.wmv

05. User analysis-retention analysis-retention detail table calculation.wmv

06. User analysis-active zipper table-modeling and calculation process.wmv

07. User analysis-active zipper table-code writing.wmv

Comprehensive Project-Module 1-Data Warehouse-day06

01. User retention analysis-modeling design-operation logic-zipper table calculation logic review.wmv

02. Report development-overall trend report-model design-calculation process combing.wmv

03. Report development-overall trend report-ads_overall_trend development.wmv

04. Report development-user freshness report-ads_user_fresh modeling.wmv

05. Report development-user freshness report-ads_user_fresh development.wmv

06. Report development-user active retention report-ads_user_act_retention.wmv

07. Report development-user active retention report-solution 2-with-as must be written first.wmv

08. Report development-active user composition analysis report (continuous days)-ads_user_act_ingredients.wmv

Comprehensive Project-Module 1-Data Warehouse-day07

01. Report statistics-user interval distribution statistics-ads_user_interval-Spark job implementation.wmv

02. Report statistics-user interval distribution statistics-ads_user_interval-sql implementation.wmv

03. Event analysis theme-background introduction to event log data acquisition.wmv

04. Event analysis theme-detailed concept of conversion rate (funnel model)-demand analysis.wmv

05. Event analysis theme-DWD layer modeling etl-dwd_event_detail.wmv

06. Event Analysis-Event Overview Report-ads_event_overall.wmv

Extra: Detailed explanation of mapreduce-shuffle ring buffer.wmv

Extra: Detailed explanation of yarn's three resource scheduling strategies.wmv

Comprehensive Project-Module 1-Data Warehouse-day08 

01. Access path analysis-dwd layer path analysis detail table-dwd_routes_detail.wmv

02. Access path analysis-ads layer path analysis report-ads_routes_rpts.wmv

03. Analysis of business path conversion rate-modeling-calculation thinking design.wmv

04. Business path conversion funnel analysis-code implementation-ads_routes_step_detail.wmv

05. Advertising effect analysis theme-DWS and ADS layer modeling design.wmv

06. Advertising effect analysis-ads layer report-advertising overview report-ads_ad_overall development and implementation.wmv

07. User acquisition campaign effect analysis report.wmv

08. Promotional campaign effect analysis-development and implementation.wmv

Comprehensive Project-Module 1-Data Warehouse-day09 

01. Data migration tool sqoop-installation-import mysql to hdfs.wmv

02. Data migration tool sqoop-import mysql to hdfs-specify conditions-incremental import-free-form query.wmv

03. Data migration tool-sqoop-import mysql data to hive.wmv

04. Data migration tool-sqoop-export data to mysql.wmv

05. Business data analysis-data migration-user_info import script development.wmv

06. Data migration-script development-sales analysis-modeling design.wmv

07. Order analysis-turnover analysis report-ads_order_amt_cube.wmv

08. Order analysis--GMV multi-dimensional analysis report.wmv

09. Order analysis-category analysis report.wmv

Comprehensive Project-Module 2 - User Portrait - Day01

  1. Analysis of Big Data Applications in Various Industries
  2. User portrait project background introduction-label system analysis
  3. User Portrait Project--Data Introduction--DSP Business Department Data
  4. User portrait project-data introduction-company internal data-DSP business department data
  5. User portrait project-data introduction-cloud operator traffic data
  6. Analysis of the overall process of user portrait project development
  7. Introduction to the core concepts of graph computing-graph, vertex, edge, directed graph, cycle, degree, connected subgraph-vertex and edge data structures
  8. An introduction to graph computing-finding connected subgraphs
  9. An Introduction to Graph Computing--Find Connected Subgraphs (2)
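
To make "finding connected subgraphs" concrete, here is a minimal sketch that groups vertices into connected components with a union-find structure. This is a plain-Python stand-in chosen for brevity, not the graph computing framework used in the course, and the identifier values are made up for the example.

```python
def connected_components(edges):
    """Group vertices into connected subgraphs using union-find."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    for a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            # Keep the smaller id as the representative of the merged group.
            if root_b < root_a:
                root_a, root_b = root_b, root_a
            parent[root_b] = root_a

    groups = {}
    for node in list(parent):
        groups.setdefault(find(node), set()).add(node)
    return list(groups.values())


# Hypothetical identifiers: each edge says two ids were seen in the same record.
edges = [("imei1", "mac1"), ("mac1", "uid9"), ("imei2", "mac2")]
components = connected_components(edges)
```

Each resulting set collects the identifiers that are linked directly or transitively, which is the property an id-mapping dictionary relies on.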

Comprehensive Project-Module 2 - User Portrait - Day02

  1. Graph computing introductory case-exercise 2
  2. Project development-id mapping dictionary-issue requirements-calculation process analysis
  3. Project development-id mapping dictionary construction (T day initial construction)
  4. Project development-id mapping dictionary construction (T+1 day)(1)

Comprehensive Project-Module 2 - User Portrait - Day03

  1. id-mapping code implementation (2)-group id adjustment
  2. id-mapping code implementation (3)-adapting the calculation to real data
  3. dsp data preprocessing development (1)
  4. Analysis of the overall structure of the integrated project (1)
  5. User portrait-dsp log preprocessing-code implementation
  6. User portrait-dsp extra-kpi report statistics
  7. User portrait-dsp extra-kpi report statistics (sql implementation version)-dataframe write mysql

Comprehensive Project-Module 2 - User Portrait - Day04

  1. User portrait-doit traffic log preprocessing
  2. User portrait-doit traffic log preprocessing (2)
  3. User portrait-cmcc traffic log processing-crawler background introduction
  4. Introduction to crawlers-jsoup function introduction-JD outdoor category crawler examples (1)
  5. Getting started with crawlers-JD outdoor category crawling development (2)

Comprehensive Project-Module 2 - User Portrait - Day05

  1. User portrait-preprocessing-cmcc traffic log preprocessing
  2. User Portrait-Tag Extraction-Tag Structure Review-Tag Programming Model Design
  3. User portrait-label extraction-analysis of label calculation strategy process
  4. User portrait-DSP label extraction-label score statistics
  5. User portrait-DSP label extraction-label gathering by person (1)
  6. User portrait-DSP label extraction-labels gathered by gid (1)
  7. User Portrait-DOIT Label Extraction-Duoyi Label-Data Warehouse Statistics
  8. User portrait-DOIT tag extraction-Duoyi tag-log data extraction
  9. User Portrait-DOIT Label Extraction-Duoyi Label-Data Warehouse Report Data Extraction

Comprehensive Project-Module 2 - User Portrait - Day06

  1. User portrait-cmcc label extraction
  2. User portrait-multi-source label aggregation and merging-multi-layer map assembly
  3. User portrait-multi-source tag aggregation merge-tag bean merge-bean to json
  4. User portrait-two-day label decay merge-requirements description-process design
  5. User portrait-two-day label attenuation merge-code implementation-label jsonization
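
One possible shape of a two-day label merge is sketched below: yesterday's label scores are scaled down by a decay factor and today's scores are added on top. The exact rule taught in the course is not given in this outline, so the decay-then-add rule, the factor value, and all names are assumptions.

```python
def merge_with_decay(yesterday, today, decay=0.9):
    """Merge two days of label scores, decaying the older day (assumed rule)."""
    # Scale down every score carried over from the previous day.
    merged = {tag: score * decay for tag, score in yesterday.items()}
    # Add today's scores; labels seen on both days accumulate.
    for tag, score in today.items():
        merged[tag] = merged.get(tag, 0.0) + score
    return merged


# Hypothetical label scores for one user.
result = merge_with_decay(
    {"sports": 10.0, "travel": 4.0},
    {"sports": 2.0, "books": 1.0},
    decay=0.5,
)
```

Under this rule a label that stops appearing fades gradually instead of disappearing at once, while a label that keeps appearing stays strong.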

Recommendation algorithm

Comprehensive Project-Module 3 - Recommendation Algorithm - Day01

  1. Introduction to Recommendation System-Popularity Recommendation-Portrait Recommendation-Algorithm Recommendation
  2. Introduction to machine learning algorithms--knn classification-kmeans clustering-supervised learning-unsupervised learning-semi-supervised learning
  3. The core foundation of machine learning algorithms-feature vector model (sparse vector-dense vector)
  4. The core foundation of machine learning algorithm-actual case of item vectorization (1)
  5. CB recommendation-recommendation algorithm based on content similarity-overall implementation architecture
  6. NLP algorithm model-TF-IDF feature value calculation-text vectorization
  7. NLP algorithm model-TF-IDF text vectorization in practice
  8. Classification algorithm--Naive Bayes-approach and formula derivation
  9. Classification algorithm-Naive Bayes-model training and prediction code implementation
  10. Classification algorithm-Naive Bayes-model training and prediction code implementation
  11. Hands-on project-Naive Bayes classification of comment data sets
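
For the TF-IDF text vectorization topic, the sketch below computes a sparse TF-IDF vector per document using the textbook definition: term frequency is count divided by document length, and inverse document frequency is ln(N / df). Libraries often use smoothed variants, so treat the formula choice, the function name, and the sample tokens as assumptions.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one sparse {term: weight} vector per tokenized document."""
    n_docs = len(docs)
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(set(doc))  # count each term once per document

    vectors = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        vectors.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


# Hypothetical tokenized item descriptions.
docs = [["tent", "camping", "tent"], ["camping", "stove"]]
vectors = tf_idf(docs)
```

A term that occurs in every document gets weight zero under this definition, so only the distinguishing terms contribute to a content similarity comparison.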

Comprehensive Project-Module 3 - Recommendation Algorithm - Day02

  1. Process review based on content similarity recommendation calculation
  2. Recommendation based on content similarity-code implementation (1)
  3. Recommendation based on content similarity-code implementation (2)
  4. Recommendation based on content similarity-code implementation (3)
  5. Collaborative filtering recommendation algorithm-principle of algorithm idea
  6. Collaborative filtering algorithm-algorithm code implementation-result display
  7. Model label calculation-Churn rate label-Naive Bayes application-vector normalization

Comprehensive Project-Module 4 - Flink Real-Time Computing- Day01

  1. Flink knowledge review
  2. Flink restart strategy
  3. Flink restart strategy test
  4. Flink integrates KafkaSource
  5. Flink integrates KafkaSource to achieve Exactly-Once
  6. Flink integrates RedisSink
  7. Customize MysqlSink

Comprehensive Project-Module 4 - Flink Real-Time Computing- Day02

  1. Flink content review
  2. Submit Flink tasks in the cluster
  3. Flink's StandAlone execution process
  4. Detailed explanation of FlinkOnYarn execution process
  5. Flink's stage division principle
  6. Flink recovers data from checkpoint
  7. Project engineering initialization
  8. Encapsulating the FlinkUtils utility class

Comprehensive Project-Module 4 - Flink Real-Time Computing- Day03

  1. Review
  2. Real-time computing business architecture
  3. Real-time computing business architecture upgrade
  4. Nginx installation
  5. Installation of OpenResty
  6. Log collection server
  7. Collecting Nginx data into Kafka
  8. Log collection data test
  9. Data real-time ETL
  10. Side output
  11. Implementing side output
  12. Customize RedisSink
  13. Multi-dimensional statistics of participation times

Comprehensive Project-Module 4 - Flink Real-Time Computing- Day04

  1. Knowledge review
  2. Review of the real-time project architecture
  3. Introduction to canal
  4. Installation and use of canal
  5. Order data analysis requirements
  6. Order data statistics with Flink
  7. Flink window delay join

Comprehensive Project-Module 4 - Flink Real-Time Computing- Day05

  1. Flink retrieves late data dropped from a window via side output
  2. Left join and retrieving late data
  3. Joining two streams in Flink
  4. Implementation on the order and order detail tables
  5. Project knowledge point review
  6. Optimizing Flink with Protocol Buffers
