CS194 Full Stack Deep Learning (1): Setting up Machine Learning Projects

0. Preface

1. Overview

  • There is a big gap between machine learning projects and ordinary software engineering projects. It is difficult to judge which tasks are difficult and which are easy.
  • A report indicated that 85% of machine learning projects failed, mainly due to:
    • Machine learning is still under research and it is difficult to achieve a 100% success rate.
    • But some projects are doomed to fail because of:
      • They are technically infeasible
      • They never make the jump from research to production
      • There is no clear criterion for success, and no clear goal
      • Team management is chaotic
  • The rest of the course uses robot pose estimation as the running example

2. Lifecycle

  • Introduce the life cycle of a machine learning project (several stages and the relationship between the stages)

  • Machine learning projects are not linear pipelines; in practice you keep looping back to earlier stages.

  • four stages

    • Planning & project setup: decide what to work on, what goals to aim for, and how many resources the project needs
    • Data collection & labeling: decide what to collect, set up sensors (such as cameras) to gather the data, and label it
    • Training & debugging: implement simple baselines (e.g., with OpenCV), review the literature for state-of-the-art methods, reproduce them, and then improve the model
    • Deploying & testing: deploy in a lab environment, test to guard against regressions (i.e., add logging so problems can be traced when they occur), then deploy to production
  • The relationship between the various stages

    • Data -> Planning: the data turns out to be too hard to obtain or too hard to label; the project may need to switch to an easier labeling approach
    • Training -> Data: more data is needed to fix over-fitting, or the data turns out to be unreliable (poor label quality)
    • Training -> Planning: the task is harder than planned, or several requirements cannot be met at once (e.g., the model must be both fast and accurate)
    • Deploying -> Training: performance in the lab environment is insufficient, so the model needs further improvement
    • Deploying -> Data: the training data does not match the data seen in deployment (earlier assumptions do not hold in the real scenario), so data must be re-collected and hard cases mined
    • Deploying -> Planning: the chosen metric is inappropriate (it does not help downstream users), or the expected performance cannot be reached in the real world, so the requirements must be revisited
  • The overall content is as follows:

    • The right side shows the activities within a single machine learning project
    • The left side shows what is shared across multiple machine learning projects, mainly team building and general tooling

(figure: per-project lifecycle activities alongside cross-project team and tooling concerns)

  • What else needs to be understood
    • Understand the state of the art in the field: what is currently possible, and what might become possible in the future
    • Know which research directions are most promising right now
  • Q&A
    • How to survey the state of the art: for a new field, find 1-2 landmark papers, then read the papers they cite and the papers that cite them (sorted by citation count)
    • How to communicate an ML project to senior management: it is hard, because ML projects differ from ordinary engineering projects; the latter have predictable timelines, while with the former you never know in advance what will or will not work

3. Prioritizing

  • main content
    • Which machine learning project to choose
    • Evaluate the cost and feasibility of machine learning projects
  • The key points are:
    • How to find high-impact machine learning projects; cheap prediction is a major source of value
    • Project cost is driven mainly by how hard the data is to acquire, but also by the required accuracy
  • Project priority coordinate system
    • Establish a coordinate system through "Impact" and "Feasibility"
    • Prioritize projects with high impact and high feasibility

(figure: impact vs. feasibility prioritization chart)

  • Mental models for high-impact machine learning projects
    • I was not sure at first what "mental models" meant here; a quick search suggests it is a concept borrowed from psychology (心智模型 in Chinese).

      • I take it to mean: what characteristics should a high-impact machine learning project have?
    • The view from the book The Economics of AI:

      • AI reduces the cost of "prediction", and prediction is the key to decision-making
      • Cheap prediction: as the cost of prediction falls, predictions that used to be prohibitively expensive become accessible to everyone (e.g., autonomous driving)
      • Implication (what kind of project to look for): look for places where cheap prediction can have a large impact
    • Software 2.0

      • Traditional software paradigm (1.0): There are clear instructions (instructions/rules)
      • Software 2.0: First, people determine the goals, and then use algorithms to find a program to achieve the above goals.
        • The program is specified by a dataset and is "compiled" via optimization
      • Implication (what kind of project to look for): rule-based software where the behavior is easier to learn from data than to program by hand (a small sketch follows below)
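To make the 1.0-vs-2.0 contrast concrete, here is a minimal sketch (a toy spam filter of my own, not an example from the lecture; it assumes scikit-learn is available): the 1.0 version encodes the behavior as hand-written rules, while the 2.0 version is specified by a small labeled dataset and "compiled" by an optimizer.

```python
# A toy sketch (my own example, assuming scikit-learn) of the two paradigms
# for a tiny spam-like text filter.

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Software 1.0: the behavior is written down as explicit rules.
def is_spam_v1(text: str) -> bool:
    return "free money" in text.lower() or "winner" in text.lower()

# Software 2.0: we specify the goal with labeled data and let optimization
# "compile" the program (here, logistic regression over a bag of words).
texts = ["free money now", "meeting at 10am", "you are a winner", "lunch tomorrow?"]
labels = [1, 0, 1, 0]  # 1 = spam, 0 = not spam (toy labels)

model = make_pipeline(CountVectorizer(), LogisticRegression()).fit(texts, labels)

print(is_spam_v1("claim your free money"))       # rule-based answer
print(model.predict(["claim your free money"]))  # learned answer
```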
  • Assess the feasibility of the project

image-20210225205110282

  • The relationship between model accuracy and cost
    • Project cost grows roughly exponentially with the required accuracy.
    • Why? The remaining errors correspond to rare cases, and collecting enough examples of them requires a lot of data (a small numeric sketch follows below).
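A quick back-of-the-envelope sketch of why rare failure cases drive up data costs (the numbers below are my own illustration, not from the lecture):

```python
# If a failure mode appears in a fraction `failure_rate` of real inputs,
# you need roughly `wanted_examples / failure_rate` labeled samples to
# collect `wanted_examples` instances of it.
def samples_needed(failure_rate: float, wanted_examples: int) -> int:
    return int(wanted_examples / failure_rate)

print(samples_needed(0.01, 100))    # 1% failure mode   -> ~10,000 samples
print(samples_needed(0.001, 100))   # 0.1% failure mode -> ~100,000 samples
```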
  • How to judge the difficulty of machine learning tasks
    • Andrew Ng's rough heuristic: anything a person can do with about one second of thought can probably be automated with machine learning.
      • Positive examples: image recognition, speech recognition, machine translation, grasping objects
      • Counter-examples: understanding irony/humor, dexterous in-hand manipulation, generalizing to new scenarios
    • Which tasks are difficult in machine learning
      • Unsupervised learning and reinforcement learning
      • They work well in some settings (with massive data and compute), but are not yet broadly reliable
    • What is still hard even for supervised learning:
      • Question answering and text summarization
      • Video prediction
      • Building 3D models
      • Real-world (noisy) speech recognition
      • Solving math problems
      • Word puzzles
    • Which aspects of the task are more difficult
      • Complex output, very high accuracy requirements, and strong generalization requirements

(figure: examples of machine learning tasks by difficulty)

4. Archetypes

  • The main contents include:
    • What are the types of machine learning projects
    • What are the characteristics of each category
  • The problems are mainly divided into three categories, as shown in the figure below
    • Improve an existing process, e.g., recommendation systems, game AI, code completion
    • Augment a manual process, e.g., sketch conversion
    • Automate a manual process, e.g., autonomous driving

(figure: the three project archetypes with examples)

  • Determine the key to each type of problem

(figure: key questions to ask for each archetype)

  • Data Flywheel: the ideal data loop of a machine learning project (more users produce more data, which improves the model, which in turn attracts more users)

(figure: the data flywheel)

  • The feasibility and impact of the three types of projects, and how to improve
    • Automate a manual process has the highest impact but is the hardest to implement
      • How to improve feasibility: keep people in the loop during deployment, or constrain the deployment environment
    • Augment a manual process has high impact and considerable implementation difficulty
      • How to improve: design the product details carefully, ship a good-enough model first and keep optimizing; for example, Facebook detects faces in photos and lets users confirm or correct the suggested tags
    • Improve an existing process has the lowest impact, but is also the easiest to implement
      • How to improve: build a data loop so the model's accuracy keeps improving

5. Metrics

  • The main topic is how to choose the performance metric to optimize. The key points are:
    • There is a paradox: the real world is complex and cares about many metrics, but model training works best when optimizing a single metric.
    • Therefore you need to pick a formula for combining metrics into one, and accept that this formula can and will change over time.
  • There are generally three ways of combining multiple metrics (a small sketch follows this list)
    • A simple average (arithmetic or weighted), e.g., averaging precision and recall
    • Threshold n-1 metrics and evaluate the nth: require most metrics to clear a threshold, then optimize the remaining one
      • How to choose the threshold indicator:
        • Domain judgement (e.g., which metrics can you engineer around?)
        • Which metrics are least sensitive to the choice of model?
        • Which metrics are already close to their desired values?
      • How to choose the threshold value:
        • Domain judgement (e.g., what is an acceptable tolerance downstream? What performance is achievable?)
        • How does the baseline model perform
        • How important is it now
    • More complex/domain-specific formula
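Here is a minimal sketch of the first two strategies, with made-up model names and numbers; the weights and thresholds are hypothetical and would come from domain judgement in a real project.

```python
# A minimal sketch of the first two strategies, with made-up model names and
# numbers; the weights and thresholds are hypothetical, not from the lecture.

candidates = {
    "model_a": {"precision": 0.95, "recall": 0.70, "latency_ms": 120},
    "model_b": {"precision": 0.88, "recall": 0.85, "latency_ms": 40},
    "model_c": {"precision": 0.80, "recall": 0.92, "latency_ms": 15},
}

# 1) Weighted average of precision and recall (the weights are a project choice).
def weighted_score(m, w_precision=0.5, w_recall=0.5):
    return w_precision * m["precision"] + w_recall * m["recall"]

# 2) "Threshold n-1 metrics, evaluate the nth": require latency and precision
#    to clear fixed thresholds, then pick the best recall among the survivors.
def thresholded_choice(models, max_latency_ms=50, min_precision=0.85):
    ok = {name: m for name, m in models.items()
          if m["latency_ms"] <= max_latency_ms and m["precision"] >= min_precision}
    return max(ok, key=lambda name: ok[name]["recall"]) if ok else None

print({name: round(weighted_score(m), 3) for name, m in candidates.items()})
print(thresholded_choice(candidates))  # "model_b" under these example thresholds
```

In practice the thresholds would be revisited as the requirements or the baseline performance change.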

6. Baselines

  • A baseline gives us a point of comparison for judging whether the model we trained is actually good.
  • Why baselines are needed
    • They give a lower bound on the expected result
      • The stronger the baseline, the more useful that lower bound is
    • Depending on how the trained model compares with the baseline, the follow-up work differs (see the figure below)

(figure: how the model-vs-baseline comparison guides follow-up work)

  • How to choose a baseline

    • External baselines: business requirements, published results (make sure the published results were obtained on the same task as ours)
    • Internal baselines: simple ML algorithms (e.g., k-nearest neighbors), simple rules, and human performance (a small sketch follows at the end of this section)
  • How to set up baselines in practice

(figure: how to build baselines in practice)

  • Finally, do not skip the baseline step even when time is tight; just spend less time on it.
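As a small illustration of the internal baselines mentioned above, here is a sketch assuming scikit-learn and a generic tabular classification dataset; the digits dataset and the split settings are placeholders, not from the lecture.

```python
# Two simple "internal" baselines: a majority-class rule and k-nearest neighbours.
from sklearn.datasets import load_digits
from sklearn.dummy import DummyClassifier
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Simplest possible rule: always predict the most frequent class.
majority = DummyClassifier(strategy="most_frequent").fit(X_train, y_train)

# Simple classical ML baseline: k-nearest neighbours.
knn = KNeighborsClassifier(n_neighbors=5).fit(X_train, y_train)

print("majority-class baseline:", majority.score(X_test, y_test))
print("kNN baseline:           ", knn.score(X_test, y_test))
# Any trained model should at least clear these numbers before we spend
# time debugging or scaling it up.
```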


Origin blog.csdn.net/irving512/article/details/114236519