Zeus is a complete Hadoop job platform.
Zeus supports the entire life cycle of a task, from debug runs of a Hadoop task to periodic scheduling of a production task.
Functionally, it supports:
Debug runs of Hadoop MapReduce tasks
Debug runs of Hive tasks
Running shell tasks
Visual query and data preview of Hive metadata
Automatic scheduling of Hadoop tasks
Complete document management
Zeus is open source: not only the technology but also the product itself is open.
Course introduction: A detailed explanation of the Hadoop job platform Zeus
Course Outline:
Introduction to Zeus
Zeus architecture
Comparison of Zeus with other scheduling systems
Zeus2 with YARN support
Precautions for using Zeus
Follow-up plans for Zeus2
Intended audience:
1. System architects, system analysts, senior programmers, and senior developers.
2. People in charge of data center operation, planning, and design involving big data processing.
3. Heads of organizations that produce big data, such as government agencies, finance and insurance, mobile, and Internet companies.
4. Project leaders of universities and research institutes involved in big data and distributed data processing.
5. Data warehouse managers, modelers, analysts and developers, system administrators, database administrators, and others interested in data warehouses.
The following is the Q&A from the video session:
Is this similar to TWS scheduling?
Answer: I don't know much about TWS, so I can't really compare. Zeus is similar to Oozie.
Is Zeus also an open source Apache component? Where is the code hosted?
Answer: It is not from Apache; it comes from Alibaba. The GitHub address is https://github.com/alibaba/zeus
Will the worker continue to execute the job after the Master goes down?
Answer: No. The worker kills its own tasks and then connects to the new Master.
What role does ZooKeeper play in it?
Answer: It is mainly used for task-failure notification and is not required.
Does Taobao no longer use this? It hasn't been updated on GitHub for a year. What is Zeus mainly used for at Alibaba?
Answer: As far as I know, Taobao has been using it. The original code has not been updated, but there is a new version, Zeus2: https://github.com/michael8335/zeus2
Taobao also seems to have an open source task scheduling system called tbschedule. What is the difference between it and Zeus?
Answer: tbschedule is also a batch scheduling engine, but Zeus is more focused on Hadoop.
Workers compete for distributed locks. Can they deadlock?
Answer: No, because acquiring the lock is an atomic operation.
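To illustrate why an atomic acquire avoids deadlock, here is a minimal Python sketch. It is not Zeus's actual code: the answer does not say how the lock is implemented, so the table name `demo_lock`, the functions `try_acquire` and `release`, and the use of SQLite are all assumptions made for illustration. Each worker attempts a single atomic insert. Exactly one succeeds, and the others fail immediately instead of waiting, so no worker ever holds one lock while blocking on another.

```python
import sqlite3


def make_lock_table(conn):
    # Hypothetical lock table: the primary key guarantees one owner per lock name.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS demo_lock ("
        "name TEXT PRIMARY KEY, owner TEXT NOT NULL)"
    )
    conn.commit()


def try_acquire(conn, lock_name, worker_id):
    """Try to take the lock with one atomic INSERT; never blocks."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO demo_lock (name, owner) VALUES (?, ?)",
                (lock_name, worker_id),
            )
        return True
    except sqlite3.IntegrityError:
        # Another worker already owns the lock.
        return False


def release(conn, lock_name, worker_id):
    """Release the lock, but only if this worker is the owner."""
    with conn:
        conn.execute(
            "DELETE FROM demo_lock WHERE name = ? AND owner = ?",
            (lock_name, worker_id),
        )


conn = sqlite3.connect(":memory:")
make_lock_table(conn)
results = {
    worker: try_acquire(conn, "master", worker)
    for worker in ("worker-1", "worker-2", "worker-3")
}
```

In this sketch only the first worker ends up holding the lock; the losers can simply retry later.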
Can you give a practical example of Zeus in use?
Answer: Many companies use it for Hadoop cluster scheduling; the most commonly scheduled jobs are MapReduce and Hive.
Is it better to use Zeus or Zeus2?
Answer: That depends on your actual situation. On Hadoop 1, it is best to use Zeus directly. On Hadoop 2, I personally recommend Zeus2.
Where is the list of tasks currently executing on all workers stored? If the current Master goes down, how can the new Master get it and reissue the tasks?
Answer: Every key point in a task's execution is recorded in the database, and the new Master can read it directly from the database.
How does the new Master know all the tasks that were being executed before, so that it can issue them again?
Answer: The new Master can obtain the tasks that were executing from the task history table in the database.
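The recovery described in the two answers above can be sketched roughly as follows. This is a hypothetical illustration, not Zeus's real schema or code: the table `task_history`, its columns, the status values, and all function names are assumptions. The only idea taken from the answers is that task state is written to the database at key points, so a new Master can query for tasks that were still running and reissue them.

```python
import sqlite3


def init_db(conn):
    # Hypothetical task history table; the real Zeus schema may differ.
    conn.execute(
        "CREATE TABLE task_history ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "job_id TEXT NOT NULL, "
        "worker TEXT NOT NULL, "
        "status TEXT NOT NULL)"
    )
    conn.commit()


def record_start(conn, job_id, worker):
    """Record that a worker has started a job; returns the history row id."""
    with conn:
        cur = conn.execute(
            "INSERT INTO task_history (job_id, worker, status) "
            "VALUES (?, ?, 'running')",
            (job_id, worker),
        )
    return cur.lastrowid


def record_end(conn, history_id, status):
    """Record the final status of a run, e.g. 'success' or 'failed'."""
    with conn:
        conn.execute(
            "UPDATE task_history SET status = ? WHERE id = ?",
            (status, history_id),
        )


def recover_running(conn):
    """What a new Master might do: list jobs that were still running."""
    rows = conn.execute(
        "SELECT job_id FROM task_history WHERE status = 'running' ORDER BY id"
    ).fetchall()
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
init_db(conn)
h1 = record_start(conn, "hive_daily_report", "worker-1")
h2 = record_start(conn, "mr_log_cleanup", "worker-2")
record_end(conn, h1, "success")

# The old Master dies here; the new Master reads the table and reissues.
to_reissue = recover_running(conn)
```

Because the state lives in the database rather than in the Master's memory, the new Master needs no handover from the old one.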
Does Zeus's management and scheduling of algorithms support simulating results on sample data? The differences between algorithm scenarios and their efficiency are still fairly large.
Answer: Zeus is just a workflow engine; any specific algorithm is implemented by the job itself.
How large is the Zeus deployment at Taobao? Please describe the background of how Zeus was created and developed.
Answer: I am not in a position to share the scale of the deployment. As for the background, Zeus was mainly built to provide friendly scheduling management for Hadoop clusters.
Zeus vs. Azkaban vs. Oozie?
Answer: They are all workflow engines for Hadoop clusters.
When running HiveQL through Zeus task scheduling, we sometimes find that the Hive table or the JAR package cannot be found, yet a manual rerun succeeds. What's going on?
Answer: This happens because the environment variables are not configured correctly.
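One way to catch this kind of problem early is a pre-flight check that runs in the same environment as the scheduled job. The sketch below is only an illustration under stated assumptions: the answer does not say which variables matter, so the list in `REQUIRED_VARS` (the commonly used `JAVA_HOME`, `HADOOP_HOME`, and `HIVE_HOME`), the function `missing_env`, and the example paths are all hypothetical.

```python
import os

# Assumed list of variables a Hive job needs; adjust for your cluster.
REQUIRED_VARS = ("JAVA_HOME", "HADOOP_HOME", "HIVE_HOME")


def missing_env(required=REQUIRED_VARS, env=None):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


# Example: an environment as the scheduler process might see it
# (hypothetical paths), where HIVE_HOME was never exported.
scheduler_env = {
    "JAVA_HOME": "/usr/lib/jvm/java",
    "HADOOP_HOME": "/opt/hadoop",
}
problems = missing_env(env=scheduler_env)
if problems:
    print("Missing environment variables:", ", ".join(problems))
```

Calling `missing_env()` with no arguments checks the real process environment, which makes it easy to compare what the scheduler sees with what an interactive shell sees.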
Does Zeus support YARN? Also, what bugs currently exist in Zeus1?
Answer: Zeus1 does not support it; Zeus2 does. The specific bugs are listed in the wiki of https://github.com/michael8335/zeus2
Is there any connection between Zeus's Master and YARN's ResourceManager?
Answer: No.
When our company uses Zeus task scheduling, tasks occasionally enter the task queue but are never executed, and the only fix is to restart Zeus. Is this also a bug in Zeus1?
Answer: This needs to be analyzed in detail. You can contact me privately.
Can it connect to Hadoop 2.4 now? When will Hive 0.13 be supported?
Answer: No. It is not needed for the time being.