Describe the advantages of Hadoop 2.0 over Hadoop 1.0
https://blog.csdn.net/WYpersist/article/details/79951569
Hadoop commands
1. Kill a job
hadoop job -list
hadoop job -kill <job-id>
2. Delete the /tmp/bbb directory on HDFS
hadoop fs -rm -r /tmp/bbb
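3. Commands that refresh the cluster state after adding a new storage node or removing a compute node
hadoop dfsadmin -refreshNodes (storage nodes)
hadoop mradmin -refreshNodes (compute nodes, Hadoop 1.x)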
https://blog.csdn.net/iwantknowwhat/article/details/50822316
What should you do if the Hadoop NameNode goes down?
https://blog.csdn.net/wypersist/article/details/79953718
Programming questions
A very large file holds a massive amount of log data and cannot be read into memory in one go. Extract the IP address that visited Baidu most often on a given day.
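One common way to approach this question is divide and conquer: hash every IP into one of several bucket files so that all occurrences of an IP land in the same bucket, count each bucket in memory, and keep the overall maximum. The sketch below is only an illustration under assumptions that are not in the original question: the log has already been filtered to the day in question and to Baidu requests, each line starts with the IP as its first whitespace-separated field, and the name top_ip and the bucket count are hypothetical.

```python
import os
import tempfile
import zlib
from collections import Counter


def top_ip(log_path, num_buckets=64):
    """Return (ip, count) for the most frequent IP in log_path.

    Assumes one request per line, with the IP as the first
    whitespace-separated field (a hypothetical log format).
    """
    with tempfile.TemporaryDirectory() as tmp_dir:
        bucket_paths = [os.path.join(tmp_dir, f"bucket_{i}")
                        for i in range(num_buckets)]
        buckets = [open(path, "w") for path in bucket_paths]
        try:
            # Pass 1: stream the log line by line. The same IP always
            # hashes to the same bucket file.
            with open(log_path) as log:
                for line in log:
                    fields = line.split()
                    if not fields:
                        continue  # skip blank lines
                    ip = fields[0]
                    index = zlib.crc32(ip.encode()) % num_buckets
                    buckets[index].write(ip + "\n")
        finally:
            for bucket in buckets:
                bucket.close()

        # Pass 2: count one bucket at a time and keep the global maximum.
        best_ip, best_count = None, 0
        for path in bucket_paths:
            with open(path) as bucket:
                counts = Counter(line.strip() for line in bucket)
            if counts:
                ip, count = counts.most_common(1)[0]
                if count > best_count:
                    best_ip, best_count = ip, count
        return best_ip, best_count
```

Calling top_ip on the filtered log returns the winning IP and its hit count. Because an IP never spans two buckets, the per-bucket maximum is exact, and only one bucket's counts need to fit in memory at a time. Choose num_buckets so that each bucket fits comfortably in memory.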
MapReduce data skew: causes and solutions
https://blog.csdn.net/wypersist/article/details/79797075
Solutions for skewed task execution speed in Spark
https://blog.csdn.net/lsshlsw/article/details/52025949
Briefly describe the common performance bottlenecks of HBase and how to optimize them
https://blog.csdn.net/wypersist/article/details/79954490
Briefly describe the basic process of running an application on YARN
https://www.cnblogs.com/yurunmiao/p/4494582.html
Step 1: The user submits the application to YARN, including the ApplicationMaster program, the command to start the ApplicationMaster, and the user program.
Step 2: The ResourceManager allocates the first Container for the application and communicates with the corresponding NodeManager, asking it to start the application's ApplicationMaster in this Container.
Step 3: The ApplicationMaster first registers with the ResourceManager, so that the user can view the application's running status directly through the ResourceManager. It then requests resources for each task and monitors the tasks until they finish, that is, it repeats steps 4-7.
Step 4: The ApplicationMaster polls the ResourceManager over the RPC protocol to request and receive resources.
Step 5: Once the ApplicationMaster has obtained resources, it communicates with the corresponding NodeManager and asks it to start the task.
Step 6: The NodeManager sets up the running environment for the task (environment variables, JAR packages, binary programs, and so on), writes the task launch command into a script, and starts the task by running that script.
Step 7: Each task reports its status and progress to the ApplicationMaster over the RPC protocol, so the ApplicationMaster always knows the state of every task and can restart a task when it fails. While the application is running, the user can query the ApplicationMaster over RPC at any time for the application's current state.
Step 8: After the application completes, the ApplicationMaster unregisters from the ResourceManager and shuts itself down.
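The control flow of the eight steps can be made concrete with a toy, in-process simulation. This is not the YARN API. The classes below only borrow the component names, and every method, the round-robin allocation, and the max_attempts retry limit are assumptions made for illustration.

```python
class NodeManager:
    """Toy stand-in for a NodeManager: runs a task in a 'container'."""

    def launch(self, task, attempt):
        # Steps 5-6: prepare the environment and start the task.
        return task(attempt)


class ResourceManager:
    """Toy stand-in for the ResourceManager."""

    def __init__(self, node_managers):
        self.node_managers = node_managers
        self.registered = set()
        self.next_node = 0

    def submit(self, app_name, tasks):
        # Steps 1-2: accept the application and start its
        # ApplicationMaster (here: just construct and run it).
        return ApplicationMaster(app_name, tasks, self).run()

    def register(self, app_name):
        self.registered.add(app_name)

    def unregister(self, app_name):
        self.registered.discard(app_name)

    def allocate(self):
        # Step 4: hand out a container on the next node (round robin).
        node = self.node_managers[self.next_node % len(self.node_managers)]
        self.next_node += 1
        return node


class ApplicationMaster:
    """Toy stand-in for a per-application ApplicationMaster."""

    def __init__(self, app_name, tasks, rm, max_attempts=3):
        self.app_name = app_name
        self.tasks = tasks  # mapping: task name -> callable(attempt)
        self.rm = rm
        self.max_attempts = max_attempts

    def run(self):
        self.rm.register(self.app_name)  # Step 3
        results = {}
        try:
            for name, task in self.tasks.items():
                for attempt in range(1, self.max_attempts + 1):
                    node = self.rm.allocate()  # Step 4
                    try:
                        # Steps 5-6: ask the node to start the task.
                        results[name] = node.launch(task, attempt)
                        break
                    except Exception:
                        # Step 7: a failed task is restarted until the
                        # attempt limit is reached.
                        if attempt == self.max_attempts:
                            raise
        finally:
            self.rm.unregister(self.app_name)  # Step 8
        return results
```

Submitting a dictionary of task callables through ResourceManager.submit returns each task's result. A task that raises on its first attempt is retried in a freshly allocated container, which mirrors the restart behaviour described in step 7.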
List common performance problems of Redis and their solutions
https://blog.csdn.net/tanga842428/article/details/52764608
Briefly describe JVM principles and tuning
JVM Knowledge Q&A Collection
https://blog.csdn.net/GV7lZB0y87u7C/article/details/79662413
How the servers in a ZooKeeper cluster communicate with each other
Followers communicate with the leader mainly because a follower receives requests (create, delete, setData, setACL, createSession, closeSession, sync) whose final result must be coordinated by the leader. Because the leader and its followers have a one-to-many relationship, a client/server model fits well, and that is the model they use: the leader creates a socket server that listens for coordination requests from each follower.
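The client/server arrangement described above can be sketched with plain sockets. This is a minimal illustration and not ZooKeeper's real wire protocol: the line-based messages, the ACK reply, the toy zxid counter, and the class names Leader, LeaderHandler, and Follower are all assumptions. Only the list of forwarded request types comes from the text.

```python
import socket
import socketserver
import threading

# Request types a follower must hand to the leader (from the text above).
WRITE_COMMANDS = {"create", "delete", "setData", "setACL",
                  "createSession", "closeSession", "sync"}


class LeaderHandler(socketserver.StreamRequestHandler):
    """Serves one follower connection; one request per line."""

    def handle(self):
        for raw in self.rfile:
            command = raw.decode().strip()
            # Give every coordinated request a cluster-wide sequence number.
            with self.server.lock:
                self.server.zxid += 1
                zxid = self.server.zxid
            self.wfile.write(f"ACK {zxid} {command}\n".encode())


class Leader(socketserver.ThreadingTCPServer):
    """Socket server that listens for requests from every follower."""

    allow_reuse_address = True
    daemon_threads = True

    def __init__(self, address):
        super().__init__(address, LeaderHandler)
        self.lock = threading.Lock()
        self.zxid = 0  # toy transaction counter


class Follower:
    """Connects to the leader as a client and forwards write requests."""

    def __init__(self, leader_address):
        self.sock = socket.create_connection(leader_address)
        self.reader = self.sock.makefile("r")

    def handle_client_request(self, command):
        parts = command.split()
        if not parts or parts[0] not in WRITE_COMMANDS:
            return "served locally"  # reads need no coordination
        self.sock.sendall((command + "\n").encode())
        return self.reader.readline().strip()

    def close(self):
        self.reader.close()
        self.sock.close()
```

To try it, start Leader(("127.0.0.1", 0)).serve_forever in a background thread, create a Follower with the leader's server_address, and pass it requests. A request such as "create /app" is forwarded and acknowledged by the leader, while a read such as "getData /app" is answered by the follower itself.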