Big Data: Collection of Interview Questions (1)

Describe the advantages of Hadoop 2.0 over Hadoop 1.0

https://blog.csdn.net/WYpersist/article/details/79951569

Hadoop commands

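1.  Kill a job

hadoop job -kill <job_id>

(kill -9 <pid> only terminates a local process; it does not stop a job running on the cluster.)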

2.  Delete the /tmp/bbb directory on HDFS

  hadoop fs -rm -r /tmp/bbb

3. After adding a new storage node or removing a compute node, which command refreshes the cluster state?

https://blog.csdn.net/iwantknowwhat/article/details/50822316

What should you do if the Hadoop NameNode goes down?

https://blog.csdn.net/wypersist/article/details/79953718

Programming questions

A very large file holds a huge amount of log data and cannot be read into memory all at once. Extract the IP address that visited Baidu the most times on a given day.
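One common way to approach this question is to partition by hash and then count each partition separately. The sketch below is only an illustration under stated assumptions: the log has already been filtered to Baidu visits on the chosen day, the IP is the first whitespace-separated field of each line, and every bucket's distinct IPs fit in memory. The function name `most_frequent_ip` and its parameters are hypothetical.

```python
import os
import tempfile
from collections import Counter
from zlib import crc32


def most_frequent_ip(log_path, num_buckets=64,
                     extract_ip=lambda line: line.split()[0]):
    """Return (ip, count) for the most frequent IP without loading the whole file."""
    with tempfile.TemporaryDirectory() as tmp:
        paths = [os.path.join(tmp, f"bucket_{i}.txt") for i in range(num_buckets)]
        buckets = [open(p, "w") for p in paths]
        try:
            # Pass 1: stream the log and route each IP to a bucket by hash,
            # so every occurrence of one IP lands in the same small file.
            with open(log_path) as log:
                for line in log:
                    if not line.strip():
                        continue
                    ip = extract_ip(line)
                    buckets[crc32(ip.encode()) % num_buckets].write(ip + "\n")
        finally:
            for f in buckets:
                f.close()

        # Pass 2: count one bucket at a time and keep only the running best.
        best_ip, best_count = None, 0
        for p in paths:
            with open(p) as f:
                counts = Counter(line.rstrip("\n") for line in f)
            if counts:
                ip, count = counts.most_common(1)[0]
                if count > best_count:
                    best_ip, best_count = ip, count
        return best_ip, best_count
```

Because all occurrences of an IP fall into one bucket, the per-bucket maximum is exact, and the global answer is simply the largest of those maxima. The same idea maps onto MapReduce: the hash step is the partitioner and the per-bucket count is the reducer.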

 

MapReduce data skew: causes and solutions

https://blog.csdn.net/wypersist/article/details/79797075

Solutions to skewed task execution speed in Spark

https://blog.csdn.net/lsshlsw/article/details/52025949

Briefly describe the common performance bottlenecks and optimization methods of HBase

https://blog.csdn.net/wypersist/article/details/79954490

Briefly describe the basic process of running an application on YARN

https://www.cnblogs.com/yurunmiao/p/4494582.html

 

Step 1: The user submits the application to YARN, including the ApplicationMaster program, the command to start the ApplicationMaster, and the user program.

Step 2: The ResourceManager allocates the first Container for the application and asks the corresponding NodeManager to start the application's ApplicationMaster in that Container.

Step 3: The ApplicationMaster first registers with the ResourceManager, so that the user can view the application's status directly through the ResourceManager. It then requests resources for each task and monitors the tasks until the run ends, that is, it repeats steps 4-7.

Step 4: The ApplicationMaster polls the ResourceManager over RPC to request and receive resources.

Step 5: Once the ApplicationMaster obtains resources, it contacts the corresponding NodeManager and asks it to start the task.

Step 6: The NodeManager sets up the task's runtime environment (environment variables, JAR packages, binaries, and so on), writes the task launch command into a script, and starts the task by running that script.

Step 7: Each task reports its status and progress to the ApplicationMaster over RPC, so the ApplicationMaster always knows the state of every task and can restart a task that fails. While the application is running, the user can query the ApplicationMaster over RPC at any time for the application's current status.

Step 8: After the application finishes, the ApplicationMaster unregisters from the ResourceManager and shuts itself down.
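The eight steps can be traced with a toy, in-memory model. This is only a sketch for following the order of events: the classes below are hypothetical stand-ins, not the YARN API, and they replace RPC, polling, and failure handling with plain method calls.

```python
class NodeManager:
    def __init__(self, name):
        self.name = name

    def launch(self, what, log):
        # Step 6: prepare the environment and start the process in a container.
        log.append(f"{self.name}: launch {what}")


class ResourceManager:
    def __init__(self, nodes):
        self.nodes = nodes
        self.registered = set()
        self._next = 0

    def allocate(self):
        # Hand out containers round-robin across NodeManagers.
        node = self.nodes[self._next % len(self.nodes)]
        self._next += 1
        return node

    def submit(self, app_id, tasks, log):
        # Steps 1-2: accept the application and start its ApplicationMaster
        # in the first container.
        log.append(f"RM: accept {app_id}")
        self.allocate().launch(f"AM[{app_id}]", log)
        ApplicationMaster(app_id, tasks, self).run(log)


class ApplicationMaster:
    def __init__(self, app_id, tasks, rm):
        self.app_id, self.tasks, self.rm = app_id, tasks, rm

    def run(self, log):
        self.rm.registered.add(self.app_id)       # Step 3: register
        log.append(f"AM: register {self.app_id}")
        for task in self.tasks:
            node = self.rm.allocate()             # Step 4: request a container
            node.launch(task, log)                # Steps 5-6: start the task
            log.append(f"AM: {task} finished")    # Step 7: status report
        self.rm.registered.discard(self.app_id)   # Step 8: unregister
        log.append(f"AM: unregister {self.app_id}")


log = []
rm = ResourceManager([NodeManager("nm1"), NodeManager("nm2")])
rm.submit("app_1", ["map_0", "reduce_0"], log)
```

After the call, `log` holds the events in step order: the ResourceManager accepts the application, the first container starts the ApplicationMaster, tasks are launched on the allocated nodes, and the ApplicationMaster unregisters last.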

List common performance problems and solutions of Redis

https://blog.csdn.net/tanga842428/article/details/52764608

Briefly describe JVM principles and tuning

JVM Knowledge Q&A Collection

https://blog.csdn.net/GV7lZB0y87u7C/article/details/79662413

How do the servers in a ZooKeeper cluster communicate with each other?

A follower talks to the leader mainly when it receives a request (create, delete, setData, setACL, createSession, closeSession, sync) whose final result must be coordinated by the leader. Because one leader serves many followers, a client/server model fits well: the leader opens a socket server and listens for coordination requests from each follower.
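The client/server relationship described above can be sketched with a plain TCP socket server. This is a toy model under stated assumptions, not ZooKeeper's actual wire protocol: the line-based message format, the sequence counter, and the names `LeaderHandler`, `start_leader`, and `Follower` are all hypothetical.

```python
import socket
import socketserver
import threading

# Request types that, per the text above, need the leader's coordination.
WRITE_OPS = {"create", "delete", "setData", "setACL",
             "createSession", "closeSession", "sync"}


class LeaderHandler(socketserver.StreamRequestHandler):
    def handle(self):
        # One line per forwarded request; the leader assigns a global order.
        for raw in self.rfile:
            op = raw.decode().strip()
            with self.server.lock:
                self.server.counter += 1
                seq = self.server.counter
            self.wfile.write(f"{seq} {op}\n".encode())


def start_leader():
    # Port 0 lets the OS pick a free port; one handler thread per follower.
    server = socketserver.ThreadingTCPServer(("127.0.0.1", 0), LeaderHandler)
    server.daemon_threads = True
    server.lock = threading.Lock()
    server.counter = 0
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


class Follower:
    def __init__(self, leader_address):
        self.sock = socket.create_connection(leader_address)
        self.reader = self.sock.makefile("r")

    def handle(self, op):
        if op not in WRITE_OPS:
            return f"local {op}"  # reads are answered by the follower itself
        self.sock.sendall((op + "\n").encode())
        return self.reader.readline().strip()

    def close(self):
        self.reader.close()
        self.sock.close()
```

A follower created with `Follower(server.server_address)` forwards `create` or `setData` to the leader and gets back the leader-assigned sequence number, while a read such as `getData` never leaves the follower.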

ZooKeeper election mechanism

Briefly describe the high availability design of general Internet architecture
