【Basic big data interview questions】

         Today my leader was busy, so for the first time I was asked to interview candidates. I was a little nervous, but I went ahead as the interviewer. First I clarified the recruitment requirements (we were hiring for big data) so I could target my questions accordingly. Then the interview began: I asked the candidate to introduce himself and his project experience, and after that we moved into the technical discussion; it was called an interview, but it was really a technical exchange. Afterwards I handed the candidate over to the leader to talk, asked him to wait for our notification, and finally reported the interview results to the relevant people.

 

1. What are the commonly used components of big data?

 

2. How do messages flow into and out of Storm?

 

3. We know that Hadoop has master/slave nodes such as the NameNode and DataNodes. What are the corresponding names in Storm?

 

4. After setting up a Storm environment, which services need to be started, and how do you start each one (what are the start commands)?

 

5. How do you deploy a developed program to a Storm environment? That is, given a program that has already been developed and packaged, how do you start it (what is the startup command)?

 

6. Have you encountered any problems during Storm development? How did you solve them?

 

7. What are the commonly used ZooKeeper ports, and what is each one for?

 

8. Have you encountered any problems when developing with ZooKeeper, and how did you solve them?

 

9. How does ES retrieve data?
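ES retrieves data through its REST `_search` endpoint, which accepts a JSON query-DSL body. As a minimal sketch (the index name `logs` and field `title` are made up for illustration), the body for a basic full-text lookup can be built like this:

```python
import json

# ES retrieves documents by sending a JSON query DSL body to
# GET /<index>/_search; a basic full-text lookup uses a "match" query.
# The index name ("logs") and field ("title") here are hypothetical.
query = {
    "query": {
        "match": {            # full-text match on an analyzed field
            "title": "big data"
        }
    },
    "size": 10,               # return at most 10 hits
}

# The body that would be sent to e.g. http://localhost:9200/logs/_search
body = json.dumps(query)
print(body)
```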

 

10. How to optimize ES?

 

11. How do you improve the performance of queries with multiple GROUP BYs?
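One common answer (the idea behind Hive/Spark multi-group-by optimization) is to compute several groupings in a single scan of the data instead of one scan per GROUP BY. A toy sketch of that idea, with made-up column names:

```python
from collections import defaultdict

# Aggregate two different groupings ("country" and "device" are
# hypothetical columns) in one pass over the rows, rather than
# scanning the table once for each GROUP BY.
rows = [
    {"country": "US", "device": "ios",     "sales": 10},
    {"country": "US", "device": "android", "sales": 5},
    {"country": "DE", "device": "ios",     "sales": 7},
]

by_country = defaultdict(int)
by_device = defaultdict(int)
for row in rows:                  # a single scan feeds both aggregations
    by_country[row["country"]] += row["sales"]
    by_device[row["device"]] += row["sales"]

print(dict(by_country))  # {'US': 15, 'DE': 7}
print(dict(by_device))   # {'ios': 17, 'android': 5}
```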

 

12. What kinds of working nodes does ES have?

 

13. Have you encountered any problems with ES, and how did you solve them?

 

14. What is the default port Kafka listens on? How long are messages retained by default? What components does Kafka consist of?

 

15. Kafka is distributed. Which property distinguishes each node?

 

16. How does Kafka work, i.e. how does data flow in and out?

 

17. Can Kafka messages be consumed repeatedly? What is the name of the shell script that creates a topic?
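Yes, Kafka messages can be re-consumed: the broker keeps messages in an append-only log for the retention period, and each consumer group only tracks an offset into that log, so rewinding the offset replays messages. (The topic-creation script shipped with Kafka is `kafka-topics.sh`.) A toy Python model of the offset mechanism, not a real Kafka client:

```python
# Toy model of why Kafka messages can be consumed repeatedly: the log
# is retained independently of consumption, and each consumer group
# only advances an offset. Resetting the offset replays messages.
log = ["m0", "m1", "m2"]        # retained messages on one partition
offsets = {"group-a": 0}        # committed offset per consumer group

def poll(group):
    """Return the next message for a group and advance its offset."""
    pos = offsets[group]
    if pos >= len(log):
        return None             # nothing new to consume
    offsets[group] = pos + 1
    return log[pos]

first = [poll("group-a") for _ in range(3)]   # consume everything once
offsets["group-a"] = 0                        # seek back to the beginning
again = [poll("group-a") for _ in range(3)]   # same messages replayed
print(first == again)  # True
```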

 

18. Which caching systems have you used? How to avoid cache penetration?

 

19. What is the default port of Redis? What data types does it have?

 

20. How to optimize SQL statements?
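A concrete illustration of one common SQL optimization, adding an index so a WHERE lookup becomes an index search instead of a full table scan. This self-contained sketch uses SQLite's `EXPLAIN QUERY PLAN` (table and column names are invented for the demo; the exact plan wording varies by SQLite version):

```python
import sqlite3

# Build a small table, then compare the query plan for the same WHERE
# clause before and after creating an index on the filtered column.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER, user_id INTEGER, ts TEXT)")
conn.executemany("INSERT INTO events VALUES (?, ?, ?)",
                 [(i, i % 100, "2020-01-01") for i in range(1000)])

def plan(sql):
    # EXPLAIN QUERY PLAN rows carry the human-readable detail in column 3
    return " ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

before = plan("SELECT * FROM events WHERE user_id = 7")
conn.execute("CREATE INDEX idx_events_user ON events(user_id)")
after = plan("SELECT * FROM events WHERE user_id = 7")

print(before)  # e.g. a full "SCAN" of events
print(after)   # e.g. "SEARCH ... USING INDEX idx_events_user"
```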

 

21. What data cleaning and collection components are there (ETL, Flume, Sqoop)? What are their usage scenarios?

 

22. What are the names of the master and slave nodes in HBase?

 

23. Have you encountered any problems while using HBase? How many commands are there to get data?

 

 

 

 
