Basic interview questions for big data development engineers

Hadoop
1. Composition
2. HDFS file upload
3. HDFS file download
4. MR process
5. Combiner in MR
6. YARN running process
7. YARN resource scheduler types
8. ZooKeeper function
9. ZooKeeper leader election mechanism
10. MR WordCount program
11. Cache and checkpoint in MR

Spark
2. Spark resource scheduling process
3. Spark running process
4. Shuffle
5. Common Spark operators
6. Cache and checkpoint
7. Spark WordCount program
8. Spark tuning
9. The difference between Spark and MR

Flume
1. Transaction
2. Source
3. Channel
4. Sink
5. Configuration in the project

Linux
1. Commonly used commands
2. Script writing

Kafka
1. Composition
2. Ensuring that no data is lost
3. Ensuring that data is consumed exactly once
4. Why Kafka is fast

Hive
1. Architecture
2. How Hive SQL is converted to MR jobs under the hood
3. Internal and external tables
4. Ways to create tables
5. Import data
6. Export data
7. Partition
8. Bucket
9. Custom functions: UDF, UDAF, UDTF
10. order by, sort by, distribute by, cluster by
11. Difference between rank() and dense_rank()
12. String concatenation: concat() and concat_ws()
13. Timestamp and date conversion
14. String extraction and splitting: substr() and split()[]
15. SQL tuning
16. Data skew and solutions
17. Parameter tuning
18. Compression format
19. Execution plan explain
20. Locating the SQL that causes data skew
21. Handwritten SQL: top N
22. Handwritten SQL: cumulative totals
23. Handwritten SQL: consecutive indicators
24. Handwritten SQL: row and column conversion
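
For item 11, the difference between rank() and dense_rank() shows up only when there are ties. The sketch below is plain Python, not Hive code, and the function name `rank_and_dense_rank` is made up for illustration; it assumes rows are ordered by score descending, as in a typical `order by score desc` window.

```python
def rank_and_dense_rank(scores):
    """Return (score, rank, dense_rank) tuples, ordered by score descending."""
    ordered = sorted(scores, reverse=True)
    result = []
    rank = 0
    dense = 0
    previous = None
    for position, score in enumerate(ordered, start=1):
        if score != previous:
            rank = position  # rank() jumps to the row position, leaving gaps after ties
            dense += 1       # dense_rank() always increases by exactly one
            previous = score
        result.append((score, rank, dense))
    return result


if __name__ == "__main__":
    for row in rank_and_dense_rank([90, 85, 85, 70]):
        print(row)
```

With the scores 90, 85, 85, 70, the two tied rows share rank 2 under both functions, but the last row gets rank 4 under rank() and 3 under dense_rank().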

Algorithm
1. Bubble sort
2. Quick sort
3. Merge sort
4. Binary search
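
As a reference point for item 4, here is a minimal sketch of iterative binary search in Python. It assumes the input list is already sorted in ascending order; the function name and the -1 "not found" convention are choices made for this example.

```python
def binary_search(items, target):
    """Return the index of target in the sorted list items, or -1 if absent."""
    low, high = 0, len(items) - 1
    while low <= high:
        mid = (low + high) // 2
        if items[mid] == target:
            return mid
        if items[mid] < target:
            low = mid + 1   # target can only be in the right half
        else:
            high = mid - 1  # target can only be in the left half
    return -1


if __name__ == "__main__":
    print(binary_search([1, 3, 5, 7, 9], 7))
```

Each iteration halves the search range, so the loop runs O(log n) times for a list of n elements.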

Redis
1. Concept
2. Data type

Sqoop
1. Null value problem
2. Full import, incremental import, and incremental merge

Project
1. Data preprocessing
2. ID mapping
3. Data warehouse modeling process
4. The meaning of data warehouse layering
5. Zipper table
6. Data volume
7. Data life cycle
8. Data governance
9. Cluster
10. Number of team members and division of labor
11. Project highlights
12. Problems encountered and solutions
13. End-to-end processing of one indicator from log (traffic) data
14. End-to-end processing of one indicator from business data
15. Scheduled tasks with automatic alerting and email notification

Origin blog.csdn.net/weixin_47699191/article/details/115278852