Big Data Interview Questions

1. Which of the following programs does an HDFS client contact first to obtain data information? ()

a) NameNode b) JobTracker c) DataNode d) SecondaryNameNode e) TaskTracker

2. How many copies of each block does HDFS keep by default? ()

a) 3 copies b) 2 copies c) 1 copy d) uncertain

3. Which of the following programs is usually started on the same node as the NameNode ()

a) SecondaryNameNode b) DataNode c) TaskTracker d) JobTracker

4. Which of the following is correct when uploading files on the client side ()

a) Data is passed to the DataNode via the NameNode

b) The client splits the file into blocks and uploads them in turn

c) The client uploads data to only one DataNode, and the NameNode is then responsible for block replication

5. Which of the following frameworks is similar to HDFS? ()

a) NTFS b) FAT32 c) GFS d) EXT3

6. Which of the following is usually the most important bottleneck of the cluster ()

a) CPU b) Network c) Disk IO d) Memory

7. Which statement about the SecondaryNameNode is correct? ()

a) It is a hot standby of NameNode

b) It has no requirements for memory

c) Its purpose is to help NameNode merge edit logs and reduce NameNode startup time

d) The SecondaryNameNode should be deployed to the same node as the NameNode

8. Which of the following is correct once rack awareness is configured? ()

a) If a rack has a problem, it will not affect data read and write

b) When writing data, it will be written to DataNodes in different racks

c) MapReduce will obtain network data closer to itself according to the rack

9. Which paper does HBase come from ()

A) The Google File System

B) MapReduce

C) BigTable

D) Chubby

10. The underlying storage layer for HBase data is ()

A) HDFS

B) Hadoop

C) Memory

D) MapReduce

11. The message communication mechanism used by HBase is ()

A) Zookeeper

B) Chubby

C) RPC

D) Socket

12. Which of the following correctly describe the characteristics of HBase? ()

A) High reliability B) High performance C) Column-oriented D) Scalable

13. LSM stands for ()

A) Log-structured merge-tree

B) Binary tree

C) Balanced binary tree

D) Long balanced binary tree

14. Which of the following descriptions of the LSM structure are correct? ()

A) Sequential storage

B) Write directly to the hard disk

C) Need to flush data to disk

D) It is a balanced search tree

15. Data in an LSM structure is first stored ()

A) On the hard disk

B) In memory

C) Disk array

D) Flash memory
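As background for questions 13 to 15, the write path of an LSM-style store can be sketched in a few lines of Python. This is a toy model under stated assumptions, not HBase's implementation: the class name `TinyLSM`, the `flush_threshold` parameter, and the use of in-memory lists as a stand-in for on-disk files are all hypothetical.

```python
class TinyLSM:
    """Toy LSM-style store: writes are buffered, then flushed as sorted runs."""

    def __init__(self, flush_threshold=2):
        self.flush_threshold = flush_threshold  # hypothetical tuning knob
        self.memtable = {}   # buffer that receives every write first
        self.segments = []   # flushed, sorted, immutable runs (stand-in for disk files)

    def put(self, key, value):
        self.memtable[key] = value
        if len(self.memtable) >= self.flush_threshold:
            self.flush()

    def flush(self):
        # Write the buffered entries out as one sorted run, then start a new buffer.
        if self.memtable:
            self.segments.append(sorted(self.memtable.items()))
            self.memtable = {}

    def get(self, key):
        # Check the buffer first, then the flushed runs from newest to oldest.
        if key in self.memtable:
            return self.memtable[key]
        for segment in reversed(self.segments):
            for k, v in segment:
                if k == key:
                    return v
        return None


store = TinyLSM(flush_threshold=2)
store.put("b", 1)
store.put("a", 2)  # second write reaches the threshold and triggers a flush
store.put("a", 3)  # newer value for "a" stays in the buffer
```

Here `store.get("a")` returns the buffered value, while `store.get("b")` falls through to the flushed run. A real system would also merge (compact) the runs over time, which this sketch omits.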

16. The Data field in the HFile data format is used to ()

A) Store the actual KeyValue data

B) Store the starting point of the data

C) Specify the length of the field

D) Store the starting point of the data block

17. The Value part of the KeyValue data format in the HFile data format is ()

A) String with complex structure

B) String

C) Binary data

D) Compressed data

18. Which of the following descriptions of the HBase secondary index is correct? ()

A) The core is the inverted table

B) The concept of the secondary index corresponds to the "primary" index of Rowkey

C) The secondary index uses a balanced binary tree

D) The secondary index uses the LSM structure

19. Which of the following descriptions of a Bloom filter are correct? ()

A) It is a very long binary vector combined with a series of random mapping functions

B) It has no false-positive rate

C) It has a certain false-positive rate

D) Elements can be deleted from a Bloom filter
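As background for question 19, here is a minimal Bloom filter sketch in Python: a bit array plus several hash functions. The class name `BloomFilter`, the default sizes, and the use of salted SHA-256 digests as the mapping functions are assumptions made for illustration; this is not how HBase implements its Bloom filters.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter: a fixed-size bit array plus k hash functions."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for readability

    def _positions(self, item):
        # Derive k positions by salting a SHA-256 digest with the hash index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def might_contain(self, item):
        # False means "definitely absent"; True only means "possibly present".
        return all(self.bits[pos] for pos in self._positions(item))


bf = BloomFilter()
for key in ["row-1", "row-2", "row-3"]:
    bf.add(key)
```

Every key that was added makes `might_contain` return `True`. A key that was never added usually returns `False`, but it can return `True` if its positions happen to collide with bits set by other keys.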

20. List the Hadoop processes by name and describe the function of each

21. A DataNode goes down. What is the procedure for recovering it?

22. How do you deal with data skew in MapReduce?

23. What is the difference between a Hive internal table and an external table, and why are external tables recommended in production environments?

24. Describe the execution flow of a Spark application