Big Data Technology Written Test Question Bank (with Answers)

I. Single-choice questions:

1. Which of the following commands is used to view the IP configuration of a Linux system? (C)

A. ipconfig

B. find

C. ifconfig

D. arp -a

2. In the MapReduce program, the data format received by the map() function is (D).

A. String

B. Integer

C. Long

D. Key-value pairs

3. Among the following options, the correct statement about the architecture of HDFS is (B).

A. HDFS adopts the active-standby architecture

B. HDFS adopts a master-slave architecture

C. HDFS adopts the slave architecture

D. All the above statements are wrong

4. Among the following options, the stage that largely determines the performance of the entire MapReduce program is (D).

A. MapTask

B. ReduceTask

C. Sharding and formatting the data source

D. Shuffle

5. Among the following options, the Shell command for uploading files is (D).

A. -ls

B. -mv

C. -cp

D. -put

6. By default, HDFS keeps (A) of each block.

A. 3 copies

B. 2 copies

C. 1 copy

D. not sure
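The default of 3 replicas comes from the `dfs.replication` property. A minimal hdfs-site.xml override might look like this (the value shown is the HDFS default; paths and values are illustrative):

```xml
<!-- hdfs-site.xml: default number of replicas kept for each block -->
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>3</value> <!-- HDFS default; often set to 1 on a single-node sandbox -->
  </property>
</configuration>
```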

7. If the (A) node is shut down, the Hadoop cluster can no longer be accessed.

A. namenode

B. datanode

C. secondary namenode

D. yarn

8. Among the following options, the process unique to Hadoop 2.x is (C).

A. JobTracker

B. TaskTracker

C. NodeManager

D. NameNode

9. Among the following options, the process that is NOT started when a Hadoop 2.0 cluster starts up is (B).

A. NameNode

B. JobTracker

C. DataNode

D. ResourceManager
10. Among the following options, the directory for storing Hadoop configuration files is (D).

A. include

B. thousand

C. libexec

D. etc

11. In Hadoop1.0, the main components of the Hadoop kernel are (A).

A. HDFS and MapReduce

B. HDFS and Yarn

C. Yarn

D. MapReduce and Yarn

12. In the Combine stage of a MapTask, after all data has been processed, the MapTask performs a (B) on all temporary files.

A. Fragmentation operation

B. Merge operation

C. Formatting operation

D. Spill (overflow-write) operation

13. A gzip file is 75 MB and the client sets the block size to 64 MB; the number of blocks the file occupies is (B).

A. 1

B. 2

C. 3

D. 4
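The block count in question 13 is a simple ceiling division; a quick sketch in Python, with the file and block sizes taken from the question:

```python
import math

file_size_mb = 75    # gzip file size from the question
block_size_mb = 64   # block size set by the client

# HDFS stores the file as full blocks plus one final partial block
blocks = math.ceil(file_size_mb / block_size_mb)
print(blocks)  # 2: one 64 MB block and one 11 MB block
```

Note that gzip is not splittable, so a MapReduce job would still read this file with a single map task, but storage-wise the file occupies 2 blocks.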

14. Among the following options, the most important purpose of studying big data is (D).

A. Analysis

B. Statistics

C. Testing

D. Prediction

15. When Hive defines a custom function class, which of the following classes should be inherited from? (B)

A. FunctionRegistry

B. UDF

C. MapReduce

16. Hive's most important characteristics are scalability, extensibility, (B), and loose coupling with input formats.

A. Low recoverability

B. Fault tolerance

C. Quick query

D. Can handle large amounts of data

17. In the directory where the Hadoop archive was extracted, which command can be executed to view the Hadoop directory structure? (B)

A. jps

B. ll

C. tar

D. find

18. Among the following options, the incorrect statement about HDFS is (D).

A. HDFS is one of the cores of Hadoop

B. HDFS originated from Google's GFS paper

C. HDFS is used to store massive amounts of data

D. HDFS is used to compute over massive amounts of data

19. In order of granularity, Hive data is divided into: database, data table, (C), bucket.

A. Tuple

B. Column

C. Partition

D. Row

20. In HDFS, the node used to save data is (B).

A. namenode

B. datanode

C. secondaryNode

D. yarn

21. Which of the following is usually the most important performance bottleneck of the cluster? ( C )

A. CPU

B. Network

C. Disk

D. Memory

22. Which of the following options is not part of the Hive system architecture? ( C )

A. User interface

B. Cross-language service

C. HDFS

D. Underlying driver engine

23. One difference between the Hive query language and SQL is the (C) operation.

A. Group by

B. Join

C. Partition

D. Union

24. What is the key syntax for Hive to load data files into data tables? (A)

A. LOAD DATA [LOCAL] INPATH filepath [OVERWRITE] INTO TABLE tablename

B. INSERTDATA [LOCAL] INPATH filepath [OVERWRITE] INTO TABLE tablename

C. LOAD DATA INFILE d:\car.csv APPEND INTO TABLE t_car_temp FIELDS TERMINATED BY ","

D. INSERTDATA [LOCAL] INFILE d:\car.csv APPEND INTO TABLE t_car_temp FIELDS TERMINATED BY ","
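As a usage sketch of the syntax in option A (the table name `student` and the file paths are made up for illustration):

```sql
-- Load a file from the local file system into a Hive table,
-- replacing any existing data in the table
LOAD DATA LOCAL INPATH '/tmp/student.txt' OVERWRITE INTO TABLE student;

-- Load a file that is already on HDFS (no LOCAL keyword);
-- the file is moved into the table's directory and appended
LOAD DATA INPATH '/user/hive/student.txt' INTO TABLE student;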

25. Among the following options, the command that formats the Hadoop file system (NameNode) is (A).

A. hadoop namenode -format

B. hadoop namenode -ls

C. hdfs datanode -ls

D. hdfs datanode -format

26. Among the following options, the command that starts the entire HDFS cluster in one step is (C).

A. start-namenode.sh

B. start-datanode.sh

C. start-dfs.sh

D. start-slave.sh

27. Which one is correct about SecondaryNameNode? (C)

A. It is the hot standby of NameNode

B. It has no requirements for memory

C. Its purpose is to help NameNode merge edit logs and reduce NameNode startup time

D. SecondaryNameNode should be deployed on the same node as the NameNode

28. Among the following options, which configuration file can configure HDFS address, port number and temporary file directory (A).

A. core-site.xml

B. hdfs-site.xml

C. mapred-site.xml

D. yarn-site.xml
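For question 28, a minimal core-site.xml illustrating the settings mentioned; the hostname, port, and path below are placeholders to adjust for your cluster:

```xml
<configuration>
  <!-- HDFS address and port (the NameNode RPC endpoint) -->
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
  <!-- Base directory for Hadoop temporary files -->
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/opt/hadoop/tmp</value>
  </property>
</configuration>
```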

29. Among the following statements about a client reading data from HDFS, the incorrect one is (C).

A. The client selects the highest-ranked (closest) DataNode and reads the blocks in sequence

B. The client merges all the blocks it reads into the complete final file

C. The client selects a low-ranked DataNode to read the blocks

D. If the client itself is a DataNode, the data is read directly from the local node

30. After the Hadoop cluster starts successfully, the web UI port used to monitor the HDFS cluster is (D).

A. 50010

B. 50075

C. 8485

D. 50070

31. Which of the following statements is incorrect? (D)

A. The data source is the foundation of the data warehouse, which usually contains various internal and external information of the enterprise.

B. Data storage and management are the core of the entire data warehouse.

C. The OLAP server reorganizes and analyzes the data that needs to be analyzed according to the multidimensional data model, and discovers data laws and trends.

D. The main function of the front-end tool is to visualize the data on the front-end page.

32. Among the following options, the method used to delete folders on HDFS is (A).

A. delete()

B. rename()

C. mkdirs()

D. copyToLocalFile()

33. Each Map task has a memory buffer, the default size is (C).

A. 128 MB

B. 64 MB

C. 100 MB

D. 32 MB
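The 100 MB default is the map-side sort buffer (`mapreduce.task.io.sort.mb` in Hadoop 2.x); spilling to disk begins when the buffer reaches the spill threshold (`mapreduce.map.sort.spill.percent`, default 0.80). A quick check of the spill point, assuming those defaults:

```python
# Hadoop 2.x defaults (assumed, from mapred-default.xml)
sort_buffer_mb = 100   # mapreduce.task.io.sort.mb
spill_percent = 0.80   # mapreduce.map.sort.spill.percent

spill_point_mb = sort_buffer_mb * spill_percent
print(spill_point_mb)  # 80.0 -- a background thread starts spilling to disk here
```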

34. When creating a Hive table, regarding the difference between the numeric column types decimal(x, y), float, and double, which of the following statements is correct? (B)

A. decimal(x, y) is an integer type; float and double are decimal types

B. When float and double are used in aggregate operations such as sum, Java floating-point precision problems arise

C. decimal(x, y) is a numeric truncation function, while float and double are data types
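Option B describes ordinary binary floating-point rounding error. The effect is easy to reproduce in plain Python, since Hive's float/double likewise follow IEEE 754 binary floating point, while decimal types keep exact base-10 values:

```python
from decimal import Decimal

# Summing 0.1 ten times with binary floats accumulates rounding error
float_sum = sum([0.1] * 10)
print(float_sum)    # 0.9999999999999999, not 1.0

# Decimal stays exact, analogous to Hive's decimal(x, y)
decimal_sum = sum([Decimal("0.1")] * 10)
print(decimal_sum)  # 1.0
```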

35. Among the following options, the correct statement about the SSH service is (D).

A. SSH service is a transport protocol

B. SSH service is a communication protocol

C. SSH service is a packet protocol

D. SSH service is a network security protocol

36. Among the following options, which type conversion is NOT supported by the Hive query language? (A)

A. Double → Number

B. BigInt → Double

C. Int → BigInt

D. String → Double


Origin blog.csdn.net/m0_50736744/article/details/131776269