Multiple choice questions on HBase in big data

1. Single-choice questions (9 questions in total, 49.5 points)

  1. (Single choice question) Which of the following descriptions about BigTable is wrong?
    A. The crawler continuously crawls new pages, and these pages are stored in BigTable at regular intervals.
    B. BigTable is a distributed storage system.
    C. BigTable was originally used to solve typical Internet search problems.
    D. Internet search applications query the established index and obtain web pages from BigTable.
    Correct answer: A: The crawler continuously crawls new pages, and these pages are stored in BigTable at regular intervals;

  2. (Single choice question) Among the following options, which one is wrong about the underlying technology correspondence between HBase and BigTable?
    A. GFS corresponds to HDFS
    B. GFS corresponds to Zookeeper
    C. MapReduce corresponds to Hadoop MapReduce
    D. Chubby corresponds to Zookeeper
    Correct answer: B: GFS corresponds to Zookeeper;

  3. (Single choice question) In HBase, which of the following is wrong about the description of data operations?
    A. HBase uses a simpler data model, which stores data as uninterpreted strings.
    B. HBase operations do not involve complex relationships between tables.
    C. HBase does not support modification operations.
    D. HBase is designed to avoid complex relationships between tables.
    Correct answer: C: HBase does not support modification operations;

  4. (Single-choice question) In the HBase access interface, where is Pig mainly used?
    A. Suitable for parallel batch processing of HBase table data by Hadoop MapReduce jobs
    B. Suitable for HBase management
    C. Suitable for other heterogeneous systems to access HBase table data online
    D. Suitable for data statistics.
    Correct answer: D: Suitable for data statistics;

  5. (Single choice question) In HBase, a cell is located by a set of factors that can be regarded as a "four-dimensional coordinate". Which of the following does not belong to the "four-dimensional coordinate"?
    A. Row key
    B. Keyword
    C. Column family
    D. Timestamp
    Correct answer: B: Keyword;
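
The "four-dimensional coordinate" in question 5 (row key, column family, column qualifier, timestamp) can be pictured with a small in-memory sketch. This is a toy model, not the HBase client API: `ToyTable`, `put`, `get`, and `get_latest` are hypothetical names chosen for illustration.

```python
from typing import Dict, Optional, Tuple

# Hypothetical toy model, not the HBase client API.
# A cell is addressed by (row key, column family, column qualifier, timestamp).
Coordinate = Tuple[str, str, str, int]


class ToyTable:
    def __init__(self) -> None:
        self._cells: Dict[Coordinate, bytes] = {}

    def put(self, row: str, family: str, qualifier: str,
            timestamp: int, value: bytes) -> None:
        # Values are stored as uninterpreted bytes.
        self._cells[(row, family, qualifier, timestamp)] = value

    def get(self, row: str, family: str, qualifier: str,
            timestamp: int) -> Optional[bytes]:
        # All four coordinates are needed to pin down one version of a cell.
        return self._cells.get((row, family, qualifier, timestamp))

    def get_latest(self, row: str, family: str,
                   qualifier: str) -> Optional[bytes]:
        # Without a timestamp, return the version with the largest timestamp.
        versions = [(ts, value)
                    for (r, f, q, ts), value in self._cells.items()
                    if (r, f, q) == (row, family, qualifier)]
        if not versions:
            return None
        return max(versions, key=lambda item: item[0])[1]


table = ToyTable()
table.put("row1", "info", "name", 1, b"alice")
table.put("row1", "info", "name", 2, b"alicia")
```

Here `table.get("row1", "info", "name", 1)` returns the older version, while `table.get_latest("row1", "info", "name")` returns the version written at timestamp 2. "Keyword" plays no part in the lookup.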

  6. (Single choice question) Which statement is wrong about the names and functions of each level in the three-tier structure of HBase?
    A. The Zookeeper file records the Region location information of the user data table
    B. The -ROOT- table records the Region location information of the .META. table
    C. The .META. table saves the Region location information of all user data tables in HBase
    D. The Zookeeper file records the location information of the -ROOT- table
    Correct answer: A: The Zookeeper file records the Region location information of the user data table;
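
The three-tier lookup in question 6 can be sketched as follows, assuming the structure the quiz describes (the Zookeeper file points to -ROOT-, -ROOT- points to .META. Regions, and .META. points to user Regions). Server names, the `"table,start_key"` row format, and the helper functions are all hypothetical stand-ins, not real HBase structures.

```python
from bisect import bisect_right

# Tier 1: the Zookeeper file holds only the location of the -ROOT- table.
ZOOKEEPER_FILE = "rs-root"

# Hypothetical servers and the catalog entries they host.
# Each entry is (start key of a Region, server hosting that Region),
# kept sorted by start key.
SERVERS = {
    # Tier 2: -ROOT- (a single Region) locates the Regions of .META.
    "rs-root": {"-ROOT-": [("", "rs-meta")]},
    # Tier 3: .META. locates the Regions of user tables.
    "rs-meta": {".META.": [("users,", "rs-a"), ("users,m", "rs-b")]},
}


def find_server(entries, key):
    """Return the server of the last entry whose start key is <= key."""
    starts = [start for start, _ in entries]
    index = bisect_right(starts, key) - 1
    if index < 0:
        raise KeyError(key)
    return entries[index][1]


def locate(table, row_key):
    """Walk Zookeeper file -> -ROOT- -> .META. to find the Region server."""
    key = f"{table},{row_key}"
    root_server = ZOOKEEPER_FILE
    meta_server = find_server(SERVERS[root_server]["-ROOT-"], key)
    return find_server(SERVERS[meta_server][".META."], key)
```

In this sketch `locate("users", "alice")` resolves to `rs-a` and `locate("users", "zoe")` to `rs-b`. The Zookeeper file never stores user Region locations, which is why option A is the wrong statement.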

  7. (Single choice question) The Master server is mainly responsible for managing tables and Regions. Which of the following descriptions of it is wrong?
    A. After a Region is split or merged, it is responsible for re-adjusting the distribution of Regions.
    B. Migrate the Regions on a failed Region server.
    C. Manage users' operations such as adding, deleting, modifying, and querying tables.
    D. Does not support load balancing between different Region servers.
    Correct answer: D: Does not support load balancing between different Region servers;

  8. (Single choice question) HBase has only one index for row keys. If you want to access rows in the HBase table, which of the following methods is not feasible?
    A. Access through a single row key
    B. Access through a timestamp
    C. Access through a row key range
    D. Full table scan
    Correct answer: B: Access through a timestamp;
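
The three feasible access paths in question 8 all rely on rows being ordered by row key. A minimal sketch, assuming a sorted in-memory key list as a stand-in for that index (`SortedRows` and its methods are hypothetical names, not HBase API):

```python
from bisect import bisect_left


class SortedRows:
    """Toy table whose only index is the sorted list of row keys."""

    def __init__(self, rows):
        self._rows = dict(rows)
        self._keys = sorted(self._rows)

    def get(self, row_key):
        # Access through a single row key.
        return self._rows.get(row_key)

    def scan_range(self, start, stop):
        # Access through a row key range: start inclusive, stop exclusive.
        lo = bisect_left(self._keys, start)
        hi = bisect_left(self._keys, stop)
        return [(key, self._rows[key]) for key in self._keys[lo:hi]]

    def full_scan(self):
        # Full table scan in row key order.
        return [(key, self._rows[key]) for key in self._keys]


rows = SortedRows({"r3": "c", "r1": "a", "r2": "b"})
```

There is no comparable structure keyed by timestamp, so locating rows by timestamp alone is not one of the available access paths.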

  9. (Single choice question) Which of the following statements about Region is wrong?
    A. The same Region will not be split across multiple Region servers
    B. To speed up access, all Regions of the .META. table are kept in memory
    C. A -ROOT- table can have multiple Regions
    D. To speed up addressing, the client caches location information; the cache invalidation problem then needs to be solved
    Correct answer: C: A -ROOT- table can have multiple Regions;

2. Multiple choice questions (9 questions in total, 50.5 points)
  10. (Multiple choice question) Relational databases have been popular for many years, and Hadoop already has HDFS and MapReduce. Why is HBase needed?
    A. Hadoop handles offline batch processing of large-scale data very well, but the high-latency data processing mechanism of the Hadoop MapReduce programming framework leaves it unable to meet the needs of real-time processing of large-scale data.
    B. HDFS is oriented to batch access, not random access.
    C. Traditional general-purpose relational databases cannot cope with the scalability and performance problems caused by dramatic increases in data size.
    D. Traditional relational databases generally need to be shut down for maintenance when the data structure changes; empty columns waste storage space.
    Correct answer: ABCD: Hadoop handles offline batch processing of large-scale data very well, but the high-latency data processing mechanism of the Hadoop MapReduce programming framework leaves it unable to meet the needs of real-time processing of large-scale data; HDFS is oriented to batch access, not random access; traditional general-purpose relational databases cannot cope with the scalability and performance problems caused by dramatic increases in data size; traditional relational databases generally need to be shut down for maintenance when the data structure changes; empty columns waste storage space;

  1. (Multiple choice question) In which of the following aspects do HBase and traditional relational databases mainly differ?
    A. Data type
    B. Data operation
    C. Storage mode
    D. Data maintenance
    Correct answer: ABCD: Data type; Data operation; Storage mode; Data maintenance;

  2. (Multiple choice question) Which of the following are types of HBase access interface?
    A. Native Java API
    B. HBase Shell
    C. Thrift Gateway
    D. REST Gateway
    Correct answer: ABCD: Native Java API; HBase Shell; Thrift Gateway; REST Gateway;

  3. (Multiple choice question) Which of the following descriptions of the data model are correct?
    A. HBase uses tables to organize data. Tables are composed of rows and columns, and the columns are divided into several column families.
    B. Each HBase table consists of several rows, and each row is identified by a row key.
    C. The data in a column family is located by column qualifier (or column).
    D. Each cell saves multiple versions of the same data, and these versions are indexed by timestamp.
    Correct answer: ABCD: HBase uses tables to organize data. Tables are composed of rows and columns, and the columns are divided into several column families; each HBase table consists of several rows, and each row is identified by a row key; the data in a column family is located by column qualifier (or column); each cell saves multiple versions of the same data, and these versions are indexed by timestamp;

  4. (Multiple choice question) What are the three main functional components included in the implementation of HBase?
    A. Library functions: linked into each client
    B. A Master server
    C. Many Region servers
    D. Cheap computer clusters
    Correct answer: ABC: Library functions: linked into each client; a Master server; many Region servers;

  5. (Multiple choice question) In the three-layer structure of HBase, which three layers do the three layers refer to?
    A. Zookeeper file
    B. -ROOT- table
    C. .META. table
    D. Data type
    Correct answer: ABC: Zookeeper file; -ROOT- table; .META. table;

  6. (Multiple choice question) Which of the following software can perform performance monitoring on HBase?
    A. Master-status (built in)
    B. Ganglia
    C. OpenTSDB
    D. Ambari
    Correct answer: ABCD: Master-status (built in); Ganglia; OpenTSDB; Ambari;

  7. (Multiple choice question) Which of the following descriptions about the working principle of the Region server are correct?
    A. Each Region server has its own HLog file
    B. Each flush generates a new StoreFile; too many StoreFiles slow down lookups.
    C. The merge operation consumes resources, so a merge is started only when the number of StoreFiles reaches a threshold.
    D. Store is the core of the Region server.
    Correct answer: ABCD: Each Region server has its own HLog file; each flush generates a new StoreFile, and too many StoreFiles slow down lookups; the merge operation consumes resources, so a merge is started only when the number of StoreFiles reaches a threshold; Store is the core of the Region server;
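
The flush and merge behaviour described in question 7 can be sketched with a toy Store. This is a simplified in-memory model under stated assumptions: `ToyStore`, `flush_size`, and `merge_threshold` are hypothetical names and values, and real StoreFiles are on-disk files rather than dicts.

```python
class ToyStore:
    """Toy Store: an in-memory buffer plus a list of immutable 'StoreFiles'."""

    def __init__(self, flush_size=2, merge_threshold=3):
        self.flush_size = flush_size            # hypothetical buffer limit
        self.merge_threshold = merge_threshold  # hypothetical StoreFile limit
        self.memstore = {}
        self.store_files = []                   # oldest first

    def put(self, key, value):
        self.memstore[key] = value
        if len(self.memstore) >= self.flush_size:
            self.flush()

    def flush(self):
        # Each flush writes the buffer out as one new StoreFile.
        if not self.memstore:
            return
        self.store_files.append(dict(sorted(self.memstore.items())))
        self.memstore = {}
        # Merging is costly, so it only starts once enough files pile up.
        if len(self.store_files) >= self.merge_threshold:
            self.merge()

    def merge(self):
        merged = {}
        for store_file in self.store_files:     # newer files override older
            merged.update(store_file)
        self.store_files = [dict(sorted(merged.items()))]

    def get(self, key):
        # A lookup may have to check the buffer and every StoreFile,
        # which is why many small files slow reads down.
        if key in self.memstore:
            return self.memstore[key]
        for store_file in reversed(self.store_files):
            if key in store_file:
                return store_file[key]
        return None
```

With the defaults above, six distinct puts trigger three flushes, and the third flush triggers a merge that leaves a single StoreFile.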

  8. (Multiple choice question) Which of the following descriptions of the working principle of HLog are correct?
    A. A distributed environment must account for system errors; HBase uses HLog to ensure the system can recover.
    B. The HBase system configures one HLog file for each Region server.
    C. Zookeeper monitors the status of each Region server in real time.
    D. The Master first processes the HLog file left behind by the failed Region server.
    Correct answer: ABCD: A distributed environment must account for system errors, and HBase uses HLog to ensure the system can recover; the HBase system configures one HLog file for each Region server; Zookeeper monitors the status of each Region server in real time; the Master first processes the HLog file left behind by the failed Region server;
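
The recovery role of HLog in question 8 can be sketched as a write-ahead log that is written before the in-memory buffer, then split by Region and replayed after a failure. This is a toy model: `ToyRegionServer`, `split_log_by_region`, and `replay` are hypothetical names, and the real log is a file on HDFS rather than a Python list.

```python
class ToyRegionServer:
    """Toy Region server with one shared log for all of its Regions."""

    def __init__(self):
        self.hlog = []        # one log per Region server
        self.memstores = {}   # region name -> {key: value}, lost on a crash

    def put(self, region, key, value):
        # The update is logged first, then applied to the in-memory buffer.
        self.hlog.append((region, key, value))
        self.memstores.setdefault(region, {})[key] = value


def split_log_by_region(hlog):
    """Group log records by Region, as done with a failed server's log."""
    per_region = {}
    for region, key, value in hlog:
        per_region.setdefault(region, []).append((key, value))
    return per_region


def replay(records):
    """Rebuild one Region's in-memory state by re-applying its records."""
    memstore = {}
    for key, value in records:
        memstore[key] = value
    return memstore


server = ToyRegionServer()
server.put("region-a", "row1", "v1")
server.put("region-b", "row9", "v9")
server.put("region-a", "row1", "v2")
```

Replaying `split_log_by_region(server.hlog)["region-a"]` reproduces the buffered state of `region-a`, including the later overwrite of `row1`, even though the original buffers are assumed lost.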

Origin blog.csdn.net/m0_74459049/article/details/134504643