Beijing University Yinghua Big Data Interview Questions

British China

 

 

1. Multiple-choice questions (one or more correct answers)

1. Which of the following programs is responsible for HDFS data storage ()

a) NameNode b) JobTracker c) DataNode d) SecondaryNameNode e) TaskTracker

2. Which of the following programs is usually started on the same node as the NameNode ()

a) SecondaryNameNode b) DataNode c) TaskTracker d) JobTracker

3. Which of the following is usually the main bottleneck of the cluster ()

a) CPU b) Network c) Disk d) Memory

4. Which is correct about SecondaryNameNode? ()

a) It is a hot backup of the NameNode

b) It has no requirements for memory

c) Its purpose is to help NameNode merge edit logs and reduce NameNode startup time

d) The SecondaryNameNode should be deployed to the same node as the NameNode

5. Regarding the differences between HashMap and Hashtable, which statements are correct ()

a) Both HashMap and Hashtable implement the Map interface

b) HashMap is not synchronized, while Hashtable is synchronized

c) Hashtable uses Enumeration, HashMap uses Iterator

d) Hashtable uses the object's hashCode directly, while HashMap recomputes the hash value and uses a bitwise AND instead of modulo
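
The bucket-index arithmetic mentioned in option d) can be sketched in a few lines. This is a minimal sketch that mirrors the formulas used in OpenJDK 8 and later as far as I know them; the class and method names are illustrative and are not part of the JDK.

```java
final class BucketIndexSketch {
    // Hashtable-style index: clear the sign bit, then take the remainder.
    static int hashtableIndex(Object key, int capacity) {
        return (key.hashCode() & 0x7FFFFFFF) % capacity;
    }

    // HashMap-style index: fold the high 16 bits into the low ones, then mask.
    // The mask replaces modulo only because the capacity is a power of two.
    static int hashMapIndex(Object key, int powerOfTwoCapacity) {
        int h = key.hashCode();
        int spread = h ^ (h >>> 16);
        return spread & (powerOfTwoCapacity - 1);
    }

    public static void main(String[] args) {
        System.out.println(hashtableIndex("spark", 11));
        System.out.println(hashMapIndex("spark", 16));
    }
}
```

For a non-negative hash and a power-of-two capacity, the mask and the remainder give the same bucket; the mask is simply cheaper to compute.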

6. Which of the following statements is correct ()

a) A local inner class cannot have any access modifier (public, private, protected) before the class keyword

b) As long as no parameterless constructor is defined, the JVM will generate a default constructor for the class

c) After upcasting, when the parent class and the subclass have ordinary methods with the same name, the subclass's method is the one that gets called

d) In the singleton pattern the class's constructor is declared private, so code outside the class cannot use the new keyword to create instances
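
Two of the ideas in question 6, a private singleton constructor and method dispatch after upcasting, can be tried out directly. This is a minimal sketch; the names Registry, Animal and Dog are hypothetical and do not come from the original question.

```java
final class Registry {
    private static final Registry INSTANCE = new Registry();

    private Registry() { }              // other classes cannot call new Registry()

    static Registry getInstance() {
        return INSTANCE;
    }
}

class Animal {
    String sound() { return "..."; }
}

class Dog extends Animal {
    @Override
    String sound() { return "woof"; }
}

final class Question6Sketch {
    public static void main(String[] args) {
        // Every caller receives the same instance.
        System.out.println(Registry.getInstance() == Registry.getInstance());

        Animal a = new Dog();           // upcast: static type Animal, runtime type Dog
        System.out.println(a.sound());  // instance methods dispatch on the runtime type
    }
}
```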

7. Which description of abstract classes in Java is correct ()

a) Abstract classes can be instantiated

b) If a method in a class is declared abstract, the class must be an abstract class

c) Every method of an abstract class must be abstract

d) An abstract class must be declared with the abstract keyword
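
The rules probed by question 7 are easy to check with a small class hierarchy. This is a minimal sketch; Shape and Square are hypothetical names chosen for illustration.

```java
abstract class Shape {
    abstract double area();             // no body: concrete subclasses must implement it

    String describe() {                 // an abstract class may also hold concrete methods
        return getClass().getSimpleName() + " with area " + area();
    }
}

final class Square extends Shape {
    private final double side;

    Square(double side) {
        this.side = side;
    }

    @Override
    double area() {
        return side * side;
    }
}

final class AbstractClassSketch {
    public static void main(String[] args) {
        // Shape s = new Shape();       // does not compile: Shape is abstract
        Shape s = new Square(2.0);
        System.out.println(s.describe());
    }
}
```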

8. Which of the following is correct when a client uploads a file ()

a) Data is passed to the DataNode via the NameNode

b) The client divides the file into blocks and uploads them sequentially

c) The client uploads data to only one DataNode, and the NameNode is then responsible for block replication

9. Which of the following statements is correct ()

a) Hadoop is developed in Java, so MapReduce only supports writing in Java language

b) Hadoop supports random reading and writing of data

c) Ganglia can not only monitor but also raise alerts

d) Block Size cannot be modified

10. Which of the following snippets correctly converts a GBK-encoded byte stream to a UTF-8-encoded byte stream, given byte[] src, dst; ()
a) dst=String.fromBytes(src,"GBK").getBytes("UTF-8")
b) dst=new String(src,"GBK").getBytes("UTF-8")
c) dst=new String("GBK",src).getBytes()
d) dst=String.encode(String.decode(src,"GBK"),"UTF-8")
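
Snippets like those in question 10 can be checked against the real JDK API by doing the conversion as decode-then-encode. This is a minimal sketch, assuming the running JVM ships the GBK charset (it is an extended charset, not one of the guaranteed standard ones); the class and method names are illustrative.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class GbkToUtf8Sketch {
    // Decode the GBK bytes into a String, then encode that String as UTF-8.
    static byte[] gbkToUtf8(byte[] src) {
        Charset gbk = Charset.forName("GBK");   // throws if the JVM lacks GBK
        return new String(src, gbk).getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Three Chinese characters, written as escapes so the source file encoding does not matter.
        String text = "\u5927\u6570\u636e";
        byte[] src = text.getBytes(Charset.forName("GBK"));
        byte[] dst = gbkToUtf8(src);
        System.out.println(src.length + " GBK bytes -> " + dst.length + " UTF-8 bytes");
    }
}
```

Passing the charset explicitly on both sides matters: getBytes() without an argument uses the platform default, which may or may not be UTF-8.
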
11. Which of the following statements is correct ()
a) Slave nodes need to store data, so the larger the disk, the better
b) Hadoop's default scheduling policy is FIFO
c) A MapReduce input split is a block
d) Every node in the cluster should be equipped with RAID, so that a single disk failure does not affect the operation of the whole node
12. The following statements about Kafka are correct ()
a) Producer sends events to broker
b) Consumer consumes events from broker
c) Events are separated by topic, and each consumer belongs to a group
d) Consumers in the same group do not consume the same event twice, while the same event is delivered to consumers in every other group
13. Which of the following operations always produces a wide dependency ()
A. map B. flatMap C. reduceByKey D. sample
14. Which of the following ports is not used by one of Spark's own services ()
A. 8080 B. 4040 C. 8090 D. 18080
15. Which of the following is a Spark action ()
a) map b) collect c) filter d) countByKey
16. Which of the following statements is wrong ()
a) A new thread is started by calling the run() method directly
b) CyclicBarrier and CountDownLatch can both be used to make a group of threads wait for other threads
c) To stop a thread manually, you can use a volatile boolean flag to exit the loop in run(), or cancel the task to interrupt the thread
d) The wait and notify methods must be called inside a synchronized block
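
The thread behaviors named in question 16 can be observed with the standard library alone. This is a minimal sketch, not a recommended threading design; the class and method names are illustrative, and the thread name "worker" is an arbitrary choice.

```java
import java.util.concurrent.CountDownLatch;

final class ThreadBasicsSketch {
    // volatile: the write made by the controlling thread becomes visible to the worker.
    private static volatile boolean running;

    // Returns the name of the thread that actually executed the Runnable.
    static String executedOn(boolean useStart) {
        String[] seen = new String[1];
        Thread t = new Thread(() -> seen[0] = Thread.currentThread().getName(), "worker");
        if (useStart) {
            t.start();                  // spawns a new thread named "worker"
            join(t);
        } else {
            t.run();                    // ordinary method call on the caller's thread
        }
        return seen[0];
    }

    // Stops a looping thread cooperatively through the volatile flag.
    static boolean stopWithFlag() {
        running = true;
        CountDownLatch entered = new CountDownLatch(1);
        Thread t = new Thread(() -> {
            entered.countDown();        // tell the caller that run() has begun
            while (running) {
                Thread.yield();
            }
        });
        t.start();
        try {
            entered.await();            // the caller waits for the worker here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        running = false;                // ask the loop to exit
        join(t);
        return !t.isAlive();
    }

    private static void join(Thread t) {
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("start(): ran on " + executedOn(true));
        System.out.println("run():   ran on " + executedOn(false));
        System.out.println("stopped via flag: " + stopWithFlag());
    }
}
```
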
17. What is the difference between storing Hive metadata in Derby and storing it in MySQL ()
A. No difference B. Multiple sessions C. Support for network environments D. Differences between the databases
18. What is Spark's default storage level ()
A MEMORY_ONLY B MEMORY_ONLY_SER
C MEMORY_AND_DISK D MEMORY_AND_DISK_SER
19. What determines the number of tasks in a Stage in Spark ()
A Partition B Job C Stage D TaskScheduler
20. The output of the following code is ()

public class Person {
    private String name = "Person";
    int age = 0;
}

public class Child extends Person {
    public String grade;

    public static void main(String[] args) {
        Person p = new Child();
        System.out.println(p.name);
    }
}


A) Output: Person
B) No output
C) Compile error
D) Run error

2. Short-answer questions

1. What are the keywords for implicit functions in Scala?

2. How is HBase optimized?

3. What is the role of the combine function in Hadoop?
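
The idea behind a combine function, pre-aggregating each mapper's output locally before it is shuffled to the reducers, can be simulated in plain Java. This is a minimal sketch of a word count and deliberately does not use the Hadoop API; the class and method names are hypothetical. In real Hadoop jobs a combiner is registered on the job, for example through Job.setCombinerClass.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class CombinerSketch {
    // Combiner: collapses one mapper's words into partial counts on the mapper side.
    static Map<String, Integer> combine(List<String> mapperWords) {
        Map<String, Integer> partial = new HashMap<>();
        for (String word : mapperWords) {
            partial.merge(word, 1, Integer::sum);
        }
        return partial;
    }

    // Reducer: merges the partial counts received from all mappers.
    static Map<String, Integer> reduce(List<Map<String, Integer>> partials) {
        Map<String, Integer> total = new TreeMap<>();
        for (Map<String, Integer> partial : partials) {
            partial.forEach((word, count) -> total.merge(word, count, Integer::sum));
        }
        return total;
    }

    public static void main(String[] args) {
        List<String> split1 = Arrays.asList("a", "b", "a");
        List<String> split2 = Arrays.asList("b", "a");

        Map<String, Integer> p1 = combine(split1);
        Map<String, Integer> p2 = combine(split2);

        // Without a combiner every word is shuffled as its own record.
        int shuffledWithout = split1.size() + split2.size();
        int shuffledWith = p1.size() + p2.size();
        System.out.println("records shuffled: " + shuffledWithout + " vs " + shuffledWith);
        System.out.println(reduce(Arrays.asList(p1, p2)));
    }
}
```

This only works unchanged because summing is associative and commutative; an operation such as a plain average cannot reuse the reducer logic as its combiner.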

4. How do you kill a job in Hadoop?

5. What is the concept of lineage in Spark?

6. Write the following commands in the HBase shell:

a) Query the HBase table named test for entries whose value = 001

b) Query the HBase table named test for rows whose rowkey starts with user1

Origin blog.csdn.net/msjhw_com/article/details/109044025