Advanced interview questions (part five)

Spark join optimization experience

The difference between Spark and Flink

The difference between Kafka and traditional MQ

1. Architecture model

RabbitMQ follows the AMQP protocol. A RabbitMQ broker consists of exchanges, bindings, and queues, where the exchanges and bindings decide how a message is routed by its routing key. Client producers communicate with the server over a channel within a connection, and consumers obtain messages from a queue for consumption (over a long-lived connection, the queue pushes messages to the consumer, which loops reading data from the input stream). RabbitMQ is broker-centric and has a message acknowledgement mechanism.
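The routing step can be sketched in a few lines of Python. This is only a toy, in-memory model of a direct exchange, not RabbitMQ client code; the class name `DirectExchange` and the queue and routing key names are made up for illustration.

```python
from collections import defaultdict, deque


class DirectExchange:
    """Toy model of a direct exchange: bindings map routing keys to queues."""

    def __init__(self):
        self.bindings = defaultdict(list)  # routing key -> list of queue names
        self.queues = defaultdict(deque)   # queue name -> pending messages

    def bind(self, queue, routing_key):
        # A binding tells the exchange which queue wants which routing key.
        self.bindings[routing_key].append(queue)

    def publish(self, routing_key, message):
        # Deliver a copy to every queue bound with exactly this routing key.
        for queue in self.bindings.get(routing_key, []):
            self.queues[queue].append(message)


exchange = DirectExchange()
exchange.bind("orders_queue", "order.created")
exchange.bind("audit_queue", "order.created")
exchange.publish("order.created", "order #1")
exchange.publish("order.cancelled", "order #2")  # no binding, so nothing is queued
```

The producer only names an exchange and a routing key; which queues receive the message is decided entirely on the broker side, which is what "broker-centric" means here.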

Kafka follows the general MQ structure of producer, broker, and consumer. It is consumer-centric: the consumption information for messages is kept on the consumer client, and the consumer pulls data in batches from the broker according to its consumption position (offset); there is no message acknowledgement mechanism.
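A minimal sketch of this pull model, assuming a single in-memory log. The `Broker` and `Consumer` classes below are hypothetical and are not the Kafka client API; they only show that the read position lives with the consumer and that data is fetched in batches.

```python
class Broker:
    """Toy append-only log; it does not track what any consumer has read."""

    def __init__(self):
        self.log = []

    def append(self, message):
        self.log.append(message)

    def fetch(self, offset, max_messages):
        # Return up to max_messages starting at the requested offset.
        return self.log[offset:offset + max_messages]


class Consumer:
    def __init__(self, broker):
        self.broker = broker
        self.offset = 0  # consumption position kept on the consumer side

    def poll(self, max_messages=3):
        # Pull one batch and advance the local offset by what was received.
        batch = self.broker.fetch(self.offset, max_messages)
        self.offset += len(batch)
        return batch


broker = Broker()
for i in range(5):
    broker.append(f"msg-{i}")

consumer = Consumer(broker)
first = consumer.poll()   # first batch of up to 3 messages
second = consumer.poll()  # the remaining messages
```

Because the broker keeps no per-message delivery state in this model, a consumer can re-read old data simply by resetting its own offset.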

2. Throughput

Kafka has high throughput: it processes messages in batches internally and uses a zero-copy mechanism, and storing and retrieving data are sequential batch operations on local disk with O(1) complexity, so message processing is very efficient.

RabbitMQ is somewhat behind Kafka in terms of throughput, because the two have different starting points. RabbitMQ supports reliable message delivery and transactions, but does not support batch operations; depending on the storage reliability required, messages can be stored in memory or on disk.

3. Availability

RabbitMQ supports mirrored queues: if the master queue fails, a mirror queue takes over.

Kafka brokers support a primary/standby mode.

4. Load balancing

Kafka uses ZooKeeper to manage the brokers and consumers in the cluster, and topics can be registered in ZooKeeper. Through ZooKeeper's coordination mechanism, the producer keeps the broker information for the corresponding topic and can send to the brokers randomly or by round-robin; the producer can also specify a partition based on message semantics, so that a message is sent to a particular partition on a broker.
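The two sending strategies can be illustrated with a short Python sketch. The `PartitionChooser` class is hypothetical, and `zlib.crc32` is used here only as a convenient deterministic hash; it is an assumption for illustration and not the hash function Kafka's own partitioner uses.

```python
import itertools
import zlib


class PartitionChooser:
    """Pick a partition: round-robin without a key, hash-based with a key."""

    def __init__(self, num_partitions):
        self.num_partitions = num_partitions
        self._round_robin = itertools.cycle(range(num_partitions))

    def choose(self, key=None):
        if key is None:
            # No key: spread messages evenly across partitions in turn.
            return next(self._round_robin)
        # Keyed: the same key always maps to the same partition.
        return zlib.crc32(key.encode("utf-8")) % self.num_partitions


chooser = PartitionChooser(num_partitions=3)
unkeyed = [chooser.choose() for _ in range(4)]
keyed = [chooser.choose("user-42") for _ in range(3)]
```

Round-robin balances load across brokers, while key-based selection keeps all messages for one key in one partition so their relative order is preserved.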

Kafka application and understanding

Benefits of using Flume fan-out

Main uses and understanding of Elasticsearch

How long is the system's scheduling period?

       Usually a day or a week, depending on the project's business needs.

Understanding of scheduling design and implementation

The product manager's responsibilities and how they relate to your work

       The product manager raises the requirements,

Understanding of classification and clustering algorithms, and their implementation in the project

Challenges encountered in the project

Explain the steps of the project

 

Hive execution plan

https://www.cppentry.com/bencandy.php?fid=117&id=201834

 

What Hive dynamic partitioning is, and how Hive bucketing is used

       Omitted.

Can the sqoop tool export data with conditions?

       Yes, it can export with conditions.

How to write the sqoop command

sqoop import --connect jdbc:mysql://192.168.1.1:3306/events --username root --password 123456 --table student --hive-import --hive-table student -m 1

Origin www.cnblogs.com/lingboweifu/p/11909785.html