Spark join optimization experience
The difference between Spark and Flink
The difference between Kafka and traditional MQ
1. Architecture model
RabbitMQ follows the AMQP protocol. A RabbitMQ broker is made up of exchanges, bindings, and queues; the exchange and binding together determine the message's routing key. A producer client communicates with the server over a channel on a connection, and a consumer obtains messages from a queue (a long-lived connection: the queue pushes messages to the consumer, which reads data from the input stream in a loop). RabbitMQ is broker-centric and has a message acknowledgment mechanism.
Kafka follows the general MQ structure of producer, broker, and consumer, but is consumer-centric: the consumer client keeps track of its own consumption offset and pulls data from the broker in batches. There is no per-message acknowledgment from consumer to broker; consumers manage their own offsets.
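The contrast above, broker-centric push with acks versus consumer-centric batch pull, can be sketched with a toy model (illustration only; the `Log` and `Consumer` classes below are hypothetical, not a real client API):

```python
# Toy model of Kafka-style pull-based consumption: the "broker" is an
# append-only log, and the consumer (not the broker) tracks its offset
# and pulls records in batches.

class Log:
    def __init__(self):
        self.records = []

    def append(self, record):
        self.records.append(record)

    def read(self, offset, max_records):
        # Sequential batch read starting at the consumer-supplied offset.
        return self.records[offset:offset + max_records]

class Consumer:
    def __init__(self, log):
        self.log = log
        self.offset = 0  # the consumer, not the broker, tracks position

    def poll(self, max_records=2):
        batch = self.log.read(self.offset, max_records)
        self.offset += len(batch)  # "commit" by advancing our own offset
        return batch

log = Log()
for i in range(5):
    log.append(f"msg-{i}")

c = Consumer(log)
first = c.poll()   # first batch: msg-0, msg-1
second = c.poll()  # next batch resumes from the consumer's own offset
```

Because the broker never tracks which consumer has read what, many independent consumer groups can replay the same log at their own pace.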
2. Throughput
Kafka has high throughput: it batches messages internally, uses a zero-copy mechanism, and stores and retrieves data on local disk with sequential batch operations, giving O(1) access per message and highly efficient message processing.
RabbitMQ has lower throughput than Kafka because their design goals differ: RabbitMQ focuses on reliable message delivery and supports transactions, but does not support batch operations; depending on the reliability required, messages can be stored in memory or on disk.
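Why sequential, append-only storage gives O(1) message access can be illustrated with a simplified single-segment log (an assumption-level sketch, not Kafka's actual storage format):

```python
import io

# Simplified single-segment log: messages are appended sequentially,
# and an index maps logical offsets to byte positions, so a lookup is
# one seek plus one short sequential read -- O(1) per message.

segment = io.BytesIO()   # stands in for a segment file on disk
index = {}               # logical offset -> byte position

def append_batch(messages, next_offset):
    for msg in messages:
        index[next_offset] = segment.tell()
        data = msg.encode()
        # Length-prefixed record: 4-byte big-endian size, then payload.
        segment.write(len(data).to_bytes(4, "big") + data)
        next_offset += 1
    return next_offset

def read(offset):
    segment.seek(index[offset])
    size = int.from_bytes(segment.read(4), "big")
    return segment.read(size).decode()

next_off = append_batch(["a", "bb", "ccc"], 0)
next_off = append_batch(["dddd"], next_off)
```

Appending a whole batch in one pass is what amortizes the per-message overhead; random-access reads still cost only one seek thanks to the offset index.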
3. Availability
RabbitMQ supports queue mirroring: if the master queue fails, a mirror queue takes over.
Kafka brokers support a standby (replica) mode: if a leader fails, a replica takes over.
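The failover idea behind queue mirroring can be sketched as a toy model (not RabbitMQ's actual replication protocol; class and method names are hypothetical):

```python
# Toy sketch of queue mirroring: every publish is replicated to the
# mirrors, so when the master fails, a mirror can be promoted without
# losing messages.

class MirroredQueue:
    def __init__(self, mirror_count=1):
        self.master = []
        self.mirrors = [[] for _ in range(mirror_count)]

    def publish(self, msg):
        self.master.append(msg)
        for m in self.mirrors:
            m.append(msg)          # replicate each publish to mirrors

    def fail_master(self):
        # Simulate master failure: promote the first mirror.
        self.master = self.mirrors.pop(0)

    def consume(self):
        return self.master.pop(0) if self.master else None

q = MirroredQueue()
q.publish("m1")
q.publish("m2")
q.fail_master()   # master dies; the mirror takes over with all messages
```

After failover, consumers see the same messages in the same order, which is the availability guarantee mirroring aims for.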
4. Load balancing
Kafka uses ZooKeeper to manage the brokers and consumers in the cluster, and topics can be registered in ZooKeeper. Through ZooKeeper's coordination mechanism, the producer obtains the broker information for a topic and can send messages to brokers randomly or in round-robin fashion; a producer can also choose a partition based on message semantics, so that messages are sent to a specific partition on the broker.
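Producer-side partition selection, hashing keyed messages to a fixed partition and spreading keyless messages round-robin, can be sketched as follows (crc32 stands in here for the real partitioner's hash function; this is an illustrative simplification, not Kafka's actual default partitioner):

```python
import itertools
import zlib

# Sketch of producer-side partition selection: a keyed message always
# hashes to the same partition (preserving per-key ordering), while
# keyless messages are spread round-robin across partitions.

NUM_PARTITIONS = 3
_round_robin = itertools.count()

def choose_partition(key):
    if key is None:
        # No key: balance load by cycling through partitions.
        return next(_round_robin) % NUM_PARTITIONS
    # crc32 is stable across runs, unlike Python's built-in hash().
    return zlib.crc32(key.encode()) % NUM_PARTITIONS

p1 = choose_partition("user-42")
p2 = choose_partition("user-42")   # same key -> same partition
```

Pinning a key to one partition is what lets Kafka guarantee ordering per key while still load-balancing across the cluster.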
Applications and understanding of Kafka
Benefits of Flume fan-out applications
Main applications and understanding of Elasticsearch
How long is the system's scheduling period?
Usually a day, or as long as a week, depending on the project's business needs.
Understanding of scheduler design and implementation
Product manager responsibilities and how you interact with them
The product manager raises the requirements.
Understanding of classification and clustering algorithms, and how they were implemented in the project
Challenges encountered in the project
Explain the steps of the project
Hive execution plan
https://www.cppentry.com/bencandy.php?fid=117&id=201834
What are Hive dynamic partitioning and Hive bucketing, and how are they used?
Dynamic partitioning lets Hive determine partition values from the query data at insert time instead of hard-coding them; it is enabled with set hive.exec.dynamic.partition=true; and set hive.exec.dynamic.partition.mode=nonstrict;. Bucketing (CLUSTERED BY (col) INTO n BUCKETS) hashes a column into a fixed number of files per partition, which supports efficient sampling (TABLESAMPLE) and bucket map joins.
Can the sqoop tool import/export with a filter condition?
Yes, conditions can be specified; for example, sqoop import supports --where, or a free-form --query (which must include $CONDITIONS in its WHERE clause).
How to write a sqoop command
sqoop import --connect jdbc:mysql://192.168.1.1:3306/events --username root --password 123456 --table student --hive-import --hive-table student -m 1