Technical Experts: Why Did We Choose Apache Pulsar to Replace Kafka?


Introduction: Traditional messaging systems have several problems. First, message storage and message serving are usually tightly coupled, which makes adding nodes and operating the cluster inconvenient, especially when multiple replicas are needed for high availability. Second, the message consumption model is fixed, so a company often has to maintain several different messaging systems to cover its different consumption scenarios. Third, enterprise features such as multi-tenancy and multi-data-center replication are not well developed.

Apache Pulsar uses a layered architecture that decouples storage from compute, which gives it good scalability and maintainability. Through its subscription abstraction, Pulsar also provides a unified message consumption model. From the very start of its design, Pulsar focused on needs such as multi-tenancy and multi-data-center replication, so it offers a comprehensive set of enterprise features.

Apache Pulsar has been deployed at scale inside Yahoo since 2015, in nearly ten data centers worldwide. There it reliably serves Yahoo Mail, Finance, Flickr, advertising, NoSQL, and many other internal use cases, with more than 80 tenants and more than 2.3 million topics in total. Zhaopin went live with Pulsar in 2018 to replace its original RabbitMQ deployment. As the message bus for more than 20 internal applications, it handles more than 6 billion messages and 3 TB of data per day. It has reduced hardware, operations, and deployment costs while providing better quality of service and scalability.

Apache Pulsar is a pub/sub messaging platform that uses Apache BookKeeper for persistence. It provides the following features:

Persistence: uses BookKeeper as a flexible storage layer.

Ordering: each message has a globally unique ID, which makes message redelivery simple.

Delivery guarantees: at-least-once, at-most-once, and effectively-once.

High throughput: up to 1.8 M messages/sec on a single partition.

Low latency: 99% of publish latencies are under 5 ms.

Unified messaging model: supports both consumption models, streaming and queuing.

Multi-tenancy: a single cluster can serve multiple tenants and use cases.

Cross-region replication: available natively.

High availability, high scalability, ease of operation and maintenance

Architecture Overview


Pulsar uses a layered architecture that separates the storage layer from the brokers. This architecture gives Pulsar the following benefits:

1. Brokers scale independently. A broker handles the messages sent by producers and dispatches them to consumers. A global ZooKeeper cluster coordinates cross-cluster tasks such as cross-region replication. Messages are stored in BookKeeper, and each cluster also needs its own ZooKeeper cluster to store metadata.

2. Storage (bookies) scales independently.

3. ZooKeeper, brokers, and bookies are easier to containerize.

4. ZooKeeper provides the configuration and state store for the cluster.


Highlights are as follows:

1. Load balancer: Pulsar has a built-in load balancer that distributes load across all brokers in the cluster.

2. Service discovery: Pulsar has built-in service discovery, which lets clients determine where and how to connect to a broker.

3. Global replicator: replicates data among the N brokers configured for the same namespace.

4. Global ZooKeeper: a global ZooKeeper cluster is used for cross-region replication.

Cross-Region Replication

Pulsar provides cross-region replication out of the box. Global clusters can be configured at the namespace level, so data can be replicated among any number of clusters (an n-way mesh). For example, if data center C has no consumers, consumers in data centers A and B still receive the messages according to their subscription mode.
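The A/B/C scenario can be pictured with a small toy model. This is only a sketch and not Pulsar code: the `Cluster` class and the `publish` function are hypothetical names, and replication is simulated with in-memory lists rather than real replicators.

```python
class Cluster:
    """Toy stand-in for one Pulsar cluster (data center); not a real Pulsar API."""

    def __init__(self, name):
        self.name = name
        self.topic_log = []   # messages stored for one topic in this cluster
        self.consumers = []   # callbacks registered by local consumers

    def deliver(self, message):
        self.topic_log.append(message)
        for consume in self.consumers:
            consume(message)


def publish(origin, replication_clusters, message):
    """Simulate an n-way mesh: every configured cluster ends up with a copy."""
    if origin not in replication_clusters:
        raise ValueError("origin must be one of the replication clusters")
    for cluster in replication_clusters:
        cluster.deliver(message)


a, b, c = Cluster("A"), Cluster("B"), Cluster("C")
received_a, received_b = [], []
a.consumers.append(received_a.append)
b.consumers.append(received_b.append)

# The message is produced in C, which has no consumers of its own.
publish(c, [a, b, c], "order-1")
```

After the call, the consumers in A and B have each seen the message even though it was produced in C.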


Multi-Tenancy

Multi-tenancy, which isolates each tenant's data at the storage level, helps build a single Pulsar cluster for the whole enterprise. This built-in feature greatly reduces an organization's infrastructure build-out and operating costs.
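One place where multi-tenancy is visible is the topic name, which in Pulsar's current naming scheme has the form `persistent://tenant/namespace/topic`. The sketch below only illustrates that structure; `TopicName`, `parse_topic`, and `same_tenant` are hypothetical helpers, the example names are made up, and the older four-part form that includes a cluster name is deliberately not handled.

```python
from typing import NamedTuple


class TopicName(NamedTuple):
    domain: str      # "persistent" or "non-persistent"
    tenant: str
    namespace: str
    topic: str


def parse_topic(full_name: str) -> TopicName:
    """Split a three-part topic name into domain, tenant, namespace, and topic."""
    domain, sep, rest = full_name.partition("://")
    if not sep or domain not in ("persistent", "non-persistent"):
        raise ValueError(f"unexpected topic domain in {full_name!r}")
    parts = rest.split("/")
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected tenant/namespace/topic in {full_name!r}")
    return TopicName(domain, *parts)


def same_tenant(a: str, b: str) -> bool:
    """Two topics belong to the same tenant if their tenant parts match."""
    return parse_topic(a).tenant == parse_topic(b).tenant
```

For instance, `parse_topic("persistent://finance/payments/invoices").tenant` yields `"finance"`, so topics from different teams can share one cluster while staying in separate tenants and namespaces.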

Zero Rebalancing Time

Pulsar's layered architecture and stateless brokers make zero rebalancing time possible. When a new broker is added to the cluster, it is available immediately; no data needs to be rebalanced across the cluster.

From the bookies' point of view: when a new bookie is added to the cluster, it can accept writes immediately because of the underlying distributed-log architecture (read/write isolation). Segment data is rebalanced in the background according to the replication configuration, without any impact on the cluster.
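The reason a new bookie is usable at once can be shown with a toy model of segment-based storage. This is a sketch under stated assumptions, not BookKeeper's real placement policy: `ToyLedgerStore` is a hypothetical class, and picking the least-loaded bookies is an assumption made only to keep the example deterministic.

```python
import itertools


class ToyLedgerStore:
    """Toy model of segment-based storage; not BookKeeper's placement logic."""

    def __init__(self, bookies, replicas=2):
        self.bookies = list(bookies)
        self.replicas = replicas
        self.segments = {}                 # segment id -> bookies holding it
        self._next_id = itertools.count()

    def add_bookie(self, name):
        # A new node only joins the candidate list for future segments.
        # Segments that were already written stay where they are.
        self.bookies.append(name)

    def open_segment(self):
        # Count how many segments each bookie holds, then pick the least loaded.
        load = {bookie: 0 for bookie in self.bookies}
        for holders in self.segments.values():
            for bookie in holders:
                load[bookie] += 1
        ranked = sorted(self.bookies, key=lambda bookie: (load[bookie], bookie))
        chosen = tuple(ranked[: self.replicas])
        segment_id = next(self._next_id)
        self.segments[segment_id] = chosen
        return segment_id, chosen


store = ToyLedgerStore(["bookie-1", "bookie-2"])
first_id, first_holders = store.open_segment()
store.add_bookie("bookie-3")
second_id, second_holders = store.open_segment()
```

In this model the second segment lands on `bookie-3` right after it joins, while the first segment is left untouched, which is the essence of adding capacity without an up-front data rebalance.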

Unified Queuing and Streaming Model

Pulsar supports both queuing and streaming semantics with a single model, implemented through subscription modes. A consumer subscribes to a topic using one of the following subscription modes:

1. Exclusive: supports streaming semantics

2. Failover: supports streaming semantics

3. Shared: supports queuing semantics
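The difference between the three modes can be sketched with a toy dispatcher. This is not the Pulsar client API: `dispatch` is a hypothetical function, failover is reduced to "the first consumer is active", and round-robin is only one possible way to spread messages in shared mode.

```python
from collections import defaultdict
from itertools import cycle


def dispatch(messages, consumers, mode):
    """Return {consumer: [messages]} for a toy version of each subscription mode."""
    received = defaultdict(list)
    if mode in ("exclusive", "failover"):
        if mode == "exclusive" and len(consumers) > 1:
            raise ValueError("an exclusive subscription allows only one consumer")
        # Streaming semantics: one active consumer sees every message in order.
        # In failover mode the remaining consumers are standbys.
        received[consumers[0]].extend(messages)
    elif mode == "shared":
        # Queuing semantics: messages are spread across all consumers.
        for consumer, message in zip(cycle(consumers), messages):
            received[consumer].append(message)
    else:
        raise ValueError(f"unknown mode: {mode}")
    return dict(received)
```

With two consumers, `dispatch([1, 2, 3, 4], ["c1", "c2"], "shared")` splits the messages between them, whereas `"failover"` delivers all four to `c1` and keeps `c2` in reserve.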


Functions

Functions are native listeners that can run inside or outside Pulsar. In terms of usage, functions can be used for content-based routing, which helps enterprise applications route the messages they are interested in.
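Content-based routing comes down to choosing a destination topic from the message body. The sketch below shows only that decision as a plain Python function; the routing table, the topic names, and the `type` field are all assumptions made for illustration, and the code does not use the Pulsar Functions SDK.

```python
import json

# Hypothetical routing table: value of the message's "type" field -> topic.
ROUTES = {
    "order": "persistent://shop/events/orders",
    "payment": "persistent://shop/events/payments",
}
DEFAULT_TOPIC = "persistent://shop/events/unrouted"


def route(payload: str) -> str:
    """Pick a destination topic from the content of a JSON message."""
    try:
        message = json.loads(payload)
    except json.JSONDecodeError:
        return DEFAULT_TOPIC          # not JSON at all
    kind = message.get("type") if isinstance(message, dict) else None
    if not isinstance(kind, str):
        return DEFAULT_TOPIC          # missing or malformed "type" field
    return ROUTES.get(kind, DEFAULT_TOPIC)
```

A real function would then publish the message to the chosen topic; here the topic is simply returned so the routing rule can be read and tested on its own.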

Proxy

When brokers are deployed in the cloud or on Kubernetes, a proxy is needed to expose them to the outside world. The proxy itself can provide authentication and authorization. It passes the authorization token on to the broker, which verifies the permissions on the namespace.

Conclusion

Apache Pulsar is a pub/sub messaging system built on a layered architecture, with features such as cross-region replication, multi-tenancy, and zero rebalancing time.

Original article: https://medium.com/@pckeyan/apache-pulsar-gentle-introduction-465ca6da0e18

Author: Karthikeyan Palanivelu, translated by radius


After reading this article, are you drawn to Pulsar? It does make up for some shortcomings of competing products, for example in cross-region replication, multi-tenancy, scalability, and read/write isolation. If you want to learn more about Apache Pulsar, come to GIAC, the Global Internet Architecture Conference, co-hosted by msup and the High Availability Architecture community on the 21st to 23rd of this month!

Zhai Jia, a PMC member and committer of Apache Pulsar and Apache BookKeeper, will attend GIAC Shenzhen as a lecturer in the middleware track and give the talk "The Next-Generation Distributed Messaging System: Apache Pulsar". By attending GIAC 2019 Shenzhen, you can keep up with industry trends and meet industry experts in person.


For this session, the organizing committee has also invited 105 speakers from Google, Microsoft, Oracle, eBay, Baidu, Alibaba, Tencent, SenseTime, TuSimple, ByteDance, Sina, Meituan-Dianping, and other leading Internet companies to share their experience, the problems they encountered, and their solutions. Only a few seats remain, so register now!




Reposted from: https://juejin.im/post/5d05c6a96fb9a07eaf2b8fa0

Origin blog.csdn.net/weixin_34320724/article/details/93170294