Notification system optimization

5 Process for collecting contact information

In order to send notifications, various information such as mobile device tokens, email, phone and third-party channel information need to be collected.

Contact information is stored in a simplified schema: a single DynamoDB (NoSQL) table, the contacts table, with email, phone, device token, and external channel attributes.

device_tokens should be stored in JSON format. Example:

[
  {
    "deviceToken": "[device token UUID]",
    "platform": "apns"
  },
  {
    "deviceToken": "[device token UUID]",
    "platform": "fcm"
  }
]

Example of the external_channels field:

[
  {
    "platform": "slack",
    "url": "[unique URL of the channel]",
    "status": true
  },
  {
    "platform": "another-service",
    "url": "...",
    "status": false
  }
]

Users can have multiple devices and third-party channels, which means push notifications can be sent to all devices of the user.
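As a rough illustration of how the two JSON fields above could drive a fan-out, the sketch below groups a user's device tokens by platform and keeps only the enabled third-party channels. The function names and sample values are hypothetical, and it assumes that "status": true means the channel is enabled.

```python
import json
from collections import defaultdict


def group_tokens_by_platform(device_tokens_json: str) -> dict:
    """Group a user's device tokens by push platform (e.g. "apns", "fcm")."""
    grouped = defaultdict(list)
    for entry in json.loads(device_tokens_json):
        grouped[entry["platform"]].append(entry["deviceToken"])
    return dict(grouped)


def active_channels(external_channels_json: str) -> list:
    """Keep only channels whose status flag is true (assumed to mean enabled)."""
    return [c for c in json.loads(external_channels_json) if c.get("status")]


# Hypothetical sample data in the same shape as the examples above.
device_tokens = json.dumps([
    {"deviceToken": "token-a", "platform": "apns"},
    {"deviceToken": "token-b", "platform": "fcm"},
    {"deviceToken": "token-c", "platform": "fcm"},
])
external_channels = json.dumps([
    {"platform": "slack", "url": "https://example.invalid/hook", "status": True},
    {"platform": "another-service", "url": "...", "status": False},
])

tokens = group_tokens_by_platform(device_tokens)
channels = [c["platform"] for c in active_channels(external_channels)]
print(tokens, channels)
```

Grouping by platform lets one push request per provider cover every device the user owns.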

6 Notification sending and receiving process

Initial design of notification system:

[Figure: initial design of the notification system]

The components in the diagram, from left to right:

External Producer 1~N: the different services that want to send notifications through the API provided by the notification system. For example, a billing service sends text messages to remind customers that a payment is due, or a shopping site sends delivery updates to its customers.

API Gateway: exposes an API to the producers and routes each request to the notification service (Lambda).

Notification service: the backend service, with the following responsibilities:

  • Perform basic validation of emails, phone numbers, device tokens, and so on.

  • Query the database to obtain the data required to generate the notification event.

  • Push notification data to the event bus for parallel processing.
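The basic validation step might look like the following sketch. The regular expressions are deliberately loose, the accepted platforms ("apns", "fcm") are taken from the device_tokens example, and the function and field names are assumptions rather than the author's implementation.

```python
import re

# Loose checks only; real validation rules are an assumption here.
EMAIL_RE = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")
PHONE_RE = re.compile(r"\+[1-9]\d{7,14}")  # E.164-style: "+" then 8 to 15 digits
KNOWN_PLATFORMS = {"apns", "fcm"}


def validate_contact(contact: dict) -> list:
    """Return a list of validation errors; an empty list means the contact is valid."""
    errors = []
    email = contact.get("email")
    if email is not None and not EMAIL_RE.fullmatch(email):
        errors.append("invalid email")
    phone = contact.get("phone")
    if phone is not None and not PHONE_RE.fullmatch(phone):
        errors.append("invalid phone")
    for entry in contact.get("device_tokens", []):
        if not entry.get("deviceToken") or entry.get("platform") not in KNOWN_PLATFORMS:
            errors.append("invalid device token entry")
    return errors


good = validate_contact({
    "email": "a@example.com",
    "phone": "+14155550100",
    "device_tokens": [{"deviceToken": "t", "platform": "apns"}],
})
bad = validate_contact({"email": "not-an-email", "phone": "12345"})
print(good, bad)
```

Rejecting malformed contacts at this stage keeps bad data out of the event bus and the downstream queues.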

Contact Database: a DynamoDB table that stores data about users, contact information, settings, etc.

EventBridge: the AWS service used as the event bus. You also need to define event rules to route events to the correct queue.

The following is an example of a notification event. Each detail-type targets one notification type, so the rule for each SQS queue can filter events by matching on this attribute.

{
  "id": "<required::uuid>",
  "source": "payment_request_event",
  "detail-type": "payment_notification_sms",
  "resources": ["payments"],
  "detail": {...},
  "time": "<required>",
  "region": "<required>",
  "account": "<required>"
}
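To make the routing idea concrete, here is a small local model of it. It builds an entry in the shape accepted by the EventBridge PutEvents API (Source, DetailType, Detail, EventBusName) and matches events against exact-match patterns, one per queue. The bus name, queue names, and the second detail-type are hypothetical, and the matcher covers only exact matching, not the full EventBridge pattern syntax.

```python
import json


def build_entry(source: str, detail_type: str, detail: dict,
                bus_name: str = "notification-bus") -> dict:
    """Build one PutEvents entry; Detail must be a JSON string."""
    return {
        "Source": source,
        "DetailType": detail_type,
        "Detail": json.dumps(detail),
        "EventBusName": bus_name,  # hypothetical bus name
    }


# Hypothetical rule patterns: each queue lists the detail-type values it accepts.
QUEUE_PATTERNS = {
    "sms-queue": {"detail-type": ["payment_notification_sms"]},
    "email-queue": {"detail-type": ["payment_notification_email"]},
}


def matching_queues(event: dict) -> list:
    """Simplified matcher: every field in a pattern must list the event's value."""
    return [
        queue
        for queue, pattern in QUEUE_PATTERNS.items()
        if all(event.get(field) in allowed for field, allowed in pattern.items())
    ]


entry = build_entry("payment_request_event", "payment_notification_sms", {"amount": 20})
# With boto3 the entry would be sent via:
#   boto3.client("events").put_events(Entries=[entry])
event = {"source": entry["Source"], "detail-type": entry["DetailType"]}
routed = matching_queues(event)
print(routed)
```

In the event itself detail-type is a single string, while a rule pattern lists the accepted values in an array.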

Message Queues: used to remove dependencies between components. SQS queues act as buffers when a large number of notifications need to be sent. Each notification event type is assigned to a separate message queue, so an outage in one sending service does not affect the other notification types.

Workers: a set of Lambda functions that poll notification events from the SQS queues and send them to the corresponding delivery service.

SNS or third-party service: these services are responsible for delivering notifications to consumers. When integrating with third-party services, we need to pay attention to extensibility and high availability. A good example of extensibility is a flexible system in which third-party services can easily be switched on or off. High availability matters because a third-party service may become unavailable, in which case we should be able to switch to another service with minimal impact on the business.
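One way to switch between third-party services is an ordered list of providers that are tried in turn. The sketch below illustrates that idea only; the provider functions and the DeliveryError type are hypothetical stand-ins for real SDK calls.

```python
class DeliveryError(Exception):
    """Raised by a provider when it cannot deliver a notification."""


def send_with_failover(message: str, providers: list) -> str:
    """Try each (name, send) provider in order; return the name of the one that succeeded."""
    last_error = None
    for name, send in providers:
        try:
            send(message)
            return name
        except DeliveryError as exc:
            last_error = exc  # remember the failure and fall through to the next provider
    raise DeliveryError(f"all providers failed: {last_error}")


def primary_sms(message: str) -> None:
    # Hypothetical provider that is currently unavailable.
    raise DeliveryError("primary unavailable")


sent = []


def backup_sms(message: str) -> None:
    # Hypothetical fallback provider.
    sent.append(message)


used = send_with_failover("Payment due", [("primary", primary_sms), ("backup", backup_sms)])
print(used, sent)
```

Because the provider list is plain data, a service can be switched on or off by editing the list rather than the worker code.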

7 Optimization

The high-level design covered the three main parts of the notification system: the different types of notifications, the process for collecting contact information, and the notification sending/receiving process. The key areas for optimization are:

  • Event and push notification security

  • Notification templates and settings

  • Reliability and resiliency

  • Retry mechanism

  • Rate limiting

  • Monitor notifications and event tracking in queues

Event and push notification security

  • Where sensitive data is stored, we should enable DynamoDB data protection such as encryption at rest, integrate AWS Key Management Service (AWS KMS) to manage the keys used to encrypt the tables, and use IAM roles to control access to DynamoDB.

  • Enforce the principle of least privilege when accessing resources.

  • Protect EventBridge data in transit by using SSL/TLS to communicate with AWS resources. TLS 1.3 is recommended.

  • For iOS and Android apps, appKey and appSecret are used to secure the push notification APIs. Only authenticated or verified clients are allowed to send push notifications through the API. These credentials should be stored encrypted in AWS Secrets Manager or Parameter Store.
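The article does not say how appKey and appSecret are checked. One common scheme, shown here purely as an assumption, is for the client to sign each request body with HMAC-SHA256 using its appSecret and for the server to verify the signature. The key and secret values are placeholders; in practice the secret would be loaded from the secret store rather than a dictionary.

```python
import hashlib
import hmac


def sign(app_secret: str, body: bytes) -> str:
    """Compute the hex HMAC-SHA256 signature of a request body."""
    return hmac.new(app_secret.encode(), body, hashlib.sha256).hexdigest()


def is_authentic(app_key: str, signature: str, body: bytes, secrets: dict) -> bool:
    """Verify that the signature matches the secret registered for app_key."""
    secret = secrets.get(app_key)
    if secret is None:
        return False  # unknown client
    # compare_digest avoids leaking information through comparison timing
    return hmac.compare_digest(sign(secret, body), signature)


# Placeholder credentials for illustration only.
SECRETS = {"demo-key": "demo-secret"}
body = b'{"title": "Payment due"}'
signature = sign("demo-secret", body)

print(is_authentic("demo-key", signature, body, SECRETS))
```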

Notification templates and settings

  • We should create a notification template for each notification type, since notifications of the same type follow a similar format. A template can be reused and avoids building each notification's content from scratch.

  • A notification template is pre-formatted notification content that is customized through parameters, tracking links, etc. to create unique notifications. We can store these templates in S3 buckets under defined prefixes.

  • To provide users with fine-grained control over notification settings, we can store them in a separate notification settings table. Before sending any notification to the user, we first check if the user is willing to receive this type of notification.
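A minimal sketch of the two ideas above, templates with parameters and a settings check before sending, could look like this. The in-memory dictionaries stand in for the S3 bucket and the notification settings table, and every identifier and sample value is hypothetical.

```python
from string import Template

# Stand-in for templates stored in S3 (hypothetical template id and wording).
TEMPLATES = {
    "payment_due": Template(
        "Hi $name, your payment of $amount is due on $due_date. $tracking_link"
    ),
}

# Stand-in for the notification settings table: (user_id, channel) -> opted in?
SETTINGS = {("user-1", "sms"): True, ("user-2", "sms"): False}


def build_notification(user_id: str, channel: str, template_id: str, params: dict):
    """Return the rendered message, or None if the user has not opted in."""
    if not SETTINGS.get((user_id, channel), False):
        return None
    # substitute() raises KeyError if a required parameter is missing
    return TEMPLATES[template_id].substitute(params)


params = {
    "name": "Ana",
    "amount": "20 USD",
    "due_date": "2024-01-31",
    "tracking_link": "https://example.invalid/t/1",
}
message = build_notification("user-1", "sms", "payment_due", params)
skipped = build_notification("user-2", "sms", "payment_due", params)
print(message, skipped)
```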

Reliability and resiliency

  • Prevent data loss: one of the most important non-functional requirements of a notification system is that data must not be lost. Notifications may be delayed or reordered, but never lost. To meet this requirement, the notification system persists notification data in a separate log table and implements a retry mechanism.

  • Can a notification be received exactly once? No. Notifications are delivered exactly once most of the time, subject to the third-party service provider's SLA, but the distributed nature of the system can still produce duplicates. To reduce duplication, we introduce a deduplication mechanism and handle every failure case carefully.

  • A simplified version of the logic: when a notification event arrives, we check its eventId to see whether it has been delivered before. If it was already delivered successfully, the event is discarded. Otherwise, we send the notification.

  • Resilient infrastructure: we should deploy across multiple Availability Zones, so that applications and databases can fail over automatically between Availability Zones without disruption. Availability Zones are more highly available, fault tolerant, and scalable than traditional single or multi-data center infrastructure.
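The eventId check described in the deduplication bullet can be sketched as follows. An in-memory set stands in for whatever store records delivered events (for example the log table), and the class and method names are hypothetical.

```python
class Deduplicator:
    """Tracks delivered event ids so a redelivered event is sent only once."""

    def __init__(self):
        self._delivered = set()  # stand-in for a persistent store

    def process(self, event_id: str, send) -> bool:
        """Send the event unless it was already delivered; return True if sent."""
        if event_id in self._delivered:
            return False  # duplicate: discard
        send(event_id)
        # Mark as delivered only after send() succeeds, so failures can be retried.
        self._delivered.add(event_id)
        return True


sent = []
dedup = Deduplicator()
first = dedup.process("evt-1", sent.append)
second = dedup.process("evt-1", sent.append)
print(first, second, sent)
```

Marking the event only after a successful send favors at-least-once delivery: a crash between sending and marking can still produce a duplicate, which is why duplicates can be reduced but not ruled out.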

Retry mechanism

  • When SNS or the third-party service fails to send a notification, the notification is added to the dead letter queue for retry. If the problem persists, an alert is sent to the responsible developers.
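The sketch below shows one possible shape for this: a worker retries a failed send a few times with exponential backoff and moves the notification to a dead letter queue once the attempts are exhausted. The backoff and the parameter values are assumptions added for illustration; with SQS, the hand-off to the dead letter queue is normally configured through the queue's redrive policy rather than written by hand.

```python
import time


def deliver_with_retry(notification, send, dead_letter_queue,
                       max_attempts=3, base_delay=0.5, sleep=time.sleep) -> bool:
    """Try to send; back off between attempts; dead-letter after the last failure."""
    for attempt in range(1, max_attempts + 1):
        try:
            send(notification)
            return True
        except Exception:
            if attempt < max_attempts:
                # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
                sleep(base_delay * 2 ** (attempt - 1))
    dead_letter_queue.append(notification)  # an alert could be raised from here
    return False


def always_fails(notification):
    # Hypothetical provider call that keeps failing.
    raise RuntimeError("provider unavailable")


delays = []        # records the requested delays instead of really sleeping
dead_letters = []  # list standing in for the dead letter queue
ok = deliver_with_retry("sms-1", always_fails, dead_letters, sleep=delays.append)
print(ok, dead_letters, delays)
```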

Rate limiting

  • We should send notifications considerately. To avoid overwhelming users with too many notifications, we can use SQS as a buffer and limit the number of notifications a user can receive within a given period.
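A per-user limit of this kind can be sketched with a sliding window. The class name, the limit, and the window length are hypothetical, and the in-memory state would need a shared store if several workers enforce the same limit.

```python
from collections import defaultdict, deque


class UserRateLimiter:
    """Allow at most `limit` notifications per user within `window_seconds`."""

    def __init__(self, limit: int, window_seconds: float):
        self.limit = limit
        self.window = window_seconds
        self._sent = defaultdict(deque)  # user_id -> timestamps of recent sends

    def allow(self, user_id: str, now: float) -> bool:
        timestamps = self._sent[user_id]
        # Drop timestamps that have fallen out of the window.
        while timestamps and now - timestamps[0] >= self.window:
            timestamps.popleft()
        if len(timestamps) >= self.limit:
            return False
        timestamps.append(now)
        return True


limiter = UserRateLimiter(limit=2, window_seconds=60)
results = [limiter.allow("user-1", t) for t in (0, 10, 20, 61)]
print(results)
```

A notification that is not allowed can stay in the queue and be retried later instead of being dropped.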

Monitor notifications and event tracking in queues

  • We should use AWS CloudWatch metrics to monitor the notification system. Key metrics are the total number of events in EventBridge and the total number of queued notifications. If both are large, the workers are not processing notification events fast enough, which means we should scale out and add more workers.

  • Event tracking: custom metrics such as open rate, click-through rate, and engagement are important for understanding customer behavior. We should assign a status to each event: Created → Pending → Sent → Opened → Clicked, or Error, or Unsubscribed. By integrating event status into the notification system, we can track notification events.
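The event statuses can be modeled as a small state machine. The status names come from the list above, but the transition table (for instance, which states may move to Error or Unsubscribed) and the open-rate formula are assumptions made for this sketch.

```python
from enum import Enum


class Status(Enum):
    CREATED = "Created"
    PENDING = "Pending"
    SENT = "Sent"
    OPENED = "Opened"
    CLICKED = "Clicked"
    ERROR = "Error"
    UNSUBSCRIBED = "Unsubscribed"


# Assumed transition table; the article only gives the main path.
ALLOWED = {
    Status.CREATED: {Status.PENDING, Status.ERROR},
    Status.PENDING: {Status.SENT, Status.ERROR},
    Status.SENT: {Status.OPENED, Status.UNSUBSCRIBED, Status.ERROR},
    Status.OPENED: {Status.CLICKED, Status.UNSUBSCRIBED},
    Status.CLICKED: {Status.UNSUBSCRIBED},
    Status.ERROR: set(),
    Status.UNSUBSCRIBED: set(),
}


def advance(current: Status, new: Status) -> Status:
    """Move an event to a new status, rejecting transitions not in the table."""
    if new not in ALLOWED[current]:
        raise ValueError(f"illegal transition: {current.value} -> {new.value}")
    return new


def open_rate(final_statuses: list) -> float:
    """Share of delivered notifications that were opened (or opened and clicked)."""
    delivered = [s for s in final_statuses if s in {Status.SENT, Status.OPENED, Status.CLICKED}]
    opened = [s for s in delivered if s in {Status.OPENED, Status.CLICKED}]
    return len(opened) / len(delivered) if delivered else 0.0


rate = open_rate([Status.SENT, Status.OPENED, Status.CLICKED, Status.OPENED, Status.ERROR])
print(rate)
```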

Updated high-level architecture

[Figure: optimized notification system on AWS]

8 Conclusion

This article highlights how indispensable notifications are for keeping us informed of critical information. It presents a blueprint for a scalable, highly available, and reliable notification system that supports a variety of notification types, including mobile push notifications, SMS, email, and third-party app notifications.

To achieve these goals, I chose an event-driven architecture that uses EventBridge and SQS queues to decouple the system components.

The design makes extensive use of AWS services and adopts a serverless approach, a choice that not only ensures efficiency but also minimizes pricing and operating costs.

The design follows the principles of a twelve-factor app: treating backing services as attached resources, storing configuration in the environment, and treating logs as event streams, among other considerations.
