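AWS Solutions Architect Associate Certification Training | Decoupling applications: SQS, SNS, Kinesis, Active MQ

Video source: Bilibili, "AWS Certified Solutions Architect Associate SAA-C03"

I am organizing the instructor's course content and lab notes as I study, and sharing them here. If anything infringes, it will be removed on request. Thanks for your support!

Index post: AWS Solutions Architect Associate Certification Training | Summary (blog of 热爱编程的通信人 on CSDN)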

Introduction to Messaging

Section Introduction

When we start deploying multiple applications, they will inevitably need to communicate with one another
There are two patterns of application communication: synchronous (application to application) and asynchronous, or event-based (application to queue to application)

Synchronous communication between applications can be problematic if there are sudden spikes of traffic
What if you need to suddenly encode 1000 videos but usually it's 10?
In that case, it's better to decouple your applications,
using SQS: queue model
using SNS: pub/sub model
using Kinesis: real-time streaming model
These services can scale independently from our application!

Amazon SQS - Standard Queues Overview

Amazon SQS - What's a queue?

Amazon SQS - Standard Queue

Oldest offering (over 10 years old)
Fully managed service, used to decouple applications
Attributes:
Unlimited throughput, unlimited number of messages in queue
Default retention of messages: 4 days, maximum of 14 days
Low latency (<10 ms on publish and receive)
Limitation of 256KB per message sent
Can have duplicate messages (at least once delivery, occasionally)
Can have out of order messages (best effort ordering)

SQS - Producing Messages

Messages are produced to SQS using the SDK (SendMessage API)
The message is persisted in SQS until a consumer deletes it
Message retention: default 4 days, up to 14 days
Example: send an order to be processed
Order id
Customer id
Any attributes you want
SQS standard: unlimited throughput
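
As a minimal sketch of the order example above, the snippet builds the keyword arguments for a SendMessage call. The queue URL, the order fields and the helper name `build_send_message_request` are hypothetical, and the size check covers only the body, which is a simplification of the 256 KB limit quoted in these notes. The actual network call is left as a comment because it needs boto3 and AWS credentials.

```python
import json

MAX_SQS_MESSAGE_BYTES = 256 * 1024  # the per-message limit quoted in these notes


def build_send_message_request(queue_url, order_id, customer_id, **attributes):
    """Build keyword arguments for an SQS SendMessage call (illustrative helper)."""
    body = json.dumps({"order_id": order_id, "customer_id": customer_id})
    # Simplification: only the body is measured here, not the attributes.
    if len(body.encode("utf-8")) > MAX_SQS_MESSAGE_BYTES:
        raise ValueError("message body exceeds the 256 KB limit")
    return {
        "QueueUrl": queue_url,
        "MessageBody": body,
        # Optional metadata travels as typed message attributes.
        "MessageAttributes": {
            name: {"DataType": "String", "StringValue": str(value)}
            for name, value in attributes.items()
        },
    }


request = build_send_message_request(
    "https://sqs.us-east-1.amazonaws.com/123456789012/orders",  # hypothetical URL
    order_id="o-1001",
    customer_id="c-42",
    priority="high",
)
# With boto3 installed and credentials configured, this could be sent with:
#   boto3.client("sqs").send_message(**request)
```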

SQS - Consuming Messages

Consumers (running on EC2 instances, servers, or AWS Lambda) ...
Poll SQS for messages (receive up to 10 messages at a time)
Process the messages (example: insert the message into an RDS database)
Delete the messages using the DeleteMessage API

SQS - Multiple EC2 Instances Consumers

Consumers receive and process messages in parallel
At least once delivery
Best-effort message ordering
Consumers delete messages after processing them
We can scale consumers horizontally to improve throughput of processing

SQS with Auto Scaling Group (ASG)

SQS to decouple between application tiers

Amazon SQS - Security

Encryption:
In-flight encryption using HTTPS API
At-rest encryption using KMS keys
Client-side encryption if the client wants to perform encryption/decryption itself
Access Controls: IAM policies to regulate access to the SQS API
SQS Access Policies (similar to S3 bucket policies)
Useful for cross-account access to SQS queues
Useful for allowing other services (SNS, S3...) to write to an SQS queue

SQS - Message Visibility Timeout

After a message is polled by a consumer, it becomes invisible to other consumers
By default, the "message visibility timeout" is 30 seconds
That means the message has 30 seconds to be processed
After the message visibility timeout is over, the message is "visible" again in SQS

If a message is not processed within the visibility timeout, it will be processed twice
A consumer could call the ChangeMessageVisibility API to get more time
If visibility timeout is high (hours), and consumer crashes, re-processing will take time
If visibility timeout is too low (seconds), we may get duplicates
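
The behaviour described above can be illustrated with a toy in-memory queue. This is only a simulation under simplifying assumptions (a single process, time passed in explicitly as `now`); the class `ToyQueue` and its methods are hypothetical and are not the SQS API.

```python
class ToyQueue:
    """In-memory stand-in for a queue with a visibility timeout (not the SQS API)."""

    def __init__(self, visibility_timeout=30):
        self.visibility_timeout = visibility_timeout
        self._messages = {}         # message id -> body
        self._invisible_until = {}  # message id -> time it becomes visible again

    def send(self, message_id, body):
        self._messages[message_id] = body
        self._invisible_until[message_id] = 0

    def receive(self, now):
        """Return the first visible message and hide it for the timeout."""
        for message_id, body in self._messages.items():
            if self._invisible_until[message_id] <= now:
                self._invisible_until[message_id] = now + self.visibility_timeout
                return message_id, body
        return None

    def delete(self, message_id):
        self._messages.pop(message_id, None)
        self._invisible_until.pop(message_id, None)


queue = ToyQueue(visibility_timeout=30)
queue.send("m1", "encode video 1")

first = queue.receive(now=0)    # consumer A receives the message
hidden = queue.receive(now=10)  # consumer B gets nothing: the message is invisible
again = queue.receive(now=31)   # A never deleted it, so it is delivered a second time
queue.delete("m1")              # the consumer finally deletes it
gone = queue.receive(now=100)   # nothing left to deliver
```

The second delivery at `now=31` is the duplicate processing that a too-short timeout causes; a real consumer would either finish and delete in time or extend the timeout with ChangeMessageVisibility.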

SQS - Long Polling

Amazon SQS - Long Polling

When a consumer requests messages from the queue, it can optionally "wait" for messages to arrive if there are none in the queue
This is called Long Polling
Long Polling decreases the number of API calls made to SQS while increasing the efficiency and reducing the latency of your application.
The wait time can be from 1 to 20 seconds (20 seconds preferable)
Long Polling is preferable to Short Polling
Long polling can be enabled at the queue level or at the API level using WaitTimeSeconds
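
A small sketch of the API-level option: the helper below builds the arguments for a ReceiveMessage call with long polling enabled. The helper name `build_receive_request` and the queue URL are hypothetical; `WaitTimeSeconds` and `MaxNumberOfMessages` are the request parameters, and the range checks mirror the limits given in these notes (a wait of 0 is treated here as short polling).

```python
def build_receive_request(queue_url, wait_seconds=20, max_messages=10):
    """Build keyword arguments for a long-polling ReceiveMessage call (sketch)."""
    if not 0 <= wait_seconds <= 20:
        raise ValueError("WaitTimeSeconds must be between 0 and 20")
    if not 1 <= max_messages <= 10:
        raise ValueError("MaxNumberOfMessages must be between 1 and 10")
    return {
        "QueueUrl": queue_url,
        "WaitTimeSeconds": wait_seconds,      # 0 means short polling
        "MaxNumberOfMessages": max_messages,  # up to 10 messages per call
    }


receive_request = build_receive_request(
    "https://sqs.us-east-1.amazonaws.com/123456789012/orders"  # hypothetical URL
)
# With boto3 installed and credentials configured:
#   response = boto3.client("sqs").receive_message(**receive_request)
```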

SQS - FIFO Queues

Amazon SQS - FIFO Queue

FIFO = First In First Out (ordering of messages in the queue)

Limited throughput: 300 msg/s without batching, 3000 msg/s with batching
Exactly-once send capability (by removing duplicates)
Messages are processed in order by the Consumer
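
To make "removing duplicates" concrete, here is a toy simulation of content-based deduplication. It assumes a SHA-256 hash of the body is used as the deduplication ID and that duplicates are rejected within a 5-minute window; neither detail is stated in the notes above, and the class `ToyDeduplicator` is hypothetical rather than part of any AWS SDK.

```python
import hashlib

DEDUP_WINDOW_SECONDS = 5 * 60  # assumption: 5-minute deduplication interval


class ToyDeduplicator:
    """Rejects a body already accepted within the deduplication window."""

    def __init__(self):
        self._seen = {}  # deduplication id -> time the body was last accepted

    def accept(self, body, now):
        # Content-based deduplication: the ID is derived from the body itself.
        dedup_id = hashlib.sha256(body.encode("utf-8")).hexdigest()
        first_seen = self._seen.get(dedup_id)
        if first_seen is not None and now - first_seen < DEDUP_WINDOW_SECONDS:
            return False  # duplicate inside the window: dropped
        self._seen[dedup_id] = now
        return True


dedup = ToyDeduplicator()
results = [
    dedup.accept("order 1", now=0),    # new message
    dedup.accept("order 1", now=10),   # same body 10 s later: duplicate
    dedup.accept("order 2", now=10),   # different body
    dedup.accept("order 1", now=301),  # window has passed: accepted again
]
```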

SQS + Auto Scaling Group

SQS with Auto Scaling Group (ASG)

If the load is too big, some transactions may be lost

SQS as a buffer to database writes

SQS to decouple between application tiers

Amazon Simple Notification Service (AWS SNS)

Amazon SNS

What if you want to send one message to many receivers?

The "event producer" sends messages to only one SNS topic
As many "event receivers" (subscriptions) as we want to listen to the SNS topic notifications
Each subscriber to the topic will get all the messages (note: new feature to filter messages)
Up to 12,500,000 subscriptions per topic
100,000 topics limit

SNS integrates with a lot of AWS services

Many AWS services can send data directly to SNS for notifications

AWS SNS - How to publish

Topic Publish (using the SDK)
Create a topic
Create a subscription (or many)
Publish to the topic
Direct Publish (for mobile apps SDK)
Create a platform application
Create a platform endpoint
Publish to the platform endpoint
Works with Google GCM, Apple APNS, Amazon ADM...

Amazon SNS - Security

Encryption:
In-flight encryption using HTTPS API
At-rest encryption using KMS keys
Client-side encryption if the client wants to perform encryption/decryption itself
Access Controls: IAM policies to regulate access to the SNS API
SNS Access Policies (similar to S3 bucket policies)
Useful for cross-account access to SNS topics
Useful for allowing other services (S3...) to write to an SNS topic

SNS and SQS - Fan Out Pattern

SNS + SQS: Fan Out

Push once in SNS, receive in all SQS queues that are subscribers
Fully decoupled, no data loss
SQS allows for: data persistence, delayed processing and retries of work
Ability to add more SQS subscribers over time
Make sure your SQS queue access policy allows for SNS to write
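
The last point can be sketched as a queue access policy that lets one SNS topic send messages to one queue. The ARNs and the helper name `build_sns_to_sqs_policy` are hypothetical placeholders; the statement follows the usual shape for this case (SNS service principal, `sqs:SendMessage`, restricted by source ARN), but treat it as a starting point rather than a vetted policy.

```python
import json


def build_sns_to_sqs_policy(queue_arn, topic_arn):
    """Return an SQS access policy allowing one SNS topic to write to one queue."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "sns.amazonaws.com"},
                "Action": "sqs:SendMessage",
                "Resource": queue_arn,
                # Only messages coming from this specific topic are allowed.
                "Condition": {"ArnEquals": {"aws:SourceArn": topic_arn}},
            }
        ],
    }


policy = build_sns_to_sqs_policy(
    "arn:aws:sqs:us-east-1:123456789012:orders-queue",  # hypothetical queue ARN
    "arn:aws:sns:us-east-1:123456789012:orders-topic",  # hypothetical topic ARN
)
policy_json = json.dumps(policy, indent=2)
# With boto3, the JSON string would be set as the queue's "Policy" attribute:
#   sqs.set_queue_attributes(QueueUrl=queue_url, Attributes={"Policy": policy_json})
```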

Application: S3 Events to multiple queues

For the same combination of: event type (e.g. object create) and prefix (e.g. images/) you can only have one S3 Event rule
If you want to send the same S3 event to many SQS queues, use fan-out

Application: SNS to Amazon S3 through Kinesis Data Firehose

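SNS can send to Kinesis Data Firehose, and therefore we can have the following solution architecture: SNS topic -> Kinesis Data Firehose -> Amazon S3 (or any other supported Firehose destination)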

Amazon SNS - FIFO Topic

FIFO = First In First Out (ordering of messages in the topic)

Similar features as SQS FIFO:
Ordering by Message Group ID (all messages in the same group are ordered)
Deduplication using a Deduplication ID or Content Based Deduplication
Can only have SQS FIFO queues as subscribers
Limited throughput (same throughput as SQS FIFO)

SNS FIFO + SQS FIFO: Fan Out

In case you need fan out + ordering + deduplication

SNS - Message Filtering

JSON policy used to filter messages sent to SNS topic's subscriptions
If a subscription doesn't have a filter policy, it receives every message
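
As an illustration of how a filter policy selects messages, here is a simplified matcher. It only handles exact string matching on message attributes, which is a small subset of what SNS filter policies support, and the function name, attribute names and policies are hypothetical examples.

```python
def matches_filter_policy(filter_policy, message_attributes):
    """Simplified check: every policy key must match one of its allowed values."""
    if not filter_policy:
        return True  # no filter policy: the subscription receives every message
    return all(
        message_attributes.get(name) in allowed_values
        for name, allowed_values in filter_policy.items()
    )


placed_orders_policy = {"state": ["placed"]}  # hypothetical subscription policy
cancelled_policy = {"state": ["cancelled", "declined"]}

message_attributes = {"state": "placed", "region": "eu"}

to_placed_queue = matches_filter_policy(placed_orders_policy, message_attributes)
to_cancelled_queue = matches_filter_policy(cancelled_policy, message_attributes)
to_unfiltered_queue = matches_filter_policy(None, message_attributes)
```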

Amazon Kinesis - Overview

Kinesis Overview

Makes it easy to collect, process, and analyze streaming data in real-time
Ingest real-time data such as: Application logs, Metrics, Website clickstreams, IoT telemetry data...
Kinesis Data Streams: capture, process, and store data streams
Kinesis Data Firehose: load data streams into AWS data stores
Kinesis Data Analytics: analyze data streams with SQL or Apache Flink
Kinesis Video Streams: capture, process, and store video streams

Kinesis Data Streams Overview

Kinesis Data Streams

Retention from 1 day to 365 days
Ability to reprocess (replay) data
Once data is inserted in Kinesis, it can't be deleted (immutability)
Data that shares the same partition key goes to the same shard (ordering)
Producers: AWS SDK, Kinesis Producer Library (KPL), Kinesis Agent
Consumers:
Write your own: Kinesis Client Library (KCL), AWS SDK
Managed: AWS Lambda, Kinesis Data Firehose, Kinesis Data Analytics

Kinesis Data Streams - Capacity Modes

Provisioned mode:
You choose the number of shards provisioned, scale manually or using API
Each shard gets 1MB/s in (or 1000 records per second)
Each shard gets 2MB/s out (classic or enhanced fan-out consumer)
You pay per shard provisioned per hour
On-demand mode:
No need to provision or manage the capacity
Default capacity provisioned (4 MB/s in or 4000 records per second)
Scales automatically based on observed throughput peak during the last 30 days
Pay per stream per hour & data in/out per GB
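
For provisioned mode, the per-shard limits above translate into a simple sizing calculation. The sketch below assumes a single classic consumer sharing the 2 MB/s read limit per shard; the function name and the workload figures are made up for illustration.

```python
import math


def shards_needed(write_mb_per_s, records_per_s, read_mb_per_s):
    """Smallest shard count satisfying the per-shard limits quoted above."""
    return max(
        1,
        math.ceil(write_mb_per_s / 1),   # 1 MB/s in per shard
        math.ceil(records_per_s / 1000), # 1000 records/s in per shard
        math.ceil(read_mb_per_s / 2),    # 2 MB/s out per shard
    )


# Hypothetical workload limited by write bandwidth: 3.5 MB/s needs 4 shards.
bandwidth_bound = shards_needed(write_mb_per_s=3.5, records_per_s=2500, read_mb_per_s=6)

# Hypothetical workload of many small records: 6000 records/s needs 6 shards.
record_bound = shards_needed(write_mb_per_s=0.5, records_per_s=6000, read_mb_per_s=1)
```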

Kinesis Data Streams Security

Control access / authorization using IAM policies
Encryption in flight using HTTPS endpoints
Encryption at rest using KMS
You can implement encryption/decryption of data on client side (harder)
VPC Endpoints available for Kinesis to access within VPC
Monitor API calls using CloudTrail

Kinesis Data Firehose Overview

Kinesis Data Firehose

Fully Managed Service, no administration, automatic scaling, serverless
Destinations in AWS: Redshift / Amazon S3 / ElasticSearch
Destinations at 3rd-party partners: Splunk / MongoDB / DataDog / NewRelic / ...
Custom destinations: send to any HTTP endpoint
Pay for data going through Firehose
Near real-time
60 seconds latency minimum for non-full batches
Or minimum 1 MB of data at a time
Supports many data formats, conversions, transformations, compression
Supports custom data transformations using AWS Lambda
Can send failed or all data to a backup S3 bucket

Kinesis Data Streams vs Firehose

Kinesis Data Streams
Streaming service for ingest at scale
Write custom code (producer / consumer)
Real-time (~200ms)
Manage scaling (shard splitting / merging)
Data storage for 1 to 365 days
Supports replay capability
Kinesis Data Firehose
Load streaming data into S3 / Redshift / ES / 3rd party / custom HTTP
Fully managed
Near real-time (buffer time min. 60 sec)
Automatic scaling
No data storage
Doesn't support replay capability

Data Ordering for Kinesis vs SQS FIFO

Ordering data into Kinesis

Imagine you have 100 trucks (truck_1, truck_2, ... truck_100) on the road sending their GPS positions regularly into AWS.
You want to consume the data in order for each truck, so that you can track their movement accurately.
How should you send that data into Kinesis?
Answer: send using a "Partition Key" value of the "truck_id"
The same key will always go to the same shard
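
Why the same key always lands on the same shard can be sketched with a hash-based mapping. Kinesis hashes the partition key with MD5 into a 128-bit hash key and each shard owns a range of that space; the sketch assumes the ranges are split evenly, which need not hold after shards are split or merged. The function `shard_for` is a hypothetical helper, not an SDK call.

```python
import hashlib

HASH_SPACE = 2 ** 128  # MD5 yields a 128-bit hash key


def shard_for(partition_key, shard_count):
    """Map a partition key to a shard index, assuming evenly split hash ranges."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    hash_key = int.from_bytes(digest, "big")
    return hash_key * shard_count // HASH_SPACE


truck_ids = [f"truck_{n}" for n in range(1, 101)]
assignment = {truck_id: shard_for(truck_id, shard_count=5) for truck_id in truck_ids}

# Every record sent with the key "truck_7" maps to the same shard, so its GPS
# positions stay in order relative to each other.
same_shard = shard_for("truck_7", 5) == shard_for("truck_7", 5)
```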

Ordering data into SQS

For SQS standard, there is no ordering.
For SQS FIFO, if you don't use a Group ID, messages are consumed in the order they are sent, with only one consumer

You want to scale the number of consumers, but you want messages to be "grouped" when they are related to each other
Then you use a Group ID (similar to Partition Key in Kinesis)

Kinesis vs SQS ordering

Let's assume 100 trucks, 5 Kinesis shards, 1 SQS FIFO queue
Kinesis Data Streams:
On average you'll have 20 trucks per shard
Trucks will have their data ordered within each shard
The maximum number of consumers in parallel we can have is 5
Can receive up to 5 MB/s of data
SQS FIFO:
You only have one SQS FIFO queue
You will have 100 Group IDs
You can have up to 100 consumers (due to the 100 Group IDs)
You have up to 300 messages per second (or 3000 if using batching)

SQS vs SNS vs Kinesis

SQS:
Consumers "pull" data
Data is deleted after being consumed
Can have as many workers (consumers) as we want
No need to provision throughput
Ordering guarantees only on FIFO queues
Individual message delay capability
SNS:
Push data to many subscribers
Up to 12,500,000 subscribers
Data is not persisted (lost if not delivered)
Pub / Sub
Up to 100,000 topics
No need to provision throughput
Integrates with SQS for fan-out architecture pattern
FIFO capability for SQS FIFO
Kinesis:
Standard: pull data
2 MB per shard
Enhanced fan-out: push data
2 MB per shard per consumer
Possibility to replay data
Meant for real-time big data, analytics and ETL
Ordering at the shard level
Data expires after X days
Provisioned mode or on-demand capacity mode

Amazon MQ

SQS, SNS are "cloud-native" services: proprietary protocols from AWS
Traditional applications running on-premises may use open protocols such as: MQTT, AMQP, STOMP, OpenWire, WSS
When migrating to the cloud, instead of re-engineering the application to use SQS and SNS, we can use Amazon MQ
Amazon MQ is a managed message broker service for RabbitMQ and ActiveMQ

Amazon MQ doesn't "scale" as much as SQS / SNS
Amazon MQ runs on servers, can run in Multi-AZ with failover
Amazon MQ has both queue feature (~SQS) and topic features (~SNS)

Amazon MQ - High Availability

Published 2023-09-19 14:32 · IP location: Guangdong
