AWS Solutions Architect Associate Certification Training | Advanced S3 Introduction

Video source: Bilibili, "AWS Certified Solutions Architect Associate SAA-C03"

These are the instructor's course content and exam notes, organized while studying and shared with everyone. Any infringing content will be removed on request. Thank you for your support!

See also the summary post: AWS Solutions Architect Associate Certification Training | Summary


Lifecycle Rules (with S3 Analytics)

Amazon S3 - Moving between Storage Classes

  • You can transition objects between storage classes
  • For infrequently accessed objects, move them to Standard IA
  • For archive objects that you don't need fast access to, move them to Glacier or Glacier Deep Archive
  • Moving objects can be automated using Lifecycle Rules

Amazon S3 - Lifecycle Rules

  • Transition Actions - configure objects to transition to another storage class (see the configuration sketch after this list)
      • Move objects to the Standard IA class 60 days after creation
      • Move to Glacier for archiving after 6 months
  • Expiration Actions - configure objects to expire (be deleted) after some time
      • Access log files can be set to delete after 365 days
      • Can be used to delete old versions of files (if versioning is enabled)
      • Can be used to delete incomplete Multi-Part uploads
  • Rules can be created for a certain prefix (example: s3://mybucket/mp3/*)
  • Rules can be created for certain object Tags (example: Department: Finance)
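
As a concrete illustration, here is a minimal boto3 sketch of a lifecycle configuration combining transition and expiration actions. The bucket name, prefixes, and rule IDs are hypothetical, and the day counts simply mirror the examples above.

```python
import boto3

s3 = boto3.client("s3")

# Hypothetical bucket name and prefixes, for illustration only
s3.put_bucket_lifecycle_configuration(
    Bucket="my-bucket",
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "tier-mp3-files",
                "Filter": {"Prefix": "mp3/"},  # rule scoped to a prefix
                "Status": "Enabled",
                "Transitions": [
                    {"Days": 60, "StorageClass": "STANDARD_IA"},  # 60 days after creation
                    {"Days": 180, "StorageClass": "GLACIER"},     # ~6 months
                ],
                # Clean up abandoned multi-part uploads
                "AbortIncompleteMultipartUpload": {"DaysAfterInitiation": 7},
            },
            {
                "ID": "expire-access-logs",
                "Filter": {"Prefix": "logs/"},
                "Status": "Enabled",
                "Expiration": {"Days": 365},  # delete access log files after 365 days
            },
        ]
    },
)
```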

Amazon S3 - Lifecycle Rules (Scenario 1)

  • Your application on EC2 creates image thumbnails after profile photos are uploaded to Amazon S3. These thumbnails can be easily recreated, and only need to be kept for 60 days. The source images should be immediately retrievable for these 60 days, and afterwards, the user can wait up to 6 hours. How would you design this?
  • S3 source images can be on Standard, with a lifecycle configuration to transition them to Glacier after 60 days
  • S3 thumbnails can be on One-Zone IA, with a lifecycle configuration to expire them (delete them) after 60 days
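
A possible implementation of this design, sketched with boto3. The bucket name, key names, and prefixes are made up for the example: thumbnails are written directly to One-Zone IA at upload time, and lifecycle rules handle the 60-day transition and expiration.

```python
import boto3

s3 = boto3.client("s3")

# Thumbnails are written straight to One-Zone IA (hypothetical key names)
s3.put_object(
    Bucket="my-bucket",
    Key="thumbnails/user-123.jpg",
    Body=b"...",
    StorageClass="ONEZONE_IA",
)

s3.put_bucket_lifecycle_configuration(
    Bucket="my-bucket",
    LifecycleConfiguration={
        "Rules": [
            {   # source images: Standard for 60 days, then Glacier
                "ID": "archive-source-images",
                "Filter": {"Prefix": "images/"},
                "Status": "Enabled",
                "Transitions": [{"Days": 60, "StorageClass": "GLACIER"}],
            },
            {   # thumbnails: delete after 60 days (they can be recreated)
                "ID": "expire-thumbnails",
                "Filter": {"Prefix": "thumbnails/"},
                "Status": "Enabled",
                "Expiration": {"Days": 60},
            },
        ]
    },
)
```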

Amazon S3 - Lifecycle Rules (Scenario 2)

  • A rule in your company states that you should be able to recover your deleted S3 objects immediately for 30 days, although this may happen rarely. After this time, and for up to 365 days, deleted objects should be recoverable within 48 hours.
  • Enable S3 Versioning in order to have object versions, so that "deleted objects" are in fact hidden by a "delete marker" and can be recovered
  • Transition the "noncurrent versions" of the object to Standard IA
  • Afterwards, transition the "noncurrent versions" to Glacier Deep Archive
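
One way this could look as a boto3 sketch. The bucket name is hypothetical and the exact day counts are illustrative (S3 imposes minimum transition ages, e.g. 30 days before a transition to Standard IA).

```python
import boto3

s3 = boto3.client("s3")

# Versioning must be on so deletes only add a delete marker
s3.put_bucket_versioning(
    Bucket="my-bucket",
    VersioningConfiguration={"Status": "Enabled"},
)

s3.put_bucket_lifecycle_configuration(
    Bucket="my-bucket",
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "tier-noncurrent-versions",
                "Filter": {"Prefix": ""},  # applies to the whole bucket
                "Status": "Enabled",
                "NoncurrentVersionTransitions": [
                    {"NoncurrentDays": 30, "StorageClass": "STANDARD_IA"},
                    {"NoncurrentDays": 60, "StorageClass": "DEEP_ARCHIVE"},
                ],
                # stop keeping versions older than a year
                "NoncurrentVersionExpiration": {"NoncurrentDays": 365},
            }
        ]
    },
)
```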

Amazon S3 Analytics - Storage Class Analysis

  • Helps you decide when to transition objects to the right storage class
  • Recommendations for Standard and Standard IA
      • Does NOT work for One-Zone IA or Glacier
  • Report is updated daily
  • 24 to 48 hours to start seeing data analysis
  • Good first step to put together Lifecycle Rules (or improve them)!
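
Storage Class Analysis is enabled per bucket (optionally scoped by prefix or tag). A minimal boto3 sketch, with hypothetical bucket names and IDs, that exports the daily CSV report to a separate reporting bucket:

```python
import boto3

s3 = boto3.client("s3")

# Analyze access patterns for the whole bucket and export a daily CSV report
s3.put_bucket_analytics_configuration(
    Bucket="my-bucket",                     # hypothetical bucket to analyze
    Id="whole-bucket-analysis",
    AnalyticsConfiguration={
        "Id": "whole-bucket-analysis",
        "StorageClassAnalysis": {
            "DataExport": {
                "OutputSchemaVersion": "V_1",
                "Destination": {
                    "S3BucketDestination": {
                        "Format": "CSV",
                        "Bucket": "arn:aws:s3:::my-analytics-reports",  # hypothetical report bucket
                        "Prefix": "storage-class-analysis/",
                    }
                },
            }
        },
    },
)
```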

S3 Requester Pays

S3 - Requester Pays

  • In general, bucket owners pay for all Amazon S3 storage and data transfer costs associated with their bucket
  • With Requester Pays buckets, the requester instead of the bucket owner pays the cost of the request and the data download from the bucket
  • Helpful when you want to share large datasets with other accounts
  • The requester must be authenticated in AWS (cannot be anonymous)
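
A minimal boto3 sketch of both sides of Requester Pays, with hypothetical bucket and key names: the owner flips the bucket setting, and the (authenticated) requester must acknowledge the charges on every request.

```python
import boto3

s3 = boto3.client("s3")

# Bucket owner: switch the (hypothetical) bucket to Requester Pays
s3.put_bucket_request_payment(
    Bucket="big-dataset-bucket",
    RequestPaymentConfiguration={"Payer": "Requester"},
)

# Requester (must be an authenticated AWS identity): acknowledge the charges
obj = s3.get_object(
    Bucket="big-dataset-bucket",
    Key="datasets/2023/part-0001.csv",
    RequestPayer="requester",
)
print(obj["ContentLength"])
```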

S3 Event Notifications

  • S3:ObjectCreated, S3:ObjectRemoved, S3:ObjectRestore, S3:Replication...
  • Object name filtering possible (*.jpg)
  • Use case: generate thumbnails of images uploaded to S3
  • Can create as many "S3 events" as desired
  • S3 event notifications typically deliver events in seconds but can sometimes take a minute or longer
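
For the thumbnail use case, the notification configuration might look like the following boto3 sketch. The bucket name and Lambda ARN are hypothetical, and the Lambda function must separately grant S3 permission to invoke it (via Lambda AddPermission), which is omitted here.

```python
import boto3

s3 = boto3.client("s3")

# Invoke a (hypothetical) thumbnail Lambda whenever a .jpg object is created
s3.put_bucket_notification_configuration(
    Bucket="my-bucket",
    NotificationConfiguration={
        "LambdaFunctionConfigurations": [
            {
                "LambdaFunctionArn": "arn:aws:lambda:us-east-1:123456789012:function:make-thumbnail",
                "Events": ["s3:ObjectCreated:*"],
                "Filter": {
                    "Key": {"FilterRules": [{"Name": "suffix", "Value": ".jpg"}]}
                },
            }
        ]
    },
)
```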

S3 Event Notification with Amazon EventBridge

  • Advanced filtering options with JSON rules (metadata, object size, name...)
  • Multiple Destinations - e.g. Step Functions, Kinesis Data Streams / Kinesis Data Firehose...
  • EventBridge Capabilities - Archive, Replay Events, Reliable delivery
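
A sketch of the EventBridge route, assuming hypothetical bucket, rule, and target names: the bucket first enables EventBridge delivery, then a rule filters "Object Created" events by key prefix and object size and forwards them to a Step Functions state machine.

```python
import json
import boto3

s3 = boto3.client("s3")
events = boto3.client("events")

# 1. Turn on EventBridge delivery for the bucket (all S3 events then flow to EventBridge)
s3.put_bucket_notification_configuration(
    Bucket="my-bucket",
    NotificationConfiguration={"EventBridgeConfiguration": {}},
)

# 2. Rule with advanced filtering: only "Object Created" events under images/ larger than 1 MiB
events.put_rule(
    Name="large-image-uploads",
    EventPattern=json.dumps({
        "source": ["aws.s3"],
        "detail-type": ["Object Created"],
        "detail": {
            "bucket": {"name": ["my-bucket"]},
            "object": {
                "key": [{"prefix": "images/"}],
                "size": [{"numeric": [">", 1048576]}],
            },
        },
    }),
)

# 3. Forward matching events to a Step Functions state machine (hypothetical ARNs)
events.put_targets(
    Rule="large-image-uploads",
    Targets=[{
        "Id": "image-workflow",
        "Arn": "arn:aws:states:us-east-1:123456789012:stateMachine:ImageWorkflow",
        "RoleArn": "arn:aws:iam::123456789012:role/EventBridgeInvokeStepFunctions",
    }],
)
```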

S3 Performance

S3 - Baseline Performance

  • Amazon S3 automatically scales to high request rates, with latency of 100-200 ms
  • Your application can achieve at least 3,500 PUT/COPY/POST/DELETE and 5,500 GET/HEAD requests per second per prefix in a bucket
  • There are no limits to the number of prefixes in a bucket
  • Example (object path => prefix):
      • bucket/folder1/sub1/file => /folder1/sub1/
      • bucket/folder1/sub2/file => /folder1/sub2/
      • bucket/1/file => /1/
      • bucket/2/file => /2/
  • If you spread reads across all four prefixes evenly, you can achieve 22,000 requests per second for GET and HEAD (4 prefixes × 5,500 requests/second)

S3 Performance

  • Multi-Part upload (see the sketch after this list):
      • Recommended for files > 100 MB, required for files > 5 GB
      • Can help parallelize uploads (speed up transfers)
  • S3 Transfer Acceleration:
      • Increases transfer speed by transferring the file to an AWS edge location, which forwards the data to the S3 bucket in the target region
      • Compatible with Multi-Part upload
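
A boto3 sketch of both features, with hypothetical bucket and file names: Transfer Acceleration is enabled on the bucket and used through the accelerate endpoint, while TransferConfig makes upload_file switch to parallel multi-part uploads above 100 MB.

```python
import boto3
from botocore.config import Config
from boto3.s3.transfer import TransferConfig

# Regular client to enable Transfer Acceleration on the (hypothetical) bucket
boto3.client("s3").put_bucket_accelerate_configuration(
    Bucket="my-bucket",
    AccelerateConfiguration={"Status": "Enabled"},
)

# Client that sends traffic through the accelerate (edge) endpoint
s3 = boto3.client("s3", config=Config(s3={"use_accelerate_endpoint": True}))

# Switch to parallel multi-part uploads above 100 MB
transfer_config = TransferConfig(
    multipart_threshold=100 * 1024 * 1024,  # files > 100 MB use multi-part
    multipart_chunksize=100 * 1024 * 1024,  # 100 MB parts
    max_concurrency=10,                     # parts uploaded in parallel
)
s3.upload_file("backup.tar.gz", "my-bucket", "backups/backup.tar.gz", Config=transfer_config)
```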

S3 Performance - S3 Byte-Range Fetches

  • Parallelize GETs by requesting specific byte ranges
  • Better resilience in case of failures
  • Can be used to speed up downloads
  • Can be used to retrieve only partial data (for example, the head of a file)
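
A byte-range fetch is just an ordinary GET with a Range header. A small boto3 sketch, with hypothetical bucket and key names, that retrieves only the first kilobyte of an object:

```python
import boto3

s3 = boto3.client("s3")

# Retrieve only the first 1 KB of the object (e.g. just the file header)
resp = s3.get_object(
    Bucket="my-bucket",
    Key="data/large-file.bin",
    Range="bytes=0-1023",
)
header = resp["Body"].read()

# For faster downloads, several such ranged GETs (bytes=0-..., bytes=...-...)
# can be issued in parallel threads and stitched back together client-side.
```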

S3 Select & Glacier Select

S3 Select & Glacier Select

  • Retrieve less data using SQL by performing server-side filtering
  • Can filter by rows & columns (simple SQL statements)
  • Less network transfer, less CPU cost client-side
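
A boto3 sketch of S3 Select on a hypothetical CSV object: the SQL filter runs server-side and only the matching rows and columns are streamed back.

```python
import boto3

s3 = boto3.client("s3")

# Server-side filter: only matching rows/columns leave S3
resp = s3.select_object_content(
    Bucket="my-bucket",
    Key="data/employees.csv",
    ExpressionType="SQL",
    Expression="SELECT s.name, s.salary FROM S3Object s WHERE s.department = 'Finance'",
    InputSerialization={"CSV": {"FileHeaderInfo": "USE"}, "CompressionType": "NONE"},
    OutputSerialization={"CSV": {}},
)

# The result comes back as an event stream
for event in resp["Payload"]:
    if "Records" in event:
        print(event["Records"]["Payload"].decode("utf-8"), end="")
```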

S3 Batch Operations

S3 Batch Operations

  • Perform bulk operations on existing S3 objects with a single request, for example:
      • Modify object metadata & properties
      • Copy objects between S3 buckets
      • Encrypt un-encrypted objects
      • Modify ACLs, tags
      • Restore objects from S3 Glacier
      • Invoke a Lambda function to perform a custom action on each object
  • A job consists of a list of objects, the action to perform, and optional parameters
  • S3 Batch Operations manages retries, tracks progress, sends completion notifications, generates reports...
  • You can use S3 Inventory to get object list and use S3 Select to filter your objects
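
Batch Operations jobs are created through the S3 Control API. A hedged boto3 sketch of a copy job, in which the account ID, ARNs, manifest location, and ETag are all placeholders:

```python
import boto3

s3control = boto3.client("s3control")

# All IDs/ARNs below are placeholders for illustration
s3control.create_job(
    AccountId="123456789012",
    ConfirmationRequired=False,
    Priority=10,
    RoleArn="arn:aws:iam::123456789012:role/S3BatchOpsRole",
    # Action to perform on every object in the manifest: copy to another bucket
    Operation={
        "S3PutObjectCopy": {"TargetResource": "arn:aws:s3:::destination-bucket"}
    },
    # Manifest: the list of objects (e.g. produced by S3 Inventory / filtered with S3 Select)
    Manifest={
        "Spec": {
            "Format": "S3BatchOperations_CSV_20180820",
            "Fields": ["Bucket", "Key"],
        },
        "Location": {
            "ObjectArn": "arn:aws:s3:::manifest-bucket/manifest.csv",
            "ETag": "example-etag-of-the-manifest",  # placeholder ETag
        },
    },
    # Completion report written back to S3
    Report={
        "Bucket": "arn:aws:s3:::report-bucket",
        "Format": "Report_CSV_20180820",
        "Enabled": True,
        "Prefix": "batch-reports",
        "ReportScope": "AllTasks",
    },
)
```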

Origin blog.csdn.net/guolianggsta/article/details/131962021