Amazon CloudFront Deployment Guide (2) - Advanced Deployment

Introduction

In this blog post, you'll learn more about Amazon CloudFront's features that help you define how content is delivered on demand, improving service performance and availability.

Below, we walk through common configuration options and typical requirements to show how to get more out of CloudFront acceleration.

1. Build and test dynamic origin site

2. Origin settings

3. Path matching and caching strategy

4. Origin request and response header strategy

5. Error response settings

6. Cache invalidation

By the end of this guide, you will know how to configure CloudFront to meet real business requirements.

Building on the S3 static origin from the first guide, we will build a minimal origin architecture that serves both dynamic and static content.

Architecture diagram:

acef33adf84c2b5107a10fcd9cb2facb.png

Build and test the dynamic origin and the display page

In guide 1 we built a minimal static origin on S3. For the settings that follow, we also need an origin that returns dynamic content. For demonstration purposes, we use the echo-server Docker image to create a container named httpecho, which returns the request headers it receives over HTTP as the HTTP response to our browser.

First, launch a t2.micro EC2 instance from the Amazon Linux 2 AMI.

To install Docker on this EC2 instance, refer to https://docs.aws.amazon.com/zh_cn/AmazonECS/latest/developerguide/docker-basics.html.

The specific Linux commands are as follows:

1. Install docker

sudo yum update -y
sudo amazon-linux-extras install docker

2. Add ec2-user to the docker group so that it can run Docker commands

sudo usermod -a -G docker ec2-user

3. Log out of the current SSH session so that the group change takes effect

logout

4. Log in again as ec2-user, start Docker, and check the Docker information

sudo service docker start
docker info | grep Ver

If the output is as follows, the docker installation is successful:

Server Version: 20.10.7
Cgroup Version: 1
Kernel Version: 4.14.47-64.38.amzn2.x86_64

5. Run the echo-server Docker image, listening on TCP port 1028

docker run -d --name httpecho  -p 1028:8080 jmalloc/echo-server httpecho

6. Check the local service with curl

curl 127.0.0.1:1028

Output like the following means the service is working. It shows that echo-server received the HTTP request sent by curl and echoed it back:

Request served by 16e51706efbe


HTTP/1.1 GET /


Host: 127.0.0.1:1028
User-Agent: curl/7.79.1
Accept: */*

Next, to make the effect easier to see, we use a simple HTML page that combines static and dynamic elements. Save the following HTML as index.html and upload it to S3:

<!DOCTYPE html>
<html lang="en">
<body>
<table border="1" width="600px" height="800px">
<thead>
<tr><td height="50px"><h1>CloudFront Lab</h1></td></tr>
</thead>
<tfoot>
<tr><td height="50px">AWS Edge Services - Demo</td></tr>
</tfoot>
<tbody>
<tr><td height="50px">Response sent by API</td></tr>
</tbody>
<tbody>
<tr><td height="300px"> <img src='../infra.png' alt="Architecture diagram" style="width:100%; height:100%;"></td></tr>
</tbody>
<tbody>
<tr><td height="650px"> <iframe src='../api' style="width:100%; height:100%;"></iframe></td></tr>
</tbody>
</table>
</body>
</html>

Path matching and caching strategy

Before configuring a cache policy, we need to understand how CloudFront selects a behavior (Behavior) based on its path pattern (Path Pattern). Path patterns follow these rules:

  • Behaviors are evaluated in order; the lower the rule number, the higher the priority

  • * matches 0 or more characters

  • Matching is case-sensitive

  • ? matches exactly 1 character

  • Regular expressions are not supported
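
The rules above can be sketched in a few lines of Python. This is only an illustrative model, not CloudFront's implementation: the function names and the behavior list are hypothetical, and the sketch ignores details such as the optional leading slash in real path patterns.

```python
import re


def pattern_to_regex(path_pattern: str) -> re.Pattern:
    """Translate a path pattern into a regex.

    Only '*' (zero or more characters) and '?' (exactly one character)
    are special; every other character is matched literally and
    case-sensitively.
    """
    parts = []
    for ch in path_pattern:
        if ch == "*":
            parts.append(".*")
        elif ch == "?":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.DOTALL)


def select_behavior(path: str, behaviors: list) -> str:
    """Return the name of the first behavior whose pattern matches the whole path."""
    for pattern, name in behaviors:
        if pattern_to_regex(pattern).fullmatch(path):
            return name
    raise LookupError(f"no behavior matches {path!r}")


# Hypothetical behaviors in priority order; "*" stands in for the default behavior.
BEHAVIORS = [
    ("/api", "ec2-no-cache"),
    ("*.webp", "s3-cache-by-version"),
    ("*", "default"),
]

print(select_behavior("/img/logo.webp", BEHAVIORS))
print(select_behavior("/img/logo.WEBP", BEHAVIORS))
```

Because matching is case-sensitive, /img/logo.WEBP skips the *.webp rule and falls through to the default behavior.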

To make sure caching behaves as expected, we also need to understand how the TTL settings in the cache policy (Cache Policy) and the cache key (Cache Key) settings work.

For the TTL settings, the figure below shows an example:

97569bb07c65e6220037f754044b81e1.png

CloudFront decides how long to cache an object from the Cache-Control or Expires header in the origin response, combined with the TTL settings in the cache policy. Here are three examples that illustrate the result:

  • The origin responds with Cache-Control: max-age=3600. Because 3600 falls between the minimum TTL (1) and the maximum TTL (86400), CloudFront caches the object for 3600 seconds

  • The origin responds with Cache-Control: max-age=99999. Because 99999 exceeds the maximum TTL of 86400, CloudFront caches the object for 86400 seconds

  • The origin responds with neither Cache-Control nor Expires. Because the default TTL (Default TTL) is set to 60 seconds, CloudFront caches the object for 60 seconds
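
The three cases can be reproduced with a small clamping function. This is a simplified sketch: effective_ttl is a hypothetical name, and it considers only max-age, ignoring directives such as s-maxage, no-store, and the Expires header.

```python
from typing import Optional


def effective_ttl(origin_max_age: Optional[int],
                  min_ttl: int, max_ttl: int, default_ttl: int) -> int:
    """Return the number of seconds an object is cached in this simplified model.

    origin_max_age is the max-age value from the origin's Cache-Control
    header, or None when the origin sends no caching header at all.
    """
    if origin_max_age is None:
        # No caching header from the origin: the default TTL applies.
        return default_ttl
    # Otherwise clamp the origin's value into [min_ttl, max_ttl].
    return max(min_ttl, min(origin_max_age, max_ttl))


# TTL settings taken from the example: min 1, max 86400, default 60.
for max_age in (3600, 99999, None):
    print(max_age, "->", effective_ttl(max_age, min_ttl=1, max_ttl=86400, default_ttl=60))
```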

For the cache key, CloudFront recognizes three kinds of request elements and can include the ones you specify in the cache key: request headers, query strings, and cookies.

165c0490ae72a390bae3444c43658496.png

In addition to the request headers sent by users, CloudFront has many built-in request headers that identify the user's device type, geographic location, and so on. For details, see Adding CloudFront Request Headers (https://docs.aws.amazon.com/zh_cn/AmazonCloudFront/latest/DeveloperGuide/adding-cloudfront-headers.html). You can include these headers in the cache key to vary cached content according to your business needs.

In addition, when compression support is enabled, CloudFront caches each compression format separately and returns the matching content according to the Accept-Encoding header sent by the requester:

545a3ce00cbc6fe87e3b6f9e568aa112.png

To keep the cache hit ratio healthy when using CloudFront, follow the principle of "do not add anything to the cache key unless necessary". The following is a common cache key scenario:

A static asset on the page uses the query parameter v sent by the user as its version number, and each version must be cached separately. To meet this requirement, we can set the cache key as follows:

105df6277d5983ddbeccab341aa5abf1.png
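
To see why a narrow cache key helps the hit ratio, here is a rough model of cache key construction. It is a sketch under stated assumptions: cache_key is a hypothetical helper, the key is reduced to the path plus the selected query strings, and the sorting of parameters is a choice made for this example only.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit


def cache_key(url: str, key_query_params: set) -> str:
    """Build a simplified cache key from the path and selected query strings.

    Query parameters that are not listed in key_query_params are ignored,
    so requests that differ only in those parameters share one cache entry.
    """
    parts = urlsplit(url)
    kept = sorted(
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name in key_query_params
    )
    query = urlencode(kept)
    return f"{parts.path}?{query}" if query else parts.path


# Only "v" is part of the key; the hypothetical "utm" parameter is not.
print(cache_key("https://example.com/img/a.webp?v=1&utm=x", {"v"}))
print(cache_key("https://example.com/img/a.webp?utm=y&v=1", {"v"}))
print(cache_key("https://example.com/img/a.webp?v=2", {"v"}))
```

The first two URLs map to the same key, while v=2 gets its own entry. Adding utm to the key would split the first two requests into separate cache entries and lower the hit ratio.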

Combining the principles above with the origin environment we prepared, the following two examples cover two scenarios that are common on websites:

Scenario 1: Paths ending in webp must be cached, cached versions must be distinguished by the query parameter v, and the cache time is forced to 86400 seconds

Based on the scenario requirements, we can make the following settings:

Path matching:

f063169bbd8c096fee779bc622c0bf11.png

For the cache TTL and cache key, we can create a custom cache policy under Policies – Cache, or from the Edit Behavior page:

ec33bb3746e640b840ab5971aef31d9b.png

Configure the cache TTL and cache key, then save:

bc1bbbd43d600dc4d961bf88b6bd3b8a.png

Select the Cache Policy you just created and save:

2e4dff039a2894ec3023c4e1758c3b45.png
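
For readers who prefer scripting to the console, the same policy could be expressed roughly as below. This is a sketch, not the configuration from the screenshots: the policy name is hypothetical, enabling gzip and Brotli is an assumption, and forcing the TTL is modelled by setting the minimum, default, and maximum TTL to the same value. The dictionary follows the shape that boto3's create_cache_policy accepts as CachePolicyConfig.

```python
def build_cache_policy_config(name: str, ttl_seconds: int, query_strings: list) -> dict:
    """Build a CachePolicyConfig that pins the TTL and keys the cache on query strings."""
    return {
        "Name": name,
        "Comment": "Cache by selected query strings with a fixed TTL",
        # Min = default = max forces the same TTL whatever the origin sends.
        "MinTTL": ttl_seconds,
        "DefaultTTL": ttl_seconds,
        "MaxTTL": ttl_seconds,
        "ParametersInCacheKeyAndForwardedToOrigin": {
            "EnableAcceptEncodingGzip": True,    # assumption: compression enabled
            "EnableAcceptEncodingBrotli": True,  # assumption: compression enabled
            "HeadersConfig": {"HeaderBehavior": "none"},
            "CookiesConfig": {"CookieBehavior": "none"},
            "QueryStringsConfig": {
                "QueryStringBehavior": "whitelist",
                "QueryStrings": {
                    "Quantity": len(query_strings),
                    "Items": list(query_strings),
                },
            },
        },
    }


def create_cache_policy(config: dict) -> dict:
    """Send the policy to CloudFront; needs boto3 and AWS credentials."""
    import boto3  # imported lazily so building the config needs no AWS SDK

    return boto3.client("cloudfront").create_cache_policy(CachePolicyConfig=config)


# "webp-by-version" is a hypothetical policy name.
config = build_cache_policy_config("webp-by-version", 86400, ["v"])
print(config["ParametersInCacheKeyAndForwardedToOrigin"]["QueryStringsConfig"])
```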

Scenario 2: Requests to the /api path go to the EC2 origin and are not cached

From guide 1, our distribution already has the S3 origin. We first create a new origin and select the EC2 instance we just launched:

3de2e78af7e7b1bd66f456ceab950473.png

Note: In this experiment, the echo server on EC2 listens on port 1028, so specify that port when creating the EC2 origin. Enable HTTP/HTTPS, and leave the HTTPS port at 443 for a later experiment.

We can then set the following path pattern and use the CloudFront managed cache policy Managed-CachingDisabled. The settings are as follows:

eea784196ebf22c3393cfd9192110091.png

Similarly, we create a behavior for index.html that uses S3 as the origin and the CachingDisabled policy.

2222f7458e0656817d1a154b94084f0d.png

Effect test:

When testing, we can use CloudFront's native X-Cache response header in the browser to check whether the cache status is as expected.

Access your index.html page over HTTP (for example http://xxx.xxx.com/index.html). After refreshing the page a few times, you can see that the X-Cache status of infra.png is Hit.

df2bf7e962744d3d3a65db0d3a58dd9e.png

The /api path, however, is not cached because of its cache policy, and its status is still Miss after repeated refreshes.

6a622235fa0456fca35df949e9b4dabf.png
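
The same check can be scripted instead of read from the browser's developer tools. The sketch below is a minimal illustration using only the Python standard library; cache_status and fetch_cache_status are hypothetical helper names, and the URL you pass in is your own distribution's.

```python
import urllib.request


def cache_status(headers: dict) -> str:
    """Return the first word of the X-Cache header in lowercase, or 'unknown'.

    For example, 'Hit from cloudfront' becomes 'hit'.
    """
    for name, value in headers.items():
        if name.lower() == "x-cache":
            words = value.split()
            return words[0].lower() if words else "unknown"
    return "unknown"


def fetch_cache_status(url: str) -> str:
    """Send a HEAD request and classify the response (requires network access)."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request) as response:
        return cache_status(dict(response.headers.items()))


print(cache_status({"X-Cache": "Hit from cloudfront"}))
print(cache_status({"X-Cache": "Miss from cloudfront"}))
```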

Origin group settings

There are many ways to increase the availability of your website, such as using Elastic Load Balancing and Multi-AZ if the origin server hosting your website is in the Amazon cloud. CloudFront can raise availability further.

Website availability is most often affected by network failures, server outages, or unavailable content, although many other factors can play a part. For example, downtime may be caused by an unexpected hardware failure, a risk you can mitigate by making every component fully redundant. CloudFront's origin settings provide the origin group (Origin Group) feature, which lets you add redundancy for the origin endpoint so that the failure of one origin server, or missing content on it, does not interrupt your business.

5023c2f93547336c562aa4192bc05c1f.png

Note: If your origin is not an Amazon resource, for example compute resources on another cloud service, we recommend enabling Origin Shield when you set up the origin, so that traffic makes full use of the backbone network and the business keeps its best performance and availability.

As the screenshot above shows, you need at least two origins before you can create an origin group. You can then specify which origin is primary and which is secondary, and CloudFront automatically fails over requests whose origin response matches the status codes you select.

Here is an example. When the requested object does not exist on S3, CloudFront automatically fails over to the secondary origin, EC2, to fetch the content:

94a1f2ff1a2f452353bacbee243bbe07.png
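
The failover decision can be modelled in a few lines. This is only a sketch of the idea: the two origin functions are stand-ins for S3 and EC2, the object list is invented, and the set of failover status codes is an assumption (S3 may answer 403 rather than 404 for a missing object, depending on bucket permissions).

```python
from typing import Callable, Set, Tuple

Response = Tuple[int, str]  # (status code, body)


def fetch_with_failover(path: str,
                        primary: Callable[[str], Response],
                        secondary: Callable[[str], Response],
                        failover_codes: Set[int]) -> Response:
    """Ask the primary origin first; retry on the secondary for selected status codes."""
    status, body = primary(path)
    if status in failover_codes:
        return secondary(path)
    return status, body


# Stand-in origins for illustration only.
S3_OBJECTS = {"/index.html", "/infra.png"}


def s3_origin(path: str) -> Response:
    return (200, f"static: {path}") if path in S3_OBJECTS else (404, "NoSuchKey")


def ec2_origin(path: str) -> Response:
    return (200, f"echo: {path}")


FAILOVER_CODES = {403, 404}  # assumption: the codes selected in the origin group

print(fetch_with_failover("/index.html", s3_origin, ec2_origin, FAILOVER_CODES))
print(fetch_with_failover("/missing", s3_origin, ec2_origin, FAILOVER_CODES))
```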

After creating the origin group, create a behavior for testing and apply the origin group to it:

bfd622eb6e7e853dfc59b0c2cca77a0a.png

Browser test results:

*Because EC2 only has HTTP enabled, run this test over HTTP rather than HTTPS.

6df942495d43a3d848f2e5b7c2050880.png

When we request content that does not exist, the page does not show the S3 error message from the primary origin. Instead, the content is fetched automatically from the secondary origin, EC2, which shows that the configuration works.

Origin request and response header strategy

When you use CloudFront to accelerate page delivery, you can also decide what information CloudFront forwards to the origin, and set a response headers policy that CloudFront applies when it responds to the user. As with the cache policy, the origin request policy (Origin Request Policy) specifies which request headers, query strings, and cookies are forwarded to the origin. You can also define a response headers policy to add, remove, or modify specific response headers. The following use case makes this concrete.

Due to business requirements, we need to implement the following three policies on CloudFront:

  • The origin listens on multiple hostnames, so CloudFront must forward the Host header sent by the user for the origin to tell them apart;

  • The business side needs CloudFront to forward the user's country to the origin so that it can be collected for statistics;

  • Add the custom header x-cdn: CloudFront to the response.

To meet these requirements, make the following settings:

  • Set the origin request policy to forward the Host header and CloudFront's built-in geographic location header

69f2a68ce081c8dce032d2b112536e0e.png

  • Create a custom response headers policy

70a5801020e5e274a574d22d68df709a.png

After creating both policies, apply them to the behavior for the path "/api" and deploy:

9a6e1c495a94c7a18b7c0c55c75ac012.png

Effect test:

Because the echo server in our test environment returns the request headers it receives, we can observe the effect of the configuration directly.

Before deploying the request and response header policies above, the Host header sent to the origin is the EC2 domain name, and the response headers carry no CORS information:

c19a1d80ee3e7d3803a8ba675ee00e5f.png

After the policies are deployed, the comparison shows that the Host header sent by the user is forwarded to the origin, the user's geographic location is included in the request headers, and our custom header is present in the response headers:

f8e51f2a670e8d7000cd912d8526486e.png

*The test results above show an interesting detail: the User-Agent is Amazon CloudFront. This is because we did not include User-Agent when defining the origin request policy. In practice, the origin often needs more request headers to learn about the characteristics of the user's request, and you can specify which request elements CloudFront forwards to the origin according to your needs.
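
The before and after behaviour can be approximated with a small model. This is a sketch, not CloudFront's implementation: the function names, hostnames, and the country value are made up for illustration, and all header names are lowercased for simplicity.

```python
def build_origin_request_headers(viewer_headers: dict, viewer_country: str,
                                 forwarded: list, origin_domain: str) -> dict:
    """Model which headers reach the origin under an origin request policy."""
    # Headers available for forwarding: the viewer's own plus one built-in header.
    available = {name.lower(): value for name, value in viewer_headers.items()}
    available["cloudfront-viewer-country"] = viewer_country

    # Without a policy entry, Host is the origin domain name and the
    # User-Agent is replaced by "Amazon CloudFront".
    origin_headers = {"host": origin_domain, "user-agent": "Amazon CloudFront"}
    for name in forwarded:
        key = name.lower()
        if key in available:
            origin_headers[key] = available[key]
    return origin_headers


def apply_response_headers(origin_response_headers: dict, custom_headers: dict) -> dict:
    """Model a response headers policy that adds or overrides custom headers."""
    merged = {name.lower(): value for name, value in origin_response_headers.items()}
    merged.update({name.lower(): value for name, value in custom_headers.items()})
    return merged


# Hypothetical viewer request and origin domain.
viewer = {"Host": "www.example.com", "User-Agent": "curl/7.79.1"}
before = build_origin_request_headers(viewer, "JP", [], "ec2-origin.example.com")
after = build_origin_request_headers(
    viewer, "JP", ["Host", "CloudFront-Viewer-Country"], "ec2-origin.example.com")
print(before)
print(after)
print(apply_response_headers({"Content-Type": "text/plain"}, {"x-cdn": "CloudFront"}))
```

Adding User-Agent to the forwarded list in this model would pass the user's own User-Agent to the origin instead of Amazon CloudFront.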

Custom Error Responses

When distributing content with CloudFront, we may run into missing files, origin maintenance, origin timeouts, and similar situations, which cause the client to receive an error response or the origin to receive too many failing requests. To mitigate this, CloudFront lets you customize how long an error response code is cached and what content is returned, which shields the origin from a continuous stream of requests that trigger errors and gives users a friendlier error page.

As an example, suppose the business needs the origin's 502 responses cached for 10 seconds and replaced with a custom error page. The following settings meet that requirement:

Create a custom error response

3f4915806d778f15489766223fe5a962.png

Configure it as follows: set the caching time for 502 to 10 seconds, set the path of the sorry page, and return a 200 response code.

ef7fc3e35bf1a9abbc6e7aec01e1254f.png
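
Expressed as data, the same setting might look like the sketch below, which follows the shape of the CustomErrorResponses element in a CloudFront distribution configuration. The helper name and the page path /sorry.html are hypothetical; use the path of the page you actually upload.

```python
def build_custom_error_responses(error_code: int, cache_seconds: int,
                                 page_path: str, response_code: int) -> dict:
    """Build a CustomErrorResponses element containing a single rule."""
    rule = {
        "ErrorCode": error_code,              # status code returned by the origin
        "ErrorCachingMinTTL": cache_seconds,  # how long the error is cached
        "ResponsePagePath": page_path,        # custom page served instead
        "ResponseCode": str(response_code),   # status code sent to the user
    }
    return {"Quantity": 1, "Items": [rule]}


# "/sorry.html" is a hypothetical path for the error page.
print(build_custom_error_responses(502, 10, "/sorry.html", 200))
```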

Here is a simple HTML page:

<!DOCTYPE html>
<html>
<head>
<title>Welcome to CloudFront!</title>
<style>
    body {
        width: 35em;
        margin: 0 auto;
        font-family: Tahoma, Verdana, Arial, sans-serif;
    }
</style>
</head>
<body>
<h1>Sorry, Your content is not available!</h1>
</body>
</html>

Upload the HTML file to the S3 origin to serve as the error page:

e5e960feebce60271050a182d6aa0ba1.png

With the steps above complete, we can test the result. Because we did not open port 443 or deploy an HTTPS certificate when deploying EC2, accessing CloudFront over HTTPS makes CloudFront try to reach the EC2 origin over HTTPS; that request times out and produces a 502 error.

We can use this to simulate a 502 from the origin by accessing index.html over HTTPS. Before deploying the custom error response, we see the following:

6b312c59604bfffbda82f030f84ffbea.png

After deploying a custom error response:

180758c0cc102039ed16f38eb54e4ffa.png

Cache invalidation

In real workloads you will sometimes need to change a resource and invalidate its cached copies at the same time. The CloudFront distribution configuration page has an invalidation entry where you can do this. Once you have confirmed that the resource has been changed on the origin, you can invalidate the cache on CloudFront.

a02aa37ad7fe7c5a24b3ceeaf8d91acf.png

Next, we use the environment we built to test an invalidation. Visit /index.html until the static content shows a Hit status, as shown in the following figure:

43e04a26320f3a733b1a953f746b0768.png

Next, on the CloudFront invalidation page, add the URL paths whose cache should be invalidated:

f43683ad82a4db48543584800c56d8eb.png

10f34ca0cbd08affff469c71bf52632b.png

After submitting the invalidation and confirming that it has completed, visit the /index.html page again. The cache status is now Miss, which shows that the invalidation took effect.

c458bb8bb1a51f0f99937aea6d2ee572.png
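
An invalidation can also be submitted from a script. The sketch below builds the request body for boto3's create_invalidation call and keeps the actual API call in a separate function, since that part needs boto3 and AWS credentials. The helper names and the leading-slash check are choices made for this example, and the distribution ID is a placeholder you must supply yourself.

```python
import time
from typing import List, Optional


def build_invalidation_batch(paths: List[str],
                             caller_reference: Optional[str] = None) -> dict:
    """Build the InvalidationBatch structure for a list of URL paths."""
    if not paths:
        raise ValueError("at least one path is required")
    for path in paths:
        # Defensive check chosen for this sketch: paths are written with a leading slash.
        if not path.startswith("/"):
            raise ValueError(f"path must start with '/': {path!r}")
    return {
        "Paths": {"Quantity": len(paths), "Items": list(paths)},
        # CallerReference must be unique per request; a timestamp is one simple choice.
        "CallerReference": caller_reference or f"invalidate-{time.time_ns()}",
    }


def invalidate(distribution_id: str, paths: List[str]) -> dict:
    """Submit the invalidation; needs boto3 and AWS credentials."""
    import boto3  # imported lazily so building the batch needs no AWS SDK

    client = boto3.client("cloudfront")
    return client.create_invalidation(
        DistributionId=distribution_id,
        InvalidationBatch=build_invalidation_batch(paths),
    )


print(build_invalidation_batch(["/index.html"], caller_reference="demo-1"))
```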

Summary

In this guide, we learned how to use CloudFront for more flexible configurations: customizing the cache policy, setting up primary and secondary origins, choosing what is forwarded in origin requests, modifying the response headers the business requires, and customizing error responses. Following this article, you can shape CloudFront's behavior and the way it responds to user requests according to your business needs.

CloudFront deployment mini-guide series articles

Amazon CloudFront Deployment Guide (2) - Advanced Deployment:

https://aws.amazon.com/cn/blogs/china/amazon-cloudfront-deployment-handbook-part-two/

Amazon CloudFront Deployment Guide (3) - Continuous Deployment:

https://aws.amazon.com/cn/blogs/china/amazon-cloudfront-deployment-handbook-part-three/

Amazon CloudFront Deployment Handbook (4) - CloudFront Function Basics and Diagnosis: https://aws.amazon.com/cn/blogs/china/amazon-cloudfront-deployment-handbook-part-four/

Amazon CloudFront Deployment Handbook (Part 5) - Using Amazon Edge Technology to Optimize In-Game Resource Update Release: https://aws.amazon.com/cn/blogs/china/amazon-cloudfront-deployment-handbook-part-five/

Amazon CloudFront Deployment Handbook (6) - Lambda@Edge Basics and Diagnosis: https://aws.amazon.com/cn/blogs/china/amazon-cloudfront-deployment-handbook-part-six/

The author of this article

Wang Junxing

Edge Product Architect at Amazon Cloud Technology, responsible for the technical promotion of Amazon Cloud Technology edge services in China. He has many years of hands-on experience in CDN content distribution and WAF, with a focus on edge service design and experience optimization.

Cui Junjie

Senior Product Solution Architect at Amazon Cloud Technology, responsible for its cloud edge security services. Advises Amazon cloud users on DDoS defense, website front-end security, and domain name security. Has an in-depth understanding of CloudFront, Shield, WAF, Route 53, Global Accelerator, and other cloud edge security products, and many years of experience in computer security, data centers, and networking.
