The development of cache architecture, seen through the development of website architecture (based on "In-Depth Distributed Caching: From Principles to Practice")

Foreword

The views in this article are my summary and understanding of the first chapter of In-Depth Distributed Caching: From Principles to Practice by Junze, Cao Hongwei, Qiu Shuo, et al. [ISBN: 9787111585190]. I have recently been reading books on cache architecture, and I think this one is very good: it was co-authored by architects from JD.com, Ant Financial, and several other large companies. The content is not exhaustive, but the way it combines theory and practice makes it very interesting. Some of it is quite difficult, and there are many places I do not fully understand, but for readers who want to build up a knowledge framework for caching architecture, it is a good book for raising your level. Below, following the book, I will talk about my own understanding of how the development of website architecture drove the development of caching architecture.

Talking about the development of website architecture

To be honest, after reading the book I realize that my own level is actually still at the single-system stage, which can hardly even be called architecture. I have been doing web system development, mostly writing business code. Because the user volume has never been large, a single-machine LNMP environment could basically hold up, so the usual solution for slightly larger traffic was to add bandwidth and upgrade the machine configuration, that is, to throw money at it. I have never really considered how to handle large numbers of users from the perspective of system architecture; that will take more practice in the future.

In the retail world there is a popular saying, "cash is king"; in software circles the saying "cache is king" is just as popular, which shows how important caching is to the development of software systems. Cache architecture (here I only mean caching inside software systems) developed along with website architecture. Below I summarize the development of website architecture, which naturally leads to the emergence and development of caching.

(1) When websites first became popular, the architecture was the simplest single-system architecture: all services, such as the database service, the web container service, and the supporting environment, ran on a single physical machine. At that stage websites were just emerging and traffic was small, so an integrated LAMP environment could support the typical number of visits;
(2) As websites slowly developed and traffic grew, the original single-system structure ran into the resource limits of the web container and the physical machine, and access slowed down. The solution was to split the services across physical machines, that is, to separate the database service and the web service onto different machines connected over the network, which absorbed part of the traffic;
(3) As traffic continued to increase, this kind of separation could no longer keep up, and so the cache appeared. Caching emerged to reduce the pressure of server and database access. Take static page caching as an example: a visited page is cached on the client, and a cache expiration time is set through the Expires response header. Within the expiration time, the client automatically loads the page from its local cache, saving both the network transmission time of the request and the processing time on the server, thereby reducing the response time. This of course raises the problem of data updates, which can be solved by comparing the If-Modified-Since request header against the Last-Modified response header to complete a conditional update. What is the actual principle? It is quite simple:
① On the first request, the server returns the resource with a Last-Modified response header; the client stores this value to send later as If-Modified-Since.
② On a subsequent request, the client first checks its cache: if there is no cached copy, go to ①; if there is a cached copy, go to ③.
③ The client sends a request carrying If-Modified-Since, and the server compares it with the file's modification time. If the timestamps are equal, the cached copy is still fresh, so the server returns 304 Not Modified and the client simply uses its cached data; if they are not equal, the cache is stale and the server returns the new resource.
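The validation in step ③ can be sketched in a few lines of Python. This is a minimal illustration of the comparison logic only, not a full HTTP server; the function name is my own invention:

```python
from email.utils import formatdate, parsedate_to_datetime

def validate_conditional_get(if_modified_since, last_modified):
    """Decide how a server answers a conditional GET.

    if_modified_since: value of the If-Modified-Since request header (or None)
    last_modified:     the resource's Last-Modified timestamp (HTTP-date string)
    Returns (status_code, body_needed).
    """
    if if_modified_since is not None:
        # Compare parsed dates rather than raw strings, so equivalent
        # timestamps in slightly different formats still match.
        if parsedate_to_datetime(if_modified_since) >= parsedate_to_datetime(last_modified):
            return 304, False   # cache is fresh: "Not Modified", no body sent
    return 200, True            # no cached copy or cache is stale: send the full resource

last_mod = formatdate(1700000000, usegmt=True)  # "Tue, 14 Nov 2023 22:13:20 GMT"

# First request: the client has no If-Modified-Since header yet
print(validate_conditional_get(None, last_mod))       # (200, True)

# Revalidation: the client sends back the stored Last-Modified value
print(validate_conditional_get(last_mod, last_mod))   # (304, False)
```

The saving is exactly what the text describes: on a 304 the response carries no body, so only a few header bytes cross the network.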
(4) Client-side caching reduces the pressure on the server, but rising traffic brings a huge amount of data, while the client cache is limited: the client's physical resources tend to be small and finite, so all of this data cannot be cached on the client. The concept and implementation of caching therefore began to extend and deepen from the client toward the network and the server. For example, Nginx can be configured to cache static resources and even dynamic content; the update mechanisms are mostly similar, based on timestamp validation. Later, data caching appeared: frequently accessed data from the database is kept together in a cache, avoiding repeated database access. As a result, caching in many site architectures is multi-level, with client-side caches and server-side caches used at the same time.
(5) Caching absorbed part of the access pressure, but soon the application server slowed down again, because physical resources are limited: no matter how good the cache structure is, a single machine has a physical ceiling. So people raised that ceiling by adding more application servers, and this is the service cluster. Simply put, a cluster is a group of computers that jointly provide a service to users without the users perceiving the group; the machines are interconnected through an internal network and share the access pressure across physical machines to handle high traffic. The key techniques that make a cluster work are the reverse proxy and load balancing we so often hear about. Put simply, a cluster needs a manager: the many IPs in the cluster cannot all be exposed to the user, resources must be allocated and used rationally, and the decision of which server handles a request cannot be left to the user. So, to give users a single access IP while allocating resources rationally, the reverse proxy server and server-side load balancing came into being. For example, the high-performance web server Nginx can provide both reverse proxying and load balancing: it acts as the unified front for the other application servers, accepting the user's request (reverse proxy), forwarding the request to an appropriate server according to the resource configuration of the different servers in the cluster (load balancing), and returning the response in a unified way (reverse proxy). On the basis of a cluster, handling more traffic becomes a matter of continually adding servers to the cluster to improve capacity.
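The forwarding decision a load balancer makes can be as simple as round-robin: hand each new request to the next backend in turn. A toy sketch (the backend IPs are made up for illustration):

```python
from itertools import cycle

class RoundRobinBalancer:
    """Pick cluster backends in turn, hiding the many server IPs
    behind the single entry point the user actually sees."""

    def __init__(self, backends):
        self._next = cycle(backends).__next__  # endless rotation over the backends

    def pick(self):
        return self._next()

lb = RoundRobinBalancer(["10.0.0.1", "10.0.0.2", "10.0.0.3"])
print([lb.pick() for _ in range(4)])
# ['10.0.0.1', '10.0.0.2', '10.0.0.3', '10.0.0.1']
```

Real balancers such as Nginx also support weighted and least-connections strategies, but the principle is the same: the client talks to one address, and the balancer spreads the work.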
In fact, we can already see a caching problem here. Once the application servers are physically separate, how do they keep their cached data in sync? At this stage, people typically began using cache synchronization mechanisms, shared file systems, shared storage, and the like.
(6) Cluster architecture solved very high traffic, but as traffic kept rising and business expanded into different regions, the benefit gained by growing the cluster became less and less obvious. Business expansion also made the database system under the clustered model keep growing, and the simple approach of copying physical machines without changing the business model clearly no longer worked. So people began splitting databases and tables (sharding) and deploying different parts of the business onto physical machines in different regions; that is, distributed database systems began to appear. Once the concept of distribution emerged, people began to reflect on the drawbacks of centralized services such as clusters, and distributed services appeared as well. A distributed service splits a complete system into multiple subsystems that are built separately but interconnected, like nodes on a ring; each node may internally be a cluster, and together the nodes constitute the complete system. Such a distributed architecture solves at least three problems of the traditional centralized structure: ① business expansion, ② heavy coupling between services, and ③ the huge volume of data to be stored. With the advent of distributed services, the distributed cache came out. Within a cluster, caches can be kept in sync by replication mechanisms, but the data grew too large to store in local caches and then resynchronize, so the distributed cache appeared, moving large volumes of cached data into a dedicated distributed cache layer.
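A distributed cache must decide which node stores which key. Consistent hashing is a common scheme for this (the book does not spell it out at this point, so treat the following as my own illustration): keys and nodes are hashed onto a ring, so adding or removing a node only remaps a small share of the keys instead of invalidating the whole cache.

```python
import bisect
import hashlib

class ConsistentHashRing:
    """Map cache keys to nodes so that adding or removing a node
    moves only a fraction of the keys."""

    def __init__(self, nodes, vnodes=100):
        self._ring = []                      # sorted list of (hash, node)
        for node in nodes:
            for i in range(vnodes):          # virtual nodes smooth the distribution
                h = self._hash(f"{node}#{i}")
                self._ring.append((h, node))
        self._ring.sort()

    @staticmethod
    def _hash(s):
        return int(hashlib.md5(s.encode()).hexdigest(), 16)

    def node_for(self, key):
        h = self._hash(key)
        # First ring position at or after the key's hash, wrapping around.
        i = bisect.bisect(self._ring, (h, "")) % len(self._ring)
        return self._ring[i][1]

ring = ConsistentHashRing(["cache-a", "cache-b", "cache-c"])
print(ring.node_for("user:42"))  # the same key always maps to the same node
```

The node names here are hypothetical; real systems such as Memcached client libraries and Redis Cluster use variations of this idea (Redis Cluster uses fixed hash slots rather than a hash ring).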

So far I have given a brief introduction to the development of website architecture. We can see that the development of caching is an important thread running across the entire development of website architecture. As a non-functional aspect of a site, caching cannot be said to be solely responsible for performance, but it is definitely one of the most important parts: done well, it improves performance while reducing the cost of adding servers.

Some common caches in a multi-level cache

Leaving aside hardware- and operating-system-level caches, depending on its position in the software system, a cache falls into one of three categories:

  • Client-side caching
  • Network cache
  • Server-side caching

Below I briefly introduce common caches from these three aspects. This is only a rough summary; if anything is wrong, I hope you will correct me.

Client-side caching

  • Client page cache: the "client" prefix here distinguishes this from the server-side page cache concept, which refers to caching static pages or dynamic page elements. Here, page cache means the client caches the page itself, which makes offline applications possible. This is a new feature in HTML5: a page can be given a manifest attribute, and a manifest file configured on the server side, to achieve client-side page caching for offline H5 applications. Personally I have not seen this used much; I only learned of it from the book.
  • Browser cache: I already explained this in the section above. The browser sets aside dedicated space on the hard disk as a cache to store copies of resources; the update strategy, as I described above, is implemented through the If-Modified-Since and Last-Modified fields.

Network cache

At first glance you might think this means caching on network routers or switches. Of course not: first, the small devices on the network cannot serve as a web caching system; second, network equipment does not speak HTTP, since the protocols running on network devices sit at the network layer, while web caching needs HTTP at the application layer. For these two reasons, network cache certainly does not mean a cache on network equipment. So what does it refer to? We can understand "network" here as the distance between the end user and the application server, and what sits in that distance is the proxy server. So, as I understand it, network caching refers to caching on proxy servers.

  • Proxy cache: proxies, as we know, come in two kinds, forward proxies and reverse proxies. The two proxy in much the same way; the difference lies in their scenarios and functions. A forward proxy lets a client inside a firewall reach outside resources it cannot access directly, and requires corresponding configuration on the client; in essence, a forward proxy server is an intermediary between the client and the origin server. A reverse proxy, by contrast, fronts multiple servers in order to provide a service to clients: the client does not need to know which IP actually provides the service, it only needs one interface to visit, and the reverse proxy provides that interface. Forward and reverse proxies have no essential connection; the names lead some people to assume the two are related, but the scenarios are completely different. In fact, proxy caching works much the same way as client caching; it just lives on the network.
  • Edge cache: to quote the book, "if a reverse proxy server can be placed on the same network as the user, a user accessing that reverse proxy server will get a very high-quality response speed, so such a reverse proxy cache can be called an edge cache". The most typical edge cache you have surely heard of: the CDN cache, which is really nothing more than reverse proxy caching distinguished by geographic location.

Server-side caching

These are relatively advanced caches; with my limited ability, I can only give a brief introduction.

  • Database cache: this is easy to understand. The DBMS caches the results of frequently issued SELECT queries, keyed by a hash of the query; on the next SELECT it first looks in the hash table, returns directly if there is an entry, and otherwise accesses the database once. Taking MySQL as an example, the query cache is not enabled by default; you need to set the Query Cache related parameters yourself (note that the query cache was deprecated in MySQL 5.7 and removed in MySQL 8.0).
  • Platform-level cache: my understanding is that this could also be called a framework-level cache, implemented by relying on third-party frameworks that you do not need to manage manually; the book's examples include Ehcache, JBoss Cache, and so on (I do not understand this point very well, so I will not say much; those interested can go to the original book).
  • Application-level cache: an application-level cache requires the developer to implement the caching mechanism in code, for example by using the Redis in-memory storage system as an application-level cache; MongoDB, Memcached, and so on can also serve as tools for application-level caching.

Origin blog.csdn.net/AngelLover2017/article/details/83006455