Study Notes for "Technical Architecture of Large Websites"

Chapter 1: The Evolution of Large Websites

1.1 Characteristics of large-scale website software systems

  High concurrency, high traffic

  High availability: uninterrupted service 24 hours a day, 7 days a week

  Massive data

  Wide distribution of users and complex network conditions

  Hostile security environment: large websites are attacked almost every day

  Rapidly changing requirements and frequent releases: large websites put new versions online every week

  Incremental development: large websites grow from small websites

 

1.2 Evolution and Development History of Large Website Architecture

Website architecture at the initial stage

  Only one server is used.

 

Separation of application services and data services

  Split the system across three servers: an application server, a database server, and a file server.

  The application server handles most of the business logic, so it needs a faster and more powerful CPU.

  The database server does fast disk retrieval and data caching, so it needs faster hard disks and more memory.

  The file server is used to store the user's files and requires a larger hard disk

 

Using caching to improve website performance

  Most business access is concentrated on a small subset of data.

  Local cache: fast, but limited by the application server's memory

  Distributed cache (server cluster): runs on dedicated large-memory servers
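
  The local-cache idea can be sketched as follows. This is a minimal illustration, not code from the book: the names LocalCache and read_through are hypothetical, the LRU eviction policy is an assumption, and the small capacity stands in for the application server's limited memory.

```python
from collections import OrderedDict


class LocalCache:
    """Tiny in-process LRU cache (hypothetical name).

    `capacity` stands in for the limited memory of one application server.
    """

    def __init__(self, capacity):
        self.capacity = capacity
        self._items = OrderedDict()

    def get(self, key):
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # evict the least recently used entry


def read_through(cache, key, load_from_db):
    """Serve from the cache when possible; otherwise load and remember."""
    value = cache.get(key)
    if value is None:
        value = load_from_db(key)
        cache.put(key, value)
    return value
```

  Because most access concentrates on a small subset of data, even a small cache like this absorbs most reads; a distributed cache applies the same lookup against a separate cluster instead of local memory.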

 

Using an application server cluster to improve the website's concurrent processing capacity

  Relieve load pressure by adding more application servers, which makes the system scalable.

  A load-balancing scheduling server distributes incoming requests to any server in the application server cluster.
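
  A load-balancing scheduler can be sketched as below. The notes do not name a scheduling algorithm, so round-robin is an assumption here, and RoundRobinBalancer and the server names are hypothetical.

```python
class RoundRobinBalancer:
    """Hands out application servers in rotation (hypothetical sketch)."""

    def __init__(self, servers):
        self._servers = list(servers)
        self._next = 0

    def add_server(self, server):
        # Scaling out: a new server joins the rotation immediately.
        self._servers.append(server)

    def pick(self):
        if not self._servers:
            raise RuntimeError("no application servers registered")
        server = self._servers[self._next % len(self._servers)]
        self._next += 1
        return server
```

  Usage: create RoundRobinBalancer(["app-1", "app-2"]), call pick() once per request, and call add_server("app-3") when load grows.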

 

Database read-write separation

  Even with a cache, some reads (cache misses and expired entries) and all writes still have to access the database.

  Configure master-slave databases. When the application server writes data, it accesses the master database, which synchronizes the updates to the slave database through the master-slave replication mechanism. When the application server reads data, it can get it from the slave database.
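
  The routing decision behind read-write separation can be sketched as below. This is only an illustration under assumptions: ReadWriteRouter is a hypothetical name, reads are assumed to be served by slaves, a statement is treated as a read if it starts with SELECT, and replication itself is left to the database.

```python
import random


class ReadWriteRouter:
    """Chooses a database for each SQL statement (hypothetical sketch)."""

    def __init__(self, master, slaves):
        self.master = master
        self.slaves = list(slaves)

    def route(self, sql):
        # Simplifying assumption: anything that starts with SELECT is a read.
        is_read = sql.lstrip().lower().startswith("select")
        if is_read and self.slaves:
            return random.choice(self.slaves)
        # Writes, and reads when no slave exists, go to the master.
        return self.master
```

  Real deployments also have to deal with replication lag, for example by sending a read that must see a just-written row to the master.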

 

Accelerate website response with reverse proxy and CDN

  Both work on the same basic principle: caching. The difference is where they are deployed: a CDN sits in the network provider's data center, while a reverse proxy sits in the website's own data center.

  If the reverse proxy server has cached the resource the user requested, it returns it directly to the user, reducing the pressure on the application servers.

  The purpose is to return data to the user as soon as possible.
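
  The caching behavior of a reverse proxy can be sketched as below. This is a toy model, not a real proxy: CachingReverseProxy, fetch_from_origin, and the time-to-live expiry rule are all assumptions made for illustration.

```python
import time


class CachingReverseProxy:
    """Answers from its cache when it can (hypothetical sketch)."""

    def __init__(self, fetch_from_origin, ttl_seconds=60.0, clock=time.monotonic):
        self._fetch = fetch_from_origin  # call into the application server
        self._ttl = ttl_seconds
        self._clock = clock
        self._cache = {}  # path -> (expiry time, response body)

    def handle(self, path):
        now = self._clock()
        entry = self._cache.get(path)
        if entry is not None and entry[0] > now:
            return entry[1]  # cache hit: the application server is not touched
        body = self._fetch(path)  # cache miss or expired: go to the origin
        self._cache[path] = (now + self._ttl, body)
        return body
```

  A CDN node follows the same hit-or-fetch logic; it simply runs closer to the user.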

 

Using distributed file systems and distributed database systems

  The usual approach is to split databases by business, deploying the data of different businesses on different servers.

  Distributed databases are only used when the scale of a single table is very large.
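
  The two levels of splitting can be sketched as below. All names here are hypothetical (BUSINESS_DATABASES, database_for_business, shard_for_row), and hashing the row key with CRC32 is only one assumed way to spread a large table over shards.

```python
import zlib

# Level 1: each business gets its own database server (hypothetical names).
BUSINESS_DATABASES = {
    "users": "db-users",
    "orders": "db-orders",
    "products": "db-products",
}


def database_for_business(business):
    """Pick the database server that holds one business's data."""
    return BUSINESS_DATABASES[business]


def shard_for_row(table, key, shard_count):
    """Level 2: spread the rows of one very large table over several shards."""
    # CRC32 is stable across processes, unlike Python's built-in hash() for str.
    index = zlib.crc32(str(key).encode("utf-8")) % shard_count
    return f"{table}_{index}"
```

  For example, database_for_business("orders") selects the server, and shard_for_row("orders", order_id, 4) would only come into play once the orders table itself becomes too large for one server.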

 

Using NoSQL and Search Engines

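  Access all kinds of data through a unified data access layer, sparing the application the trouble of managing many separate data sources.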
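
  A unified data layer can be sketched as a single entry point that forwards each request to the right backend. Everything named here is hypothetical (UnifiedDataLayer, register, query, and the backend kinds); real backends would be database, NoSQL, and search-engine clients rather than plain functions.

```python
class UnifiedDataLayer:
    """Single entry point that hides which store answers a request."""

    def __init__(self):
        self._backends = {}

    def register(self, kind, handler):
        # `handler` is any callable that takes a request and returns a result.
        self._backends[kind] = handler

    def query(self, kind, request):
        if kind not in self._backends:
            raise KeyError(f"no backend registered for {kind!r}")
        return self._backends[kind](request)
```

  Application code then calls query("search", ...) or query("nosql", ...) without knowing how many stores exist behind the layer or where they are deployed.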

 

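Business splitting

  Split the business of the whole website into separate product lines.

  A large shopping site, for example, splits its home page, shops, orders, buyers, and sellers into separate product lines, each owned by a different team.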

 

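Distributed services

  Shared business functions, such as user management and product management, can be extracted and deployed independently.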
