Chapter 1: The Evolution of Large Websites
1.1 Characteristics of large-scale website software systems
High concurrency, high traffic
High availability: uninterrupted 24/7 service
Massive data
Wide distribution of users and complex network conditions
Hostile security environment: large websites are attacked almost every day
Rapidly changing requirements and frequent releases: large websites ship new versions every week
Incremental development: large websites grow from small websites
1.2 Evolution and Development History of Large Website Architecture
Website architecture at the initial stage
Only one server is used: the application, the database, and the files all live on it.
Separation of application services and data services
Split the system across three servers: an application server, a database server, and a file server.
The application server needs to handle a lot of business, so it needs a faster and more powerful CPU.
The database server performs fast disk retrieval and data caching, so it needs faster disks and more memory.
The file server stores the files uploaded by users, so it needs larger disks.
Using caching to improve website performance
Most business access is concentrated on a small subset of data.
Local cache: fast, limited by application server memory
Distributed cache server (cluster): uses dedicated servers with large memory, so it is not limited by the memory of a single application server
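The local-cache idea above can be sketched as follows. This is a minimal illustration, not a production cache: the class names, the LRU eviction policy, and the FAKE_DATABASE dictionary are assumptions made for the example. It shows reads being served from a bounded in-memory cache, with the database consulted only on a miss.

```python
from collections import OrderedDict


class LocalCache:
    """A tiny in-process LRU cache, bounded like application server memory."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._items = OrderedDict()

    def get(self, key):
        if key not in self._items:
            return None
        # Mark the entry as most recently used.
        self._items.move_to_end(key)
        return self._items[key]

    def put(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)
        # Evict the least recently used entry once the cache is full.
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)


class CachedReader:
    """Reads through the cache and falls back to the database on a miss."""

    def __init__(self, cache, load_from_database):
        self.cache = cache
        self.load_from_database = load_from_database
        self.database_reads = 0

    def read(self, key):
        value = self.cache.get(key)
        if value is None:  # a stored None is also treated as a miss
            self.database_reads += 1
            value = self.load_from_database(key)
            self.cache.put(key, value)
        return value


# Hypothetical data store standing in for the database.
FAKE_DATABASE = {"item:1": "book", "item:2": "lamp", "item:3": "desk"}

reader = CachedReader(LocalCache(capacity=2), FAKE_DATABASE.__getitem__)
for key in ["item:1", "item:1", "item:2", "item:1", "item:3", "item:2"]:
    reader.read(key)
print("database reads:", reader.database_reads)
```

Because the capacity is tied to one process, frequently used data can still be evicted; a distributed cache would keep the same read path but move the storage to dedicated servers.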
Improve the concurrent processing power of your website using a cluster of application servers
Relieve load pressure by adding application servers as traffic grows, which makes the system scalable.
A load-balancing server distributes each incoming request to one of the servers in the application server cluster.
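A rough sketch of the dispatching step, assuming a simple round-robin policy (the notes do not say which scheduling algorithm is used) and hypothetical server names:

```python
class RoundRobinBalancer:
    """Hands out application servers in rotation (an assumed policy)."""

    def __init__(self, servers):
        self._servers = list(servers)
        self._next = 0

    def add_server(self, server):
        # Scaling out: a new server starts receiving requests immediately.
        self._servers.append(server)

    def pick(self):
        if not self._servers:
            raise RuntimeError("no application servers available")
        server = self._servers[self._next % len(self._servers)]
        self._next += 1
        return server


balancer = RoundRobinBalancer(["app-1", "app-2"])
print([balancer.pick() for _ in range(4)])

balancer.add_server("app-3")
print([balancer.pick() for _ in range(3)])
```

Any server can handle any request only if the application servers keep no per-user state locally, which is the usual precondition for this kind of cluster.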
Database read-write separation
Even with a cache, the reads that miss the cache and all writes still have to access the database.
Configure master-slave databases. When the application server writes data, it accesses the master database, which synchronizes the updates to the slave database through the master-slave replication mechanism. When the application server reads data, it can access the slave database instead.
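The routing rule can be sketched like this. It is a simplified model under stated assumptions: the Database and ReplicatedDatabase classes are hypothetical, databases are plain dictionaries, and replication is done synchronously inside write(), whereas real master-slave replication is typically asynchronous and slaves may lag behind the master.

```python
import random


class Database:
    """Hypothetical in-memory stand-in for one database server."""

    def __init__(self, name):
        self.name = name
        self.rows = {}


class ReplicatedDatabase:
    """Sends writes to the master and reads to a slave."""

    def __init__(self, master, slaves):
        self.master = master
        self.slaves = list(slaves)

    def write(self, key, value):
        self.master.rows[key] = value
        # Stand-in for master-slave replication (synchronous here for simplicity).
        for slave in self.slaves:
            slave.rows[key] = value

    def read(self, key):
        # Fall back to the master when no slave is configured.
        source = random.choice(self.slaves) if self.slaves else self.master
        return source.rows.get(key)


db = ReplicatedDatabase(Database("master"), [Database("slave-1"), Database("slave-2")])
db.write("user:42", {"name": "Alice"})
print(db.read("user:42"))
```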
Accelerate website response with reverse proxy and CDN
The basic principle of both is caching. The difference is where they sit: a CDN is deployed in the network provider's data centers, while a reverse proxy is deployed in the website's own data center.
If the reverse proxy server has cached the resource requested by the user, it returns the resource directly, reducing the pressure on the application servers.
The purpose is to return data to the user as soon as possible.
Working with distributed file systems and distributed database systems
Usually, the database is split by business, so that the data of different businesses is deployed on different servers.
A distributed database is used only when a single table grows very large.
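The two levels of splitting can be illustrated with a short sketch. The server names, the business-to-database mapping, and the CRC32-modulo sharding rule are all assumptions for illustration; the notes do not specify how a large table is partitioned.

```python
import zlib

# Hypothetical mapping: each business has its own database server.
BUSINESS_DATABASES = {
    "users": "db-users",
    "orders": "db-orders",
    "products": "db-products",
}

# Hypothetical shards for one table that has grown too large for a single server.
ORDER_SHARDS = ["db-orders-0", "db-orders-1", "db-orders-2", "db-orders-3"]


def database_for_business(business):
    """Split by business: pick the server that holds this business's data."""
    return BUSINESS_DATABASES[business]


def shard_for_row(shards, row_key):
    """Split one large table: map a row key to a shard with a stable hash."""
    index = zlib.crc32(row_key.encode("utf-8")) % len(shards)
    return shards[index]


print(database_for_business("users"))
print(shard_for_row(ORDER_SHARDS, "order-1001"))
```

Modulo hashing is the simplest choice but remaps most keys when the number of shards changes, which is one reason splitting by business is usually tried first.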
Using NoSQL and Search Engines
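Access all the different kinds of data through one unified data access layer, which spares the application the trouble of managing many databases.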
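A unified data access layer might look roughly like the sketch below. Everything here is hypothetical: InMemoryStore stands in for a relational database, a NoSQL store, or a search engine, and the register/get/put interface is an assumption chosen to keep the example small.

```python
class InMemoryStore:
    """Hypothetical stand-in for a relational database, NoSQL store, or search engine."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def put(self, key, value):
        self._data[key] = value


class DataAccessLayer:
    """Single entry point that hides which backend holds which kind of data."""

    def __init__(self):
        self._backends = {}

    def register(self, kind, backend):
        self._backends[kind] = backend

    def get(self, kind, key):
        return self._backends[kind].get(key)

    def put(self, kind, key, value):
        self._backends[kind].put(key, value)


data = DataAccessLayer()
data.register("orders", InMemoryStore())    # e.g. a relational database
data.register("sessions", InMemoryStore())  # e.g. a NoSQL store

data.put("orders", "order-1001", {"total": 25})
data.put("sessions", "abc123", {"user": "alice"})
print(data.get("orders", "order-1001"))
```

The application code only names the kind of data it wants; swapping a backend means changing one register() call rather than every caller.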
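Business splitting
Split the entire website's business into different product lines.
A large shopping website, for example, splits the home page, shops, orders, buyers, and sellers into different product lines, each owned by a different team.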
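Distributed services
Shared business functions, such as user management and product management, can be extracted and deployed independently.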