www.wikipedia.org
Compared with Baidu and Google at the same traffic level, the market value behind it is tens of billions of dollars, the employees are tens of thousands, there are countless servers, hundreds of servers, and more than ten maintenance personnel.
The website is built on LAMP.
Architecture Components:
GeoDNS: resolves domain names to the server closest to the user
LVS: Linux-based open source load balancing server
Squid: Linux-based open source reverse proxy server
Lighttpd: open source application server (lighter and faster, many websites use it as image server)
PHP: Web Application Development Language
Memcached: Open Source Distributed Cache System
Lucene: Open Source Full Text Search Engine
MySQL: Open Source Relational Database Management System
Performance optimization strategy:
Front-end performance optimization:
(Website front-end, generally including DNS service, CDN service, reverse proxy service, static resource service, etc.)
CDN service (cache hot entry content pages, deployed in the place closest to the client's browser)
-->LVS (load balancing) --> Reverse proxy server Squid cluster (core, cache hotspot entries)
-->LVS (Load Balancing) -->Apache Application Server Cluster
CDN Caching Guidelines:
1. Content pages do not include dynamic information
2. The content page has a unique REST style URL
3. The HTML response header writes the cache control information
Server performance optimization:
PHP server (hardware improvement)
APC (PHP bytecode cache module, accelerates code execution and reduces resource consumption)
Imagemagick (image processing conversion)
Tex (for text formatting, especially converting scientific formula content into image format)
Replace the PHP string query function strtr() and refactor it with a more optimized algorithm
Backend performance optimization:
Backend services (including cache, storage, database)
Main means: use cache
Cache usage policy:
1. The data in the hot spot is directly cached in the local memory of the application server
2. The content of the cached data should be in a format that the application server can directly use, such as HTML
3. Use a cache server to store session objects
4. Memcache persistent connection
MySQL optimization:
1. Use larger server memory
2. Use RAID0 disk array to speed up disk access (reduce persistent reliability of database, make up means, master-slave replication, asynchronous database backup, etc.)
3. The database transaction consistency is set at a low level to speed up the recovery from downtime
4. If the master database is down, immediately switch the application to the salve database and close the database write service.
(One step back in business, one big step forward in technology)