Case 2: Evolutionary Design of Wikipedia High-Performance Architecture

www.wikipedia.org  

     Compared with Baidu and Google at the same traffic level, the market value behind it is tens of billions of dollars, the employees are tens of thousands, there are countless servers, hundreds of servers, and more than ten maintenance personnel.

     The website is built on LAMP.

     Architecture Components:

            GeoDNS: resolves domain names to the server closest to the user

            LVS: Linux-based open source load balancing server

            Squid: Linux-based open source reverse proxy server

            Lighttpd: open source application server (lighter and faster, many websites use it as image server)

            PHP: Web Application Development Language

            Memcached: Open Source Distributed Cache System

            Lucene: Open Source Full Text Search Engine

            MySQL: Open Source Relational Database Management System

      Performance optimization strategy:

      Front-end performance optimization:

             (Website front-end, generally including DNS service, CDN service, reverse proxy service, static resource service, etc.)

              CDN service (cache hot entry content pages, deployed in the place closest to the client's browser)

              -->LVS (load balancing) --> Reverse proxy server Squid cluster (core, cache hotspot entries)

              -->LVS (Load Balancing) -->Apache Application Server Cluster

              CDN Caching Guidelines:

                      1. Content pages do not include dynamic information

                      2. The content page has a unique REST style URL

                      3. The HTML response header writes the cache control information

       Server performance optimization:

              PHP server (hardware improvement)

              APC (PHP bytecode cache module, accelerates code execution and reduces resource consumption)

              Imagemagick (image processing conversion)

              Tex (for text formatting, especially converting scientific formula content into image format)

              Replace the PHP string query function strtr() and refactor it with a more optimized algorithm

       Backend performance optimization:

              Backend services (including cache, storage, database)

              Main means: use cache

              Cache usage policy:

                     1. The data in the hot spot is directly cached in the local memory of the application server

                     2. The content of the cached data should be in a format that the application server can directly use, such as HTML

                     3. Use a cache server to store session objects

                     4. Memcache persistent connection

              MySQL optimization:

                     1. Use larger server memory

                     2. Use RAID0 disk array to speed up disk access (reduce persistent reliability of database, make up means, master-slave replication, asynchronous database backup, etc.)

                     3. The database transaction consistency is set at a low level to speed up the recovery from downtime

                     4. If the master database is down, immediately switch the application to the salve database and close the database write service.

                          (One step back in business, one big step forward in technology)

 

 

 

 

 

 

Guess you like

Origin http://10.200.1.11:23101/article/api/json?id=326943958&siteId=291194637