Exploration and Practice of JD's Unified Header and Footer Management System

System background

Q: How long does it take to revise the copy on a website? For a small personal site it is simple: the change can be made and published in a few minutes. But what if you need to revise the copy on hundreds of websites? Then the product team has to raise a requirement, R&D has to schedule development, and testing has to run regression verification. Because many applications are involved and each has its own development backlog, the copy change may not get scheduled quickly. A requirement that looks trivial becomes a nightmare for product managers once many applications and departments are involved. This is especially true for a large shopping site like JD.com (Jingdong Mall): behind every page you browse there are many business systems, each maintained by a dedicated R&D team. To keep a unified page style, these business systems usually share the same page header and footer, which we call the common header and footer.

Take the page footer that JD.com currently uses across its sites as an example. To change the copy or links in that footer, you would have to push hundreds of systems and their R&D teams to schedule the change and release it. JD.com's unified header and footer management system was built to solve this problem, and it makes it possible to change the content of JD.com's common header and footer in about five minutes.

Overall System Architecture Design

The system consists of two parts. The first is the management console, which is used to manage JD.com's common header and footer files and the business systems, to configure which header and footer files each business system depends on, and to distribute those files to the business systems. The second is the header and footer client, which fetches the header and footer files that a business system depends on, parses and renders the page, and outputs the content of the latest version of those files. To support business systems written in different languages, the client comes in two forms: a Java client and an Nginx client. The Java client supports business systems written in Java. It can process static HTML as well as page templates for engines such as JSP, Velocity, FreeMarker and Thymeleaf. The Nginx client supports business systems written in other languages and handles page template parsing and common header and footer rendering for them.

Management console design and implementation

The management console separates front end and back end: the back end only provides HTTP interfaces, and the front end is only responsible for page rendering. The console is divided into three modules: the file management module, the application management module and the personal center module.

File management module

This module maintains the common header and footer files. Users can create and save header and footer HTML content in the management console, and every file is under version control, so users can edit, publish and roll back header and footer files from the console.
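The publish and rollback behavior described above can be pictured with a small in-memory model. This is only a sketch: the article does not describe how versions are stored, so the class name VersionedFragment, its methods and the "version number starts at 1" convention are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical in-memory model of one versioned header or footer file. */
class VersionedFragment {
    // Index i holds version i + 1; saved versions are never overwritten.
    private final List<String> versions = new ArrayList<>();
    // 0 means that no version has been published yet.
    private int published = 0;

    /** Saves new content as the next version and returns its number (1-based). */
    int save(String html) {
        versions.add(html);
        return versions.size();
    }

    /** Makes an existing version the one that clients receive. */
    void publish(int version) {
        if (version < 1 || version > versions.size()) {
            throw new IllegalArgumentException("unknown version " + version);
        }
        published = version;
    }

    /** Rolling back means publishing an earlier version again. */
    void rollbackTo(int version) {
        if (version >= published) {
            throw new IllegalArgumentException("not an earlier version: " + version);
        }
        publish(version);
    }

    int publishedVersion() {
        return published;
    }

    String publishedContent() {
        if (published == 0) {
            throw new IllegalStateException("nothing published yet");
        }
        return versions.get(published - 1);
    }

    public static void main(String[] args) {
        VersionedFragment footer = new VersionedFragment();
        int v1 = footer.save("<footer>old links</footer>");
        int v2 = footer.save("<footer>new links</footer>");
        footer.publish(v2);
        footer.rollbackTo(v1); // the earlier content is served again
        System.out.println(footer.publishedContent());
    }
}
```

Keeping every saved version immutable is what makes rollback cheap in this model: nothing is restored or copied, the published pointer simply moves back.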

Application Management Module

This module maintains the business systems. Users can add new applications in the management console, create configuration environments, configure which common header and footer files a business system depends on, and view application information as well as the requests made by the header and footer clients that each business application has integrated.

Personal center module

This module records the operation logs of management console users, covering both file operations and application operations, and provides an operation log query function. It also handles online approval of publish operations on common header and footer files.

Header and footer client design and implementation

The management console described above takes care of creating, maintaining and versioning the header and footer files. The next question is how business systems reference those files. The first problem to solve is how to distribute the files created in the management console to each business system. There are two main options: the header and footer system pushes the files, or the business systems pull them.

Header and footer system push method:

In this method, every machine of every business system acts as a server and the header and footer system acts as the client. When a header or footer file is updated, the header and footer system actively connects to the servers of each business system and, once connected, sends them the latest file content. To make sure the header and footer system can connect at any time, each business system has to listen on a fixed port and keep that service running; otherwise the push may fail. In reality, JD.com has many business systems, with diverse deployment environments and different development languages. Building this server component therefore means solving the cross-language problem first: JD.com currently develops in Java, JavaScript, PHP, Golang and Lua, so we would have to provide and maintain the server component in five languages. In addition, because business systems already listen on many ports, the server component risks a port conflict at startup, which would prevent it from starting and leave the header and footer files un-updated. The push method does have advantages: a connection is established only when a file actually needs to be updated, so updates take effect in real time and server resources are saved.

Business system pull method:

This method is the opposite of the push method. Using the header and footer system as the server removes the startup failures caused by port conflicts, but the client still faces the cross-language problem. However, our survey of the business systems showed that almost all of them use Nginx as a reverse proxy, and Nginx gives us a way to support business systems regardless of language. Beyond that, we only need to develop a Java version of the client, which Java business systems import to pull the header and footer files. The pull method also has drawbacks: the client does not know when the files will be updated, so it can only poll the header and footer system periodically to check for updates and pull the new content when a file has changed. As a result, updates do not take effect in real time, and the periodic polling consumes some server and network resources.

After weighing both options, we chose the pull method to distribute the header and footer files. To solve the cross-language problem, we provide two versions of the client, the Nginx client and the Java client, which together cover the pull needs of essentially all business systems. How a business system then references the pulled files involves SSI (Server Side Includes). The following sections describe how each of the two clients solves both pulling the files and including them through SSI.

Nginx header and footer client

This client relies mainly on Nginx's SSI module to solve both pulling the header and footer files and including them. ngx_http_ssi_module is an Nginx filter that processes SSI (Server Side Includes) commands in the responses passing through it. The client currently uses the include command, configured as in this example:

  <!--# include file="/fragment/footer.html" -->

Pages in a business system use this include command to reference the header and footer files maintained in the header and footer system. However, the command expects the referenced file to be available on the local server before Nginx can load it and substitute it into the page, so the command alone cannot bring in files configured in the header and footer system. The file name given in the include command also has to be converted into a URL, which is then requested from the header and footer server to obtain the corresponding version of the file. Nginx's URL rewriting and reverse proxy configuration are used for this, which solves the problem of pulling the files. At this point a complete SSI function for header and footer files is in place, and pages that include those files are displayed in full by the business system.

Some problems remain, however. The requests for header and footer files are triggered passively when a user browses a page, and Nginx fetches the files synchronously through the reverse proxy. The response time of the header and footer system therefore directly affects the page load time of the business system: if the header and footer system times out, the business system's pages time out too. On top of that, all page traffic of the business systems (including the JD homepage and product detail pages) would hit the header and footer system, which is more traffic than it can bear. We therefore need to reduce the number of requests coming from the business systems. Since the content of the header and footer files does not change very often, Nginx's local cache, proxy_cache, can be used. When a user browses a business system page, the files are served from the local cache first, and only after the cache time expires does Nginx request the latest files from the header and footer system. With the proxy cache configured, thousands of user requests are reduced to a single request per cache period on each business system server, which greatly lowers the request pressure on the header and footer system. In addition, the proxy_cache_use_stale setting reduces the risk of depending on the header and footer system: even if that system is down, the business system can still load and display its header and footer files. The following is a partial configuration example for the Nginx client:

location ~ ^/fragment/ {
    proxy_cache header_cache;
    proxy_cache_key $uri;
    proxy_cache_valid 200 1m;
    proxy_cache_use_stale error timeout updating http_500 http_502 http_503 http_504;
    proxy_cache_lock on;
    proxy_cache_lock_timeout 1s;
    proxy_connect_timeout 1s;
    proxy_ignore_headers Set-Cookie Cache-Control;
    proxy_hide_header Cache-Control;
    proxy_hide_header Set-Cookie;
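    # Refer to the configuration in the header and footer system, and take care to distinguish the test environment from production.
    # Returned file content is UTF-8 encoded by default; if GBK content is needed, append the parameter ?charset=GBK after env.
    # Only {appId}, {token} and {env} need to be replaced.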
    rewrite ^/fragment/(.*) /open/fragment/$1/Nginx/$nginx_version/$server_addr/{appId}/{token}/{env} break;
    proxy_set_header Accept-Encoding "";
    add_header X-Cache-Status $upstream_cache_status;
    proxy_pass   http://xxx.jd.local;
}
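
The caching behavior that this configuration asks of Nginx can be modeled in a few lines of Java to make the semantics concrete. This is a simplified sketch and not how Nginx is implemented: StaleOnErrorCache, its constructor parameters and the injected clock are hypothetical names chosen for illustration. It mirrors a time-limited cache (as with proxy_cache_valid) that falls back to an expired copy when the origin fails (as with proxy_cache_use_stale); it does not model request coalescing (proxy_cache_lock) or serving stale content while an update is in progress.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.LongSupplier;

/** Hypothetical model of "cache for a fixed time, serve stale on origin failure". */
class StaleOnErrorCache {
    private static final class Entry {
        final String body;
        final long storedAtMillis;

        Entry(String body, long storedAtMillis) {
            this.body = body;
            this.storedAtMillis = storedAtMillis;
        }
    }

    private final Map<String, Entry> entries = new ConcurrentHashMap<>();
    private final Function<String, String> origin; // stands in for the header and footer system
    private final long ttlMillis;                  // plays the role of the proxy_cache_valid period
    private final LongSupplier clock;              // injected so that the sketch is easy to test

    StaleOnErrorCache(Function<String, String> origin, long ttlMillis, LongSupplier clock) {
        this.origin = origin;
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    String get(String uri) {
        long now = clock.getAsLong();
        Entry cached = entries.get(uri);
        if (cached != null && now - cached.storedAtMillis < ttlMillis) {
            return cached.body; // still fresh: the origin is not contacted
        }
        try {
            String body = origin.apply(uri);
            entries.put(uri, new Entry(body, now));
            return body;
        } catch (RuntimeException e) {
            if (cached != null) {
                return cached.body; // origin failed: fall back to the expired copy
            }
            throw e; // nothing cached yet, so the failure is visible to the caller
        }
    }

    public static void main(String[] args) {
        StaleOnErrorCache cache = new StaleOnErrorCache(
                uri -> "<footer>content for " + uri + "</footer>",
                60_000,
                System::currentTimeMillis);
        System.out.println(cache.get("/fragment/footer.html"));
    }
}
```

The one case the fallback cannot cover is a cold cache: if the origin is unavailable before the first successful fetch, there is no stale copy to serve.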

Java header and footer client

The Nginx client described above solves the SSI problem for header and footer files, but Nginx performs SSI processing when the user visits the page, as a synchronous call inside the user request. Even with the local cache, it still affects the page's response time. To remove this performance cost, we developed a Java version of the client that implements the SSI function for header and footer files. Its startup process is as follows:

First, the business system adds the Java client's jar package as a dependency and then configures the application ID, access token, environment identifier, page template path and template file suffix registered in the header and footer system. Once configured, the client starts together with the business system. During startup, the client first downloads the header and footer files configured in the header and footer system to a local directory on the business system server. It then starts an asynchronous thread that polls the header and footer system for updates; when a file has changed, the latest version is downloaded straight to the local directory.
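
The startup sequence above (download once, then poll in a background thread) might look roughly like the following sketch. The article does not describe the client's real API, so FragmentSource, FragmentPoller and their methods are hypothetical, as is the assumption that the server exposes a version string that can be compared cheaply before downloading.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Hypothetical view of the header and footer server; the real interface is not given in the article. */
interface FragmentSource {
    String latestVersion(String fileName);

    String download(String fileName);
}

/** Downloads one file at startup and then polls for newer versions in the background. */
class FragmentPoller {
    private final FragmentSource source;
    private final Path localDir;
    private final String fileName;
    private String localVersion; // null until the first successful download
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor(task -> {
                Thread thread = new Thread(task, "fragment-poller");
                thread.setDaemon(true); // must not keep the business system alive
                return thread;
            });

    FragmentPoller(FragmentSource source, Path localDir, String fileName) {
        this.source = source;
        this.localDir = localDir;
        this.fileName = fileName;
    }

    /** Downloads synchronously once, then checks again after every period. */
    void start(long periodSeconds) {
        syncOnce();
        scheduler.scheduleWithFixedDelay(() -> {
            try {
                syncOnce();
            } catch (RuntimeException e) {
                // Keep the previous local copy and try again on the next round.
            }
        }, periodSeconds, periodSeconds, TimeUnit.SECONDS);
    }

    /** Returns true if a newer version was written to the local directory. */
    synchronized boolean syncOnce() {
        String remoteVersion = source.latestVersion(fileName);
        if (remoteVersion.equals(localVersion)) {
            return false; // nothing changed, so skip the download
        }
        try {
            Files.createDirectories(localDir);
            byte[] content = source.download(fileName).getBytes(StandardCharsets.UTF_8);
            Files.write(localDir.resolve(fileName), content);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        localVersion = remoteVersion;
        return true;
    }

    void stop() {
        scheduler.shutdownNow();
    }
}
```

In this sketch a failed poll is swallowed so that the business system keeps serving the last file it downloaded, while a failure of the very first download still propagates out of start().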

Once the header and footer files are available locally, the client scans the configured page template path for files with the configured suffix, loads every template file that contains SSI commands into memory, and creates backup copies of those template files. It then parses the include commands in the in-memory templates and replaces each one with the content of the header or footer file named in the command, producing a new template file. After template parsing is complete, the client registers and starts a header and footer file observer that watches for updates to those files. When a file is updated, the in-memory template content is parsed again and a new template file is generated. Almost all of this happens while the business system is starting up, so when a user requests a page, the business system can return the prepared template file directly. SSI processing is kept out of the user request, and the performance cost to the business system is essentially zero.
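
The include replacement and the re-rendering triggered by the observer can be sketched as follows. This is a minimal illustration held entirely in memory, not the client's actual code: SsiTemplateRenderer and its methods are hypothetical names, and the regular expression only recognizes the include file="..." form shown earlier in this article.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical renderer that expands SSI include commands ahead of any user request. */
class SsiTemplateRenderer {
    // Matches commands such as <!--# include file="/fragment/footer.html" -->
    private static final Pattern INCLUDE =
            Pattern.compile("<!--#\\s*include\\s+file=\"([^\"]+)\"\\s*-->");

    private final Map<String, String> sources = new HashMap<>();   // original templates, kept as the backup
    private final Map<String, String> rendered = new HashMap<>();  // templates with includes expanded
    private final Map<String, String> fragments = new HashMap<>(); // include path -> latest content

    /** Registers a template and expands it with the fragments known so far. */
    synchronized void addTemplate(String name, String source) {
        sources.put(name, source);
        rendered.put(name, expand(source));
    }

    /** Observer callback: a header or footer file changed, so every template is rendered again. */
    synchronized void onFragmentUpdated(String path, String content) {
        fragments.put(path, content);
        for (Map.Entry<String, String> template : sources.entrySet()) {
            rendered.put(template.getKey(), expand(template.getValue()));
        }
    }

    /** Returns the pre-rendered template, or null if the name is unknown. */
    synchronized String rendered(String name) {
        return rendered.get(name);
    }

    private String expand(String source) {
        Matcher matcher = INCLUDE.matcher(source);
        StringBuffer out = new StringBuffer();
        while (matcher.find()) {
            String content = fragments.get(matcher.group(1));
            // An include whose file is not known yet is left in place unchanged.
            String replacement = content != null ? content : matcher.group();
            matcher.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        matcher.appendTail(out);
        return out.toString();
    }
}
```

Re-rendering always starts from the original source rather than from the previous output, which is why the untouched template has to be kept: once an include command has been replaced, the rendered text no longer shows where the fragment began or ended.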

Author: Jingdong Retail Cao Zhifei

Source: JD Cloud Developer Community