Design and practice of storage architecture for tens of millions of orders per day | JD Logistics Technology Team

1. Overview of the order system

1.1 Business scope

Business lines served: express delivery, express freight, small and medium-sized items, large items, cold chain, international, B2B contract logistics, CLPS, Jingxi, and the "three in, three out" flows (purchasing in, returns in, allocation in, sales out, return-to-supplier out, allocation out), among others.

1.2 Order center value

1. Decoupling (improving system stability)

Original system: Transactions and production were coupled, so a new business requirement touched multiple upstream and downstream systems, such as ECLP, external orders, waybills, and terminal systems. The logic of multiple business lines was also coupled, so a requirement change in one business line forced related changes to other business lines in the original system.

New system: Transactions and production operations are decoupled. Transaction-related requirements are handled within the order domain and production-side requirements within the production domain, which reduces upstream and downstream interaction.

Business line decoupling: Different business lines have different business processes. A requirement change in one business line is iterated only within that line's own process and does not affect other business lines, which improves the stability of the overall process and business.

2. Speed up onboarding of new businesses

The order center provides reusable standard capabilities to front-end businesses, so new businesses can be introduced faster.

The order center splits the large applications of the original system into combinations of small, abstracted applications and supports on-demand orchestration of business processes for different scenarios. By reusing the shared standard capabilities of the middle platform, new businesses can connect to the order center quickly and avoid rebuilding the same functions.

3. Provide a global unified data model

Original system: Orders were spread across multiple systems, including external orders, ECLP, and the large-item system. There were multiple sets of databases with inconsistent business semantics, which made data construction difficult.

New system: The order center defines a single standard data model for orders, so data from different businesses is stored in the same system. This reduces duplicated order-domain functions, avoids wasted resources, and breaks down departmental barriers. Data and processes can be managed and optimized centrally, and the order domain supplies standard data for group-level business analysis and for forecasting JD.com's future room for innovation.

2. Architecture introduction

2.1 Overall architecture design

Through the technology middle platform architecture upgrade project, the transaction system was rebuilt on a new four-layer architecture: access, transaction, fulfillment, and execution. The transaction order layer is the single entry point for the document flow of logistics service contracts between JD Logistics and its customers, and it is also responsible for dispatching orders to the downstream OFC (order fulfillment layer).

2.2 Real-time data layer architecture design

2.2.1 System interaction diagram

System interaction is as follows:



The order center's standard interfaces already give documents a single entry point at the upper layer, and we have unified the data layer in the same way.

Business architecture is decoupled from data: high-availability and high-performance concerns such as distributed databases, caches, and consistency are moved out of the scope of the business architecture, so the business architecture can focus on the business itself.

Persistence system: Persists data for operations such as order receipt, order modification, order cancellation, and order deletion.

Search system: Provides services such as order detail queries, order list queries, order status flow queries, and checks on whether an order is a Baichuan order.

Relay system: The data hub. It consumes the message queue and writes order data to Elasticsearch, HBase, and MySQL.

Data reconciliation system: Compares data across the multiple storage middleware systems to ensure eventual consistency.

Data synchronization system: Synchronizes the query conditions and list display fields needed for order list queries from the old system to the order center. This solves the paging problem that arises during the cutover, when order data lives in both the old and the new system.
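
The relay and reconciliation roles described above can be illustrated with a minimal sketch. It is not the order center's implementation: the in-memory stores stand in for Elasticsearch, HBase, and MySQL, and the per-order `version` field, `OrderMessage`, `relay`, and `reconcile` are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OrderMessage:
    order_no: str
    version: int   # assumed: a version that increases with every change to the order
    payload: dict


class InMemoryStore:
    """Stand-in for Elasticsearch, HBase, or MySQL (illustration only)."""

    def __init__(self, name):
        self.name = name
        self.rows = {}

    def upsert(self, msg):
        current = self.rows.get(msg.order_no)
        # Idempotent write: duplicates and older, out-of-order messages are ignored.
        if current is None or msg.version > current.version:
            self.rows[msg.order_no] = msg


def relay(msg, stores):
    """Fan one consumed message out to every target store."""
    for store in stores:
        store.upsert(msg)


def reconcile(order_no, stores):
    """Return the names of stores whose copy lags behind the newest version seen."""
    versions = {
        s.name: (s.rows[order_no].version if order_no in s.rows else -1)
        for s in stores
    }
    newest = max(versions.values())
    return [name for name, version in versions.items() if version < newest]
```

A reconciliation job built this way would replay or repair the order in every store that `reconcile` reports, which is one way to reach eventual consistency after a lost write.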

2.2.2 Technical architecture diagram



[Read-write separation] Adopts the read-write separation pattern (CQRS) to split order read and write traffic, which improves query performance and scalability and decouples reads from writes.
[Caching] Uses the distributed cache Redis to cache hot order data and order-related information, which improves concurrency and response time and reduces access to HBase. Three sets of high-performance caches (primary, backup, and temporary) improve the system's disaster tolerance.
[Message queue] Uses the JMQ message queue to process orders asynchronously, which raises system throughput; peak shaving reduces the pressure of direct requests on ES, HBase, and the databases. Different business scenarios (such as orders and returns) are isolated on different topics, which eases management and maintenance and allows messages to be processed in parallel and scaled horizontally, improving throughput and performance.
[Complex queries] Uses the search engine Elasticsearch for complex order queries. The order numbers are first obtained from Elasticsearch, and the order data is then read by order number from the distributed cache Redis and the columnar database HBase.
[Low-cost persistent storage] Uses the columnar database HBase to store data at massive scale with strong scalability.
[Data consistency] Achieved through strong transactions, eventual consistency, idempotence, compensation, distributed locks, version numbers, and similar mechanisms.
[Multi-tenant architecture] The system uses a multi-tenant data model and stores each tenant's data separately to ensure data isolation and security. Capacity and resources are expanded dynamically according to each tenant's needs, which supports horizontal scaling. By sharing infrastructure and resources, the multi-tenant architecture achieves higher resource utilization and lower cost.
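
The complex-query read path in the list above (search engine first, then cache, then columnar store) can be sketched as follows. This is an illustration under assumptions, not the order center's code: a list of dicts stands in for the Elasticsearch index, plain dicts stand in for Redis and HBase, and `OrderQueryService` and its method names are hypothetical.

```python
class OrderQueryService:
    def __init__(self, search_index, cache, column_store):
        self.search_index = search_index   # stand-in for Elasticsearch: list of indexed docs
        self.cache = cache                 # stand-in for Redis: dict keyed by order number
        self.column_store = column_store   # stand-in for HBase: dict keyed by order number
        self.store_reads = 0               # counts reads that fell through to the store

    def search_order_nos(self, **conditions):
        """Step 1: resolve the query conditions to order numbers only."""
        return [
            doc["order_no"]
            for doc in self.search_index
            if all(doc.get(key) == value for key, value in conditions.items())
        ]

    def get_detail(self, order_no):
        """Step 2: read the cache first, fall back to the store, then fill the cache."""
        detail = self.cache.get(order_no)
        if detail is None:
            self.store_reads += 1
            detail = self.column_store.get(order_no)
            if detail is not None:
                self.cache[order_no] = detail
        return detail

    def list_orders(self, **conditions):
        return [self.get_detail(no) for no in self.search_order_nos(**conditions)]
```

Keeping only query fields in the search index and full order bodies in the cache and the columnar store means the index stays small, and repeated list queries are served from the cache.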

2.3 Design advantages

2.3.1 High availability

Application servers, MySQL, Redis, HBase, JMQ, and so on are all deployed across data centers. Each ES cluster is deployed in a single data center, so primary and backup ES clusters are built in two data centers.
Isolation, rate limiting, circuit breaking, peak shaving, monitoring
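
As one illustration of the protections listed above, here is a minimal circuit breaker sketch. The article does not describe how circuit breaking is implemented in the order center, so the class, its thresholds, and the injectable `clock` parameter are all assumptions made for this example.

```python
import time


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_after=5.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold   # consecutive failures before opening
        self.reset_after = reset_after               # seconds to wait before a trial call
        self.clock = clock
        self.failures = 0
        self.opened_at = None                        # None means the breaker is closed

    def call(self, func, fallback):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                return fallback()                    # open: fail fast, protect the dependency
            # Cool-down elapsed: half-open, let this one call through as a trial.
        try:
            result = func()
        except Exception:
            self.failures += 1
            if self.opened_at is not None or self.failures >= self.failure_threshold:
                self.opened_at = self.clock()        # open, or re-open after a failed trial
            return fallback()
        self.failures = 0                            # success closes the breaker
        self.opened_at = None
        return result
```

While the breaker is open, callers get the fallback immediately instead of piling more requests onto a dependency that is already failing.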

2.3.2 High performance

High performance caching
Asynchronous

2.3.3 Massive data processing

Database and table sharding
Hot and cold data separation
Columnar storage (HBase)
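
Database and table sharding usually comes down to a deterministic routing function. The sketch below assumes routing by a hash of the order number; the article does not state the actual sharding key, shard counts, or naming, so `DB_COUNT`, `TABLES_PER_DB`, and the `order_db_*` / `order_*` names are hypothetical.

```python
import zlib

DB_COUNT = 4        # hypothetical number of databases
TABLES_PER_DB = 8   # hypothetical number of tables per database


def route(order_no):
    """Map an order number to a (database, table) pair, the same way on every call."""
    # crc32 is stable across processes, unlike Python's built-in hash() for strings.
    slot = zlib.crc32(order_no.encode("utf-8")) % (DB_COUNT * TABLES_PER_DB)
    db_index, table_index = divmod(slot, TABLES_PER_DB)
    return f"order_db_{db_index}", f"order_{table_index}"
```

Because the function is pure, both the write path and the read path can compute the target shard from the order number alone, without a lookup table.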

2.3.4 Data security

Sensitive information is stored encrypted. Logs, Redis, ES, MySQL, HBase, and so on all hold it in encrypted form, following the rule "whoever stores it encrypts it, and whoever uses it decrypts it."

3. Order data model

3.1 PDM model

In designing the order model, we followed the principles of unifying business attributes, abstracting general models, and consolidating common entities. The order model is divided mainly into the order's master record, product information, logistics service information, marketing information, financial information, customer channel information, shipping and receiving information, operation information, and extended information.







3.2 Model scalability

3.2.1 Extensibility design of standard model

An order carries dozens or even hundreds of flag fields. Adding a new field for every flag would bloat the order's business attributes and data model, corrode the model, and slow down development. Flags are therefore stored in key-value (KV) format and grouped by business domain, for example order flags, product flags, and marketing flags.
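
A minimal sketch of KV flag storage grouped by business domain follows. The article only says that a KV format is used; serializing the flags as JSON into a single column, and the helper names `set_flag` and `get_flag`, are assumptions made for illustration.

```python
import json


def set_flag(flags_column, domain, key, value):
    """Return the new serialized column value with one flag set in the given domain."""
    flags = json.loads(flags_column) if flags_column else {}
    flags.setdefault(domain, {})[key] = value
    return json.dumps(flags, sort_keys=True, separators=(",", ":"))


def get_flag(flags_column, domain, key, default=None):
    """Read one flag; a missing domain or key yields the default."""
    flags = json.loads(flags_column) if flags_column else {}
    return flags.get(domain, {}).get(key, default)
```

A new flag then needs no schema change: it is just another key inside the order, product, or marketing group.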

3.2.2 Personalized business model scalability

For personalized business needs, a configurable database field management scheme is provided. With some ready-made configuration, when an order is submitted, modified, or queried, the system uses business identity + business type + business field to look up the data model and the data extension code, that is, which table and which field hold the value. Each table reserves N extended attributes. The same extended attribute carries different meanings for different combinations of business identity + business type, which is how extended storage is achieved.
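
The lookup described above can be sketched as a configuration table keyed by business identity, business type, and business field. Every name below is hypothetical: the business identities, the field names, and the `order_main` / `order_goods` tables with `ext_*` columns are invented for the example and do not come from the article.

```python
# (business identity, business type, business field) -> (table, reserved column)
EXTENSION_CONFIG = {
    ("cold_chain", "forward", "temperature_zone"): ("order_main", "ext_1"),
    ("b2b", "forward", "contract_no"): ("order_main", "ext_1"),
    ("b2b", "forward", "pallet_count"): ("order_goods", "ext_2"),
}


def to_storage(identity, biz_type, fields):
    """Translate business fields into {table: {reserved_column: value}}."""
    rows = {}
    for name, value in fields.items():
        table, column = EXTENSION_CONFIG[(identity, biz_type, name)]
        rows.setdefault(table, {})[column] = value
    return rows


def from_storage(identity, biz_type, rows):
    """Translate stored reserved columns back into business fields."""
    result = {}
    for (ident, btype, name), (table, column) in EXTENSION_CONFIG.items():
        if (ident, btype) == (identity, biz_type) and column in rows.get(table, {}):
            result[name] = rows[table][column]
    return result
```

In this sketch `order_main.ext_1` holds a temperature zone for the cold chain business and a contract number for B2B, which mirrors the idea that one reserved attribute means different things to different businesses.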



4. Future and Challenges

4.1 Order personalized inquiry

Demand for personalized queries is growing, for example fuzzy queries and real-time aggregation over query conditions. If all ES indexes are placed in the same cluster, these queries affect the stability of the whole cluster. Once the indexes are split out, however, that business's data can no longer be queried and displayed together with data from other businesses.

4.2 Unitized architecture

The current TP99 of order persistence is 47 ms, compared with 20 ms when the request does not cross data centers. The data shows that cross-data-center access has a large impact on performance.

Unitization lets all services for related requests from the same user complete in a "closed loop" within one data center, eliminating cross-data-center access. With unitized deployment, each data center can be deployed in any region and new data centers can be added at any time. Through unitization, we will continue to strengthen the foundation of the order platform.
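
One common way to keep a user's requests inside a single unit is to route by a stable key through a slot table. The sketch below is an assumption about how such routing could look, not a description of the planned design: `UnitRouter`, `SLOT_COUNT`, and routing by customer ID are all hypothetical.

```python
import zlib

SLOT_COUNT = 64   # hypothetical fixed number of routing slots


class UnitRouter:
    def __init__(self, units):
        # Spread the slots round-robin over the initial units (data centers).
        self.slot_to_unit = [units[i % len(units)] for i in range(SLOT_COUNT)]

    def slot_for(self, customer_id):
        return zlib.crc32(customer_id.encode("utf-8")) % SLOT_COUNT

    def unit_for(self, customer_id):
        """Every request for the same customer resolves to the same unit."""
        return self.slot_to_unit[self.slot_for(customer_id)]

    def add_unit(self, new_unit, slots):
        """Hand the given slots to a new unit; all other slots keep their unit."""
        for slot in slots:
            self.slot_to_unit[slot] = new_unit
```

The indirection through slots is what allows a new unit to be added without remapping every customer: only the customers whose slots are handed over move.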

4.3 Hardware cost control

Average daily order volume keeps rising and the data keeps growing, which drives up hardware costs. Controlling that growth is a challenge now and will remain one. We plan to reduce data storage costs through data archiving, hot and cold data tiering, and other methods.
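
Hot and cold tiering can be sketched as a periodic job that moves long-finished orders out of the hot store. The retention period, the `finished_on` field, and the dict-based stores are assumptions for this example; the article does not specify the archiving rule.

```python
from datetime import date, timedelta

HOT_RETENTION_DAYS = 90   # hypothetical retention period for the hot tier


def archive_cold_orders(hot_store, cold_store, today, retention_days=HOT_RETENTION_DAYS):
    """Move orders finished before the cutoff from the hot tier to the cold tier."""
    cutoff = today - timedelta(days=retention_days)
    to_move = [
        order_no
        for order_no, order in hot_store.items()
        if order["finished_on"] is not None and order["finished_on"] < cutoff
    ]
    for order_no in to_move:
        cold_store[order_no] = hot_store[order_no]   # copy to the cold tier first
        del hot_store[order_no]                      # then drop from the hot tier
    return to_move
```

Orders that are still open (no `finished_on` date) always stay in the hot tier, whatever their age.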

Author: Wang Weidong of JD Logistics

Source: JD Cloud Developer Community, Ziyuanqishuo Tech. Please indicate the source when reprinting.


Origin my.oschina.net/u/4090830/blog/10114003