Crossing the divide in database development: distributed database technology trends

  1. The financial sector's need to restructure its infrastructure
    With the development of mobile technology and the Internet, the business models and technology systems of China's financial industry have gradually taken a path quite different from that of the Western world. Mobile penetration in Europe and the United States is far lower than in China, and the population bases differ by orders of magnitude. As a result, financial institutions at home and abroad face very different business types, data volumes, and concurrency levels, and their IT infrastructure needs differ accordingly.

In the last couple of years, some technologically advanced domestic banks have pioneered the exploration of microservices and distributed technology. Some new Internet financial services have begun to use microservice architecture, distributed technology, and DevOps frameworks to develop and maintain applications. Some banks are even planning to introduce a distributed architecture into their next-generation core systems, to cope with future business pressure and growing data volumes.

Compared with the new generation of distributed architecture, the traditional "chimney" (siloed) structure of middleware plus database runs into many problems when business applications demand high concurrency and fast response over massive data:

• From the business and system point of view, complex business lines lead to a large number of scattered systems whose data is completely isolated and cannot be shared;
• System-level scalability is inflexible and performance bottlenecks are obvious: hardware limits are easily hit, and capacity cannot expand elastically to meet business needs;
• The system cannot respond quickly to sudden bursts of massive requests, such as the instantaneous surge during the Double Eleven (November 11) shopping festival or flash-sale ("spike") services;
• Procurement and operation and maintenance costs are high: minicomputers, other hardware, and software are each procured and maintained independently, resulting in a high total cost of ownership;
• Self-control is lacking: the heavy dependence on foreign vendors means local support teams find it difficult to resolve serious problems quickly, which increases the risk to production operations.

  1. Evolution of bank architecture
    Over the past two decades, China's banking IT infrastructure has gone through several stages. The first generation of core banking systems was built on mainframes, using a typical large centralized architecture. As SOA concepts emerged, some banks began to move away from the mainframe, migrating the core banking system from the mainframe or AS/400 to UNIX minicomputers. As virtualization technology matured, some banks and financial institutions introduced virtualization into their infrastructure, deploying application development environments and even some production environments on virtual machines.

Today, many banks have built big data platforms on distributed, PC-server-based architectures. Some microservice-based application systems go further, packaging business logic in containers and combining them with distributed storage and database technology in the back end to achieve an end-to-end distributed architecture.

Just as many bank technology departments experienced when moving from large centralized core systems to SOA, the transition from today's minicomputer systems to a distributed architecture also brings enormous challenges, such as choosing a technology stack, building the application development framework, and building a DevOps system.

When application development moves from a traditional architecture to a distributed one, the first thing that has to change is the application framework. Microservice frameworks are now well established, and a representative architecture often includes four layers: protocol processing, service assembly, atomic services, and underlying persistence. Business logic is broken out of the traditional monolithic middleware into many microservice modules. Each module is made up of a set of containers, so the processing throughput of a service can be expanded simply by adding containers.

But splitting an application into microservices means that each service has its own independent execution logic and data storage. From the database perspective, this split poses a great challenge for the storage system. If each microservice's data is still stored in a conventional single-point database, storage and processing capacity cannot scale in step with the growing number of microservice containers. In that case, the database becomes the bottleneck for the performance and scalability of the whole microservice framework.

And if each microservice uses a separate database for storage, the enterprise's IT data architecture becomes fragmented. The number of databases grows from a handful to hundreds or even thousands, and both the operations team's management costs and database procurement costs rise geometrically.

Thus, the goal of a distributed database is not merely to replace a traditional single Oracle or DB2 instance by spreading data that no longer fits onto multiple physical storage units. In practice, most banks have a relatively complete data lifecycle management strategy and generally do not accumulate large amounts of historical data in production, so data volume is usually not the most important reason for adopting a distributed database.

  1. Distributed database system architecture
    The core value of a distributed database is to provide an elastically scalable pool of data service resources for distributed applications; it can also be called a DBPaaS platform. Its main capability is to offer the many thousands of upper-layer microservices, which come from different developers and differ in business type, SLA and security level, and data type, a database service platform that scales elastically, responds quickly, and is easy to maintain. It must also support a series of data isolation and governance mechanisms across microservices: high-availability configuration, disaster recovery policy definition, multi-tenancy, physical isolation of data by business logic, isolation of mixed transactional and analytical workloads, and hot/cold data isolation.

In some Internet companies with microservice architectures, a database operations team of just over 20 people can support hundreds or even thousands of database instances. The core of their operations work is building a unified DBPaaS platform, where the distributed database's self-healing and elastic expansion mechanisms simplify the work of managing databases at scale.

There are many distributed database products in the industry today, and they fall mainly into three architectural families.

 Application-level vertical split
Application-level vertical splitting is one of the most traditional distributed approaches. One implementation breaks the application into multiple independent sub-services, each owning part of the overall data. Another connects the application to multiple databases and selects the data source inside the application according to business rules. For example, an application may partition by user account ID, with user IDs 1 to 1,000,000 stored in database A, user IDs 1,000,001 to 2,000,000 in database B, and so on.

This mechanism builds a routing rule into the application: each data access first uses the rule to find the database instance that holds the target data, then connects to that instance directly. With this mechanism, cross-database transactions are extremely difficult to implement. It is also highly invasive to the application: even basic business logic requires a lot of custom development, and every capacity expansion requires an end-to-end review of the application logic, which brings risk and may require further development work.
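As a rough illustration of the routing rule described above, here is a minimal Python sketch. The ID ranges come from the example in the text; the names `ROUTES`, `pick_database`, `database_A`, and `database_B` are hypothetical and chosen only for this sketch.

```python
from bisect import bisect_left

# Hypothetical routing table: (highest user ID in the range, database name).
# The ranges follow the example in the text; the names are illustrative.
ROUTES = [
    (1_000_000, "database_A"),  # user IDs 1 to 1,000,000
    (2_000_000, "database_B"),  # user IDs 1,000,001 to 2,000,000
]
UPPER_BOUNDS = [upper for upper, _ in ROUTES]


def pick_database(user_id: int) -> str:
    """Return the name of the database that holds the given user ID."""
    if user_id < 1:
        raise ValueError(f"user ID must be positive, got {user_id}")
    # Find the first range whose upper bound is >= user_id.
    index = bisect_left(UPPER_BOUNDS, user_id)
    if index == len(ROUTES):
        raise LookupError(f"no database configured for user ID {user_id}")
    return ROUTES[index][1]
```

Because the table lives inside the application, adding a database C means changing `ROUTES` in every application that embeds it, which is one way to picture the invasiveness the text describes.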

 Sharding middleware (sub-library, sub-table)
As demand for distributed storage capacity spread, another type of technology gradually emerged in the industry: sharding middleware, which splits data into sub-databases and sub-tables. The idea is to build a SQL parsing service between the application and the database. The service parses a conventional SQL statement, translates it into sub-queries for the underlying databases, and then sends those sub-queries to the traditional databases for execution.

The advantage of this mechanism is that data can continue to be stored in traditional relational databases unchanged, while the interface presented to the application is encapsulated to some degree. From an industry point of view, however, sharding middleware can be seen as a transitional stage between traditional single-point databases and distributed databases. Before native distributed databases built on PC servers became widespread, applications with fast-growing data volumes and traffic could use this approach to relieve the pressure. Once native distributed databases mature and their advantages are proven, however, the approach will be hard to sustain. In addition, the technology cannot be 100% transparent to the application: in general, the application must specify some parameters or use special syntax when assembling SQL, so it is hard to keep the application completely unaware of the sharding.
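To make the routing idea concrete, the following Python sketch stands in for such middleware. It is only an illustration under stated assumptions: real products parse arbitrary SQL, whereas this toy handles three fixed operations; the class `ShardingProxy`, the `account` table, the modulo-on-`user_id` rule, and the use of in-memory SQLite databases as stand-ins for separate traditional databases are all invented for the example.

```python
import sqlite3

SHARD_COUNT = 4  # assumed number of underlying databases


class ShardingProxy:
    """Toy stand-in for sharding middleware over one logical table."""

    def __init__(self) -> None:
        # One in-memory SQLite database per shard.
        self.shards = [sqlite3.connect(":memory:") for _ in range(SHARD_COUNT)]
        for conn in self.shards:
            conn.execute(
                "CREATE TABLE account (user_id INTEGER PRIMARY KEY, balance INTEGER)"
            )

    def _shard_for(self, user_id: int) -> sqlite3.Connection:
        # Assumed sharding rule: user_id modulo the shard count.
        return self.shards[user_id % SHARD_COUNT]

    def insert(self, user_id: int, balance: int) -> None:
        # The shard key is known, so the statement goes to exactly one shard.
        self._shard_for(user_id).execute(
            "INSERT INTO account (user_id, balance) VALUES (?, ?)",
            (user_id, balance),
        )

    def get_balance(self, user_id: int):
        row = self._shard_for(user_id).execute(
            "SELECT balance FROM account WHERE user_id = ?", (user_id,)
        ).fetchone()
        return row[0] if row else None

    def total_balance(self) -> int:
        # No shard key in the query: fan out to every shard and merge.
        total = 0
        for conn in self.shards:
            (subtotal,) = conn.execute(
                "SELECT COALESCE(SUM(balance), 0) FROM account"
            ).fetchone()
            total += subtotal
        return total
```

Keyed operations touch a single shard, while the aggregate has to visit all of them and merge the partial results in the middleware. A transaction spanning two `user_id` values on different shards would need extra coordination that this sketch does not attempt.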

 Native distributed database
Unlike sharding middleware, a native distributed database is rebuilt for PC servers directly from the underlying storage engine, and is optimized for distributed storage and execution in areas such as data storage structure, data security, and distributed transaction control.

A fully native distributed database is developed from scratch at the lowest level. It abandons minicomputer systems altogether and is designed for PC-server hardware, with high availability, disaster recovery, distribution, and similar mechanisms built into every part of the storage system. For instance, some distributed database products can remain 100% compatible with MySQL while providing distributed storage that is completely transparent to the application. From a developer's point of view, there is no need to worry about a table holding hundreds of millions or billions of records. As long as a policy such as the maximum physical capacity and resources is configured when the table is created, data is automatically balanced across multiple physical devices in the cluster, and the application reads and writes it just as it would a standard table.
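The text does not say how a product balances data automatically, so the following Python sketch is only one plausible picture, assuming a table is divided into fixed partitions that the cluster places on nodes. The functions `assign_partitions` and `rebalance` and the node names are hypothetical.

```python
def assign_partitions(partition_count: int, nodes: list) -> dict:
    """Initial placement: spread partitions over nodes round-robin."""
    if not nodes:
        raise ValueError("at least one node is required")
    return {p: nodes[p % len(nodes)] for p in range(partition_count)}


def rebalance(placement: dict, nodes: list) -> dict:
    """Return a new placement whose per-node counts differ by at most one.

    Only partitions on overloaded nodes move, so most data stays put.
    Assumes every node already named in `placement` is still in `nodes`.
    """
    result = dict(placement)
    owned = {node: [] for node in nodes}
    for partition in sorted(result):
        owned[result[partition]].append(partition)
    while True:
        busiest = max(nodes, key=lambda n: len(owned[n]))
        idlest = min(nodes, key=lambda n: len(owned[n]))
        if len(owned[busiest]) - len(owned[idlest]) <= 1:
            return result
        # Move one partition from the busiest node to the idlest node.
        partition = owned[busiest].pop()
        owned[idlest].append(partition)
        result[partition] = idlest
```

For example, with 12 partitions on three nodes, calling `rebalance` after a fourth node joins moves three partitions and leaves each node with three. The application keeps addressing the logical table throughout, which is the transparency the paragraph above describes.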

  1. Native distributed database technology trends
    To support the microservice-based IT architecture of the future, a newly introduced distributed transactional database needs to be evaluated along two dimensions: compatibility with traditional technology, and forward-looking support for new technology.

The completeness of SQL support and ACID support are the two key metrics for assessing whether a new distributed database is compatible with traditional database technology.

• ACID support
From a security point of view, whether the technology is new or conventional, not losing data is the foundation of every database. In the distributed database field, some products designed for Internet workloads target partition tolerance plus availability (in CAP terms) and cannot guarantee data correctness in terms of consistency, so they are hard to apply widely in financial business. A distributed database that banks adopt must therefore first guarantee the safety and consistency of data. Support for distributed transactions, distributed locks, and the four isolation levels are all key evaluation points.

• SQL integrity support
SQL integrity, meaning the completeness of SQL support, refers to how developer-friendly the new distributed database is compared with traditional relational databases. The more mature a distributed database is, the more compatible its SQL syntax is with traditional relational databases, and the more transparent its data partitioning is to the application. Today, most distributed databases claim to support MySQL syntax, and many mainstream new applications support MySQL as a default database option. The degree of support for MySQL syntax and protocol has therefore become crucial in judging the SQL integrity of a distributed database.
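The ACID item above names distributed transactions as a key evaluation point but does not say which protocol a product should use. As one illustration, the Python sketch below uses two-phase commit, a common protocol chosen here as an assumption. `Participant` and `transfer` are hypothetical names, and the sketch omits logging, timeouts, and crash recovery, all of which a real implementation needs.

```python
class Participant:
    """One database node holding part of the data (here, a single balance)."""

    def __init__(self, name: str, balance: int) -> None:
        self.name = name
        self.balance = balance
        self._pending = None  # change staged during phase 1

    def prepare(self, delta: int) -> bool:
        """Phase 1: vote yes only if the change keeps the balance non-negative."""
        if self._pending is not None or self.balance + delta < 0:
            return False
        self._pending = delta
        return True

    def commit(self) -> None:
        """Phase 2: apply the staged change."""
        self.balance += self._pending
        self._pending = None

    def rollback(self) -> None:
        """Phase 2 (abort): discard the staged change."""
        self._pending = None


def transfer(source: Participant, target: Participant, amount: int) -> bool:
    """Move `amount` between two participants: both change or neither does."""
    steps = [(source, -amount), (target, amount)]
    prepared = []
    for participant, delta in steps:
        if participant.prepare(delta):
            prepared.append(participant)
        else:
            # Any "no" vote aborts the whole transaction.
            for earlier in prepared:
                earlier.rollback()
            return False
    for participant in prepared:
        participant.commit()
    return True
```

If either node votes no in the prepare phase, nothing is applied anywhere, which is the atomicity property that a bank would expect a distributed database to preserve across physical nodes.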

The forward-looking dimension asks whether a distributed database is consistent with future development methods and IT architecture.

• Distributed and elastic scalability
As a data service resource pool, a distributed database must be able to expand elastically in order to serve the growing variety and number of upper-layer microservices. At the same time, whether a microservice's data is stored on one physical device or on several must be completely transparent to the application code.

• Multi-mode engine
Because it serves upper-layer microservices from different developers, with different business scenarios and data types, a distributed database necessarily needs to support multiple SQL protocols and computing engines. From the storage engine point of view, structured and semi-structured data may be used together in the same application. A new generation of distributed database therefore needs to support a multi-model (multi-mode) engine, from the access interface down to the storage structure.

• HTAP (Hybrid Transactional/Analytical Processing)
HTAP means hybrid transactional and analytical processing. In traditional banking IT architecture, online transaction systems and statistical analysis systems usually use different technologies and physical devices, and ETL jobs periodically migrate online transaction data to the analysis system. In a data service resource pool, by contrast, the same data may be accessed by different types of microservices. When online transactions and audit-type workloads run against the same data at the same time, their requests must execute in completely isolated physical environments, so that transactional and analytical workloads do not interfere with each other.
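One way to picture the physical isolation that HTAP requires is a router that sends each request to a replica group reserved for its workload type. The Python sketch below assumes the same data is replicated to both groups; `Workload`, `REPLICA_GROUPS`, `WorkloadRouter`, and the node names are hypothetical.

```python
from enum import Enum


class Workload(Enum):
    TRANSACTIONAL = "transactional"
    ANALYTICAL = "analytical"


# Hypothetical replica groups holding the same data on separate hardware.
REPLICA_GROUPS = {
    Workload.TRANSACTIONAL: ["oltp-node-1", "oltp-node-2"],
    Workload.ANALYTICAL: ["olap-node-1"],
}


class WorkloadRouter:
    """Pick a node for each request so the two workload types never share hardware."""

    def __init__(self, groups: dict) -> None:
        self._groups = groups
        self._next = {workload: 0 for workload in groups}

    def pick_node(self, workload: Workload) -> str:
        nodes = self._groups[workload]
        index = self._next[workload]
        # Round-robin within the group reserved for this workload type.
        self._next[workload] = (index + 1) % len(nodes)
        return nodes[index]
```

A long-running audit query routed as `Workload.ANALYTICAL` can then never land on a node that serves online transactions, without an ETL copy into a separate system.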

Overall, distributed database technology trends should be judged along two dimensions: compatibility with traditional technology and forward-looking support for new technology. ACID data safety and SQL integrity are the important indicators of compatibility, while elastic scalability, a multi-mode engine, and HTAP are the important measures of forward-looking capability.

  1. Distributed database application scenarios in finance
    In the financial industry today, distributed databases are applied in five major areas: data warehousing, big data platforms, content management platforms, data middle platforms, and online transactions. For online distributed databases, the industry currently focuses on three business scenarios.

 Online transaction systems
Online transaction systems are an important part of a bank's production environment. Some domestic banks at the forefront of exploring distributed technology have begun to gradually move core business from mainframe and minicomputer architectures based on IBM and Oracle to distributed environments, so that clusters can expand elastically and meet bursts of business growth at any time. Typical systems that use distributed databases include online-loan core systems, channel integration, and credit card reward points.

 Data middle platform
Today, many companies are building IT architectures with a heavy middle platform and a light front end. As the key data integration layer of enterprise IT, the data middle platform combines flexible front-end business needs with a relatively fixed back-end data model, playing the role of "aggregating data and connecting front and back." For example, one bank started with the goal of slimming down its production systems, beginning with historical transaction record queries and printing, and gradually extended to quasi-real-time data services such as user profiles and asset views.

 Content management platform
Traditional content management platforms were built mainly for after-the-fact supervision and audit, and front-end business basically did not use unstructured data directly. With the spread of self-service devices and mobile applications, more and more customer-facing processes need to handle unstructured data directly. Many banks have therefore moved the content management platform from the back end to the front end. Large numbers of customer-facing applications connect directly to it, and many account-opening, credit, and even self-service device processes depend heavily on its real-time interaction capability. This turns the content management system from an internal audit back end into an external online service.

As can be seen, distributed databases are already widely used in banks for offline analysis scenarios. For online business, however, MPP data warehouses and big data platforms cannot meet the requirements for reliability, concurrency, and response speed.

Summary
Today, some banks that have studied distributed technology in depth have begun piloting distributed databases. The core value of a distributed database is not only to spread data that no longer fits in a traditional database across multiple physical storage devices. More importantly, for the microservice application development model of the future, it offers a data service platform (DBPaaS) that expands elastically, exposes multi-mode interfaces, and serves different developers, SLA levels, high-availability and disaster recovery characteristics, and types of data traffic.

Technical staff today often ask: will distributed databases replace Oracle in the future? The answer is fairly intuitive. Distributed application frameworks and PC-server clusters are the direction of future IT development. As microservice architecture replaces chimney-style (siloed) architecture, the database needs to move from the traditional "point" to a platform "surface." Every application has its own iteration cycle, and many applications already offer MySQL and other open-source databases as default supported options, so the scenarios that require Oracle will become fewer and fewer.

Therefore, distributed databases will in the future replace traditional single-point databases such as Oracle. Bank technology departments should begin forward-looking ××× of distributed database technology as soon as possible, to adapt to the coming transformation of bank IT infrastructure from the chimney model to the microservice model.

Origin blog.51cto.com/13722387/2414072