How does an Oracle data warehouse perform in batch runs under a hyper-converged architecture?

Preface

Hyper-converged architecture has been recognized by more and more customers for its advanced distributed architecture, software-defined model and converged deployment, and its application scenarios now cover most production businesses. However, in some important business scenarios, users still have doubts about whether hyper-converged architecture is applicable. To address this, we have compiled a series of applicability verifications of hyper-converged solutions, carried out jointly by SmartX's front-line technical team and industry users. We hope the data and conclusions provide a useful reference for industry customers' IT infrastructure transformation.

This time, we describe how SmartX hyper-convergence was verified for supporting and optimizing the batch performance of an Oracle data warehouse.



Introduction to data warehouse

A data warehouse is a data management system for analysis and reporting. Typically, data flows regularly from transactional systems, relational databases, and other sources into the data warehouse, which centralizes and integrates large amounts of data from multiple sources. Business analysts, data engineers, data scientists, and decision makers across the enterprise access the data through business intelligence (BI) tools, SQL clients, and other analytical applications to gain business insights and improve decision-making.

Background of the project

A financial customer uses an Oracle data warehouse in its production environment to process data for the Crystal Reports system. As business and data volumes have grown, the batch-run performance of the Oracle data warehouse has steadily deteriorated. Batch runs currently start in the early morning and end at 6 am. Because data volumes will keep growing, batch processing time may increase further and affect normal daytime business. The customer therefore hopes to optimize the infrastructure and shorten the batch run time of the data warehouse.

In addition, the production data warehouse is deployed on IBM minicomputers. Although it runs relatively stably, the minicomputers have been in service for a long time. Considering equipment operating risk and operation and maintenance complexity, the customer hopes to verify the feasibility of migrating from minicomputers to x86 in preparation for future business growth.

Based on the above two reasons, the customer hopes to use SmartX hyper-convergence to test the batch performance of Oracle data warehouse.

Test objectives and methods

This test verifies the batch performance of an Oracle data warehouse under a hyper-converged architecture. With exactly the same amount of data, the batch run time of the existing production solution was compared with that of the Oracle data warehouse running on SmartX hyper-convergence. A total of 3 rounds of batch-run tests were conducted; the shorter the time, the better the performance.
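The comparison method can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the tooling used in the project: the function names are hypothetical, and the per-round durations are placeholder values, not the measured results.

```python
from statistics import mean


def average_minutes(rounds):
    """Return the mean batch duration (in minutes) over all test rounds."""
    if not rounds:
        raise ValueError("at least one test round is required")
    return mean(rounds)


def reduction_percent(baseline_minutes, candidate_minutes):
    """Percentage by which the candidate shortens the baseline run time."""
    if baseline_minutes <= 0:
        raise ValueError("baseline duration must be positive")
    return (baseline_minutes - candidate_minutes) / baseline_minutes * 100


# Hypothetical placeholder durations for 3 rounds, NOT the measured results.
production_rounds = [300, 310, 290]
hyperconverged_rounds = [200, 190, 210]

baseline = average_minutes(production_rounds)
candidate = average_minutes(hyperconverged_rounds)
print(f"Batch time reduction: {reduction_percent(baseline, candidate):.1f}%")
```

Averaging over the rounds smooths out run-to-run variation, so a single unusually fast or slow run does not decide the comparison.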

Production Environment

Data warehouse batch-run data flow

Data is extracted from the DB2 database into the Oracle data warehouse through Informatica (an ETL tool), and batches are then run in the Oracle data warehouse to complete data processing.

Production environment data warehouse infrastructure

Data warehouse system component resource configuration

The production environment data warehouse is deployed on the IBM AIX system.

Test Environment

Hardware topology

Hyperconverged server hardware configuration

Data warehouse system component resource configuration

The verification environment data warehouse is deployed on the RHEL4.8 operating system.

Test Data

Based on the customer's test goals and test scenarios, the batch test data for the Oracle data warehouse on the hyper-converged architecture and in the production environment are shown in the figure below:

Test conclusions and project highlights

  • After multiple rounds of testing and verification, the SmartX hyper-converged architecture delivered significantly better performance when running Oracle data warehouse batches than the original production architecture of minicomputers + centralized storage. Batch run time was shortened by 36%, a result recognized by the customer.
  • The tests verified the feasibility of moving the Oracle data warehouse from minicomputers to x86 servers and gave the customer a quantitative basis for subsequently replacing minicomputers + centralized storage with SmartX hyper-convergence.
  • Improved resource utilization: Oracle data warehouse batch operations usually run at night, so the hyper-converged architecture can support other applications, databases and other services during the day. Infrastructure hardware resources can thus be fully reused to maximize resource utilization.
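To make the 36% figure concrete, the shortened run time is simply the original run time multiplied by (1 − 0.36). The sketch below applies that arithmetic; the function name is hypothetical, and the 360-minute duration is an assumed example, since the article only states that batch runs end at 6 am and does not give the start time.

```python
def shortened_duration(original_minutes, reduction_pct):
    """Return the run time left after cutting it by reduction_pct percent."""
    if not 0 <= reduction_pct <= 100:
        raise ValueError("reduction_pct must be between 0 and 100")
    return original_minutes * (1 - reduction_pct / 100)


# Assumed example duration; the real batch duration is not given in the article.
original = 360
shortened = shortened_duration(original, 36)
saved = original - shortened
print(f"{original} min -> {shortened:.1f} min (saves {saved:.1f} min)")
```

Whatever the real duration is, the time saved scales proportionally, which is what gives the batch window headroom as data volume grows.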

Extended thinking

The tests above verified only the feasibility and advantages of the hyper-converged architecture in supporting the batch-run performance of the Oracle data warehouse system. Beyond that, hyper-converged solutions also bring the following value to enterprises' key business operations:

  • Improved reliability and availability: Reliability is the foremost concern for critical business systems. SmartX provides a range of hyper-converged disaster recovery and backup solutions that improve system reliability and availability while raising efficiency and reducing costs.
  • Simplified operation and maintenance: The hyper-converged architecture is based on a software-defined model and general-purpose servers. Compared with the minicomputer solution, it effectively reduces the complexity of system operation and maintenance as well as investment costs.
  • Resource integration: For the vast majority of financial customers, hyper-converged compute virtualization and distributed storage can integrate the computing and storage resources of various IT systems, further reducing overall IT complexity and investment costs.
  • Elastic expansion: The SmartX hyper-converged architecture offers simple, easy-to-operate horizontal scaling. It can expand capacity and computing resources while achieving near-linear performance improvements.

Origin blog.csdn.net/weixin_43696211/article/details/130557593