Financial Case Study: A Unified Query Solution Makes Data Governance and Analysis More Efficient and Secure

As enterprise data volumes grow and business lines diversify, real-time, flexible, multi-dimensional queries over massive data sets have become a common business requirement. At the same time, running several database systems side by side has become the norm, which complicates data management and makes the data harder to use. In increasingly complex data environments with strict security requirements, enterprises must deal with the coexistence of multiple database systems, serious data silos, confusing permission management, and difficult data query and extraction. Requirements for data security control also keep rising: different roles and departments need access to different data, so controlling database permissions at a fine granularity and preventing data leaks and misoperations has become a key challenge.

Project background

Routine queries can be optimized when the system is implemented, using techniques such as indexing and partitioning to improve query efficiency. Ad hoc queries are different. They can be generated in many ways; a common one is to map the dimension (DIM) tables and fact tables in the data warehouse to a semantic layer, where analysts select tables, define the associations between them, and finally generate SQL statements. Because analysts produce these queries on the fly while working, the system cannot optimize them in advance. Ad hoc queries run in the enterprise data warehouse (EDW), and the more they are used, the higher the demands on the data warehouse.

Banks and consumer finance companies are typical examples. They need both business queries and unified ad hoc queries, and their departments span business areas such as product research and development, operations and promotion, post-loan management, and legal compliance. Because several types of database instances coexist, user system integration, data permission management, operation security auditing, and SQL query syntax all become very complicated. These organizations urgently need a platform that can connect to multiple databases and provide a single outlet for data.

Case scenario

As business volume grew, the company organized its business into three main departments to improve processing efficiency and risk management: the risk management department, the technology department, and the financial market department. Risk management staff are proficient in SQL, but they have to adjust their SQL syntax for different businesses, and syntax incompatibilities are frequent. Financial market staff rarely use SQL and need a visual drag-and-drop way to pull detailed data. Developers in the technology department often delete or modify core data by accident when operating on databases, and sometimes write risky SQL that puts pressure on the business database. DDL permissions and SQL code checks need to be configured per user to keep the data safe.

The department administrator configures user account information and the JDBC connection information for each data source in the platform, and sets data permissions and SQL code check rules according to business needs. Once this basic configuration is complete, risk management staff can retrieve data with one common SQL syntax, because the platform shields them from the differences between downstream computing engines. Financial market staff first configure commonly used data models for their business, then retrieve data visually and organize it in Excel. Developers can execute DML statements on a database only within their assigned permissions. SQL that exceeds those permissions must go through approval and can be executed only after the approval passes.
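The permission and approval flow described above can be illustrated with a minimal Python sketch. It is not the platform's implementation: the names `classify`, `is_dangerous`, `decide`, and `User` are hypothetical, the statement classifier looks only at the first keyword, and the single code check rule (a write without a `WHERE` clause) is just one plausible example of a "dangerous SQL" rule.

```python
import re
from dataclasses import dataclass, field

# Hypothetical statement categories; a real platform would use a SQL parser.
DDL = {"CREATE", "ALTER", "DROP", "TRUNCATE"}
DML = {"INSERT", "UPDATE", "DELETE", "MERGE"}
DQL = {"SELECT"}


def classify(sql: str) -> str:
    """Classify a statement by its first keyword (naive: ignores comments and CTEs)."""
    match = re.match(r"\s*([A-Za-z]+)", sql)
    keyword = match.group(1).upper() if match else ""
    if keyword in DDL:
        return "DDL"
    if keyword in DML:
        return "DML"
    if keyword in DQL:
        return "DQL"
    return "OTHER"


@dataclass
class User:
    name: str
    allowed: set = field(default_factory=set)  # e.g. {"DQL", "DML"}


def is_dangerous(sql: str) -> bool:
    """Example code check rule (assumed): UPDATE or DELETE without a WHERE clause."""
    is_write = re.match(r"\s*(update|delete)\b", sql, re.IGNORECASE)
    has_where = re.search(r"\bwhere\b", sql, re.IGNORECASE)
    return bool(is_write) and not has_where


def decide(user: User, sql: str, approved: bool = False) -> str:
    """Return 'execute', 'needs_approval', or 'reject' for one statement."""
    if is_dangerous(sql):
        return "reject"
    if classify(sql) in user.allowed or approved:
        return "execute"
    return "needs_approval"
```

With a developer created as `User("dev", {"DQL", "DML"})`, a `DROP TABLE` statement is routed to approval, while an unfiltered `DELETE` is rejected outright regardless of approval.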

Pain point analysis

1. Account security risks exist

Database accounts are often shared by multiple operators, and operations and maintenance staff may hold more permissions than their work actually requires. This makes unauthorized operations easy and threatens data security.

2. SQL syntax differs across databases

Each type of database has its own SQL syntax and rules. Developers have to adapt every query to the target database, which increases the development workload and can introduce errors and performance problems.

3. Problems encountered in multi-dimensional business queries

Drill-down, roll-up, slicing, dicing, and pivoting (row-column transformation) are common in business queries. Queries for a single PV figure or for transaction flow in a retail scenario can read hundreds of gigabytes of data, and queries frequently fail with out-of-memory (OOM) errors. Scheduling and data synchronization jobs run at night, but during working hours product managers and analysts still end up killing task scripts.

4. Insufficient audit trails and difficult tracing

Traditional tools do not record who did what. When an abnormal data operation occurs, its source cannot be traced and responsibility cannot be assigned, which can lead to further damage to the data and threatens the long-term stable operation of the platform.

5. Business personnel use SQL less frequently

Many reporting staff and analysts are not familiar with SQL. Adding a new report or changing the fields of a fixed report requires data exploration, and in the past IT staff had to write the SQL for each extraction. The result was long data query cycles and slow support for business decisions.

6. Business data is scattered in multiple systems

In complex fusion analysis scenarios, it is difficult for data analysts to import data stored in local Excel files into the system and associate it with business database data, making it impossible to analyze business data in a timely and flexible manner.

Solution

1. Unified query engine

The engine adapts to the syntax of multiple databases and converts query statements automatically. It provides an IDE-style editor with syntax highlighting, keyword suggestions, and formatting, so users do not need to care about syntax differences between the underlying databases and can complete the whole workflow of data connection, data processing, and data analysis in one place.
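To make the idea of syntax conversion concrete, here is a deliberately small Python sketch that rewrites one construct, a trailing `LIMIT n`, into the row-limiting syntax of a few target databases. The function name `translate` and the dialect labels are assumptions made for illustration. A real engine would rely on a full SQL parser rather than a regular expression, and the Oracle form shown requires Oracle 12c or later.

```python
import re

# Matches a trailing "LIMIT n" (optionally followed by a semicolon).
_LIMIT = re.compile(r"\s+LIMIT\s+(\d+)\s*;?\s*$", re.IGNORECASE)


def translate(sql: str, dialect: str) -> str:
    """Rewrite a trailing LIMIT clause for the target dialect (illustrative only)."""
    match = _LIMIT.search(sql)
    if match is None:
        return sql  # nothing this sketch knows how to rewrite
    n = match.group(1)
    body = sql[: match.start()]
    if dialect in ("mysql", "postgresql"):
        return f"{body} LIMIT {n}"
    if dialect == "oracle":
        # Row limiting clause, available from Oracle 12c.
        return f"{body} FETCH FIRST {n} ROWS ONLY"
    if dialect == "sqlserver":
        # SQL Server places TOP directly after SELECT.
        return re.sub(r"^\s*SELECT\b", f"SELECT TOP ({n})", body,
                      count=1, flags=re.IGNORECASE)
    raise ValueError(f"unsupported dialect: {dialect}")


query = "SELECT id, amount FROM loans ORDER BY amount DESC LIMIT 10"
for name in ("mysql", "oracle", "sqlserver"):
    print(name, "->", translate(query, name))
```

The user writes the query once in the common form, and the same text is rendered differently per data source. Functions, data types, and quoting rules would need the same treatment in practice.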

2. Unified identity authentication management

Before anyone uses the platform, the administrator maintains each user's account information and permissions. Users can reach a database only after logging in to the unified query platform, and they never see the database's real account or password.

3. AI-assisted data query

The platform translates a data request expressed in natural language into the corresponding SQL. Supported capabilities include SQL generation, rewriting, and error correction, which help users complete complex data extraction and analysis with little effort.

4. Lower the threshold for data analysis

Business staff can use drag-and-drop operations for data extraction, model configuration, filter configuration, and visual report configuration. They can select data sources, define query conditions, and combine data intuitively, without a deep understanding of the underlying database structure or SQL syntax, which improves data-driven decision support.

5. Database security permission management and control

The platform supports security measures such as data masking (desensitization) and row-level permissions. Permissions are set by role and responsibility to protect the privacy and security of data in the business databases, and user behavior such as permission changes and dangerous SQL is audited in real time to keep data use compliant.
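As a rough illustration of how masking and row-level permissions combine, the Python sketch below filters a result set by role and masks sensitive columns before returning it. Everything in it is assumed for the example: the role names, the `region` and `phone` columns, the masking format, and the `ROLE_POLICY` structure. The article does not describe how the platform stores or evaluates its policies.

```python
def mask_phone(value: str) -> str:
    """Keep the first 3 and last 4 characters and mask the rest."""
    if len(value) <= 7:
        return "*" * len(value)
    return value[:3] + "*" * (len(value) - 7) + value[-4:]


# Hypothetical masking functions, keyed by column name.
MASKS = {"phone": mask_phone}

# Hypothetical per-role policy: which rows are visible, which columns are masked.
ROLE_POLICY = {
    "risk": {"row_filter": lambda row: True, "masked": set()},
    "market": {"row_filter": lambda row: row["region"] == "east",
               "masked": {"phone"}},
}


def apply_policy(rows, role):
    """Return only the rows the role may see, with masked columns rewritten."""
    policy = ROLE_POLICY.get(role)
    if policy is None:
        return []  # default deny for unknown roles
    visible = []
    for row in rows:
        if policy["row_filter"](row):
            visible.append({
                col: MASKS[col](val) if col in policy["masked"] else val
                for col, val in row.items()
            })
    return visible


rows = [
    {"name": "A", "phone": "13812345678", "region": "east"},
    {"name": "B", "phone": "13987654321", "region": "west"},
]
print(apply_policy(rows, "market"))
```

Applying the policy in the query layer, rather than in each database, means the same rules hold regardless of which underlying engine serves the data.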


Project benefits

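In a project delivered at a consumer finance company, the platform enabled staff in different departments to get the information they need quickly and significantly improved business processing efficiency. It supports data analysis in scenarios such as customer profile queries, transaction record retrieval, and risk information assessment. It also ensures the security and confidentiality of financial information, providing a solid foundation for the stable operation of the platform going forward.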

1. The data demand cycle is greatly shortened

The traditional data request process required approval and coordination across multiple departments, followed by scheduling and manual extraction by IT staff. With SQL and self-service retrieval on the unified query platform, data analysts and business staff can pull data on demand themselves. This greatly reduces communication, development, and testing costs and shortens the data retrieval cycle from the original 3-5 days to minutes.

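2. The threshold for data-driven decision-making is significantly lowered

Features such as self-service data retrieval and saved SQL favorites let frontline business and operations management staff conveniently access multi-source, heterogeneous data assets, promoting data-driven decision-making across the whole organization. Since the platform went live, business-side participation in data analysis has increased significantly: logins, session duration, the number of saved SQL statements, query tasks, and data exports have all far exceeded the project's planned expectations.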

3. Unified traceability of data risk operations

The platform's audit log covers data operations on all connected databases and records 100% of operations. This strengthens data security management and compliance, and it shortens the time needed to locate and investigate risky data operations from hours to minutes, improving audit response efficiency by more than 80%.
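The kind of record an audit log needs in order to support this sort of tracing can be sketched in a few lines of Python. The field names and the `AuditLog` class are hypothetical and the store is an in-memory list; a production system would write to durable, append-only storage and capture far more context.

```python
import json
from dataclasses import dataclass, asdict
from datetime import datetime, timezone


@dataclass(frozen=True)
class AuditRecord:
    user: str        # platform account, not the shared database account
    datasource: str  # which connected database the statement ran on
    sql: str
    status: str      # e.g. "ok", "rejected", "needs_approval"
    at: str          # UTC timestamp in ISO 8601 format


class AuditLog:
    """In-memory, append-only audit trail (illustrative only)."""

    def __init__(self):
        self._records = []

    def record(self, user, datasource, sql, status):
        rec = AuditRecord(user, datasource, sql, status,
                          datetime.now(timezone.utc).isoformat())
        self._records.append(rec)
        return rec

    def find(self, *, user=None, keyword=None):
        """Filter records by user and/or a case-insensitive SQL keyword."""
        return [
            r for r in self._records
            if (user is None or r.user == user)
            and (keyword is None or keyword.lower() in r.sql.lower())
        ]

    def to_jsonl(self):
        """Serialize all records as JSON Lines for export or archiving."""
        return "\n".join(json.dumps(asdict(r)) for r in self._records)
```

Because every statement is tied to a named platform user and a data source, a question such as "who ran a DELETE against this database" becomes a single `find(keyword="delete")` lookup instead of a manual investigation.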

4. Data access rights are visible and manageable

The unified query platform uses row- and column-level permission control and data masking to make data access permissions visible and finely manageable, strengthening data security management and control. Effective coverage of table-level permission configuration has risen to more than 95%, effectively preventing unauthorized data access. Customer information, channel data, and similar fields are identified and masked automatically, keeping the enterprise secure and compliant when sharing and using data.


"Industry Indicator System White Paper" download address: https://www.dtstack.com/resources/1057?src=szsm

"DTStack Product White Paper" download address: https://www.dtstack.com/resources/1004?src=szsm

"Data Governance Industry Practice White Paper" download address: https://www.dtstack.com/resources/1001?src=szsm

To learn more about big data products, industry solutions, and customer cases, visit the Kangaroo Cloud official website: https://www.dtstack.com/?src=szkyzg


Origin my.oschina.net/u/3869098/blog/11059248