KaiwuDB CTO Wei Kewei: Connecting everything, exploring a new generation of data base

On October 28-29, the 8th China Open Source Annual Conference (COSCon'23) was held at Jingronghui in the High-tech Zone of Chengdu, Sichuan. An annual event for the open source community, this year's conference took the theme "Open Source: Continuous Streams, Mountains and Seas" and attracted many industry scholars, technical experts, and open source enthusiasts. KaiwuDB CTO Wei Kewei was invited to deliver a keynote speech, "Internet of Everything, Exploring a New Generation of Data Base", at the main forum.

AI4DB—obtain “knowledge” from data and enhance data vitality

The era of the Internet of Everything has opened up diverse possibilities for data applications, but it has also placed more demands on data management. In IoT scenarios, the acquisition, exchange and processing of data are the core. As the amount of data increases, the marginal value of data decreases. To obtain value from this industrial data, the combination of AI and IoT is particularly important.

In the IoT field, we often add an "A" in front to form the familiar AIoT. The reason is that, without the support of AI, the development of the entire Internet of Everything industry may be hindered. More importantly, the costs brought by the Internet of Everything era and the benefits it generates cannot effectively support the healthy development of enterprises. Data itself cannot bring value to the enterprise, but when "knowledge" is obtained from the data through continuous learning, the data becomes vital.

Returning to a user-oriented view, the key behind technology lies in whether we can provide users with effective solutions. In practice, many uncontrollable technical problems hinder users from adopting it, and the expensive initial investment in infrastructure does not seem to really help users reduce costs.

Therefore, KaiwuDB advocates "reducing complexity into simplicity" based on technology, products and industry needs, and attaches great importance to cultivating "native AI" capabilities, including intelligent life cycle management, downsampling, intelligent precomputation and other functions. These help enterprises build full-link capabilities covering data acquisition, data exchange, data processing and analysis, and provide end users with solutions that mine more data value and meet their actual needs.

  • Intelligent life cycle management

The goal is to match storage cost to data value. Take time series data as an example: storage costs keep increasing over time, so how the data life cycle is managed is important.

Compression is a common method here, but it comes with a performance cost, so the needs of new and old data have to be weighed against each other. AI technology is used to compress long-term data and reduce storage space, while recent data can be stored in a larger space, reasonably balancing storage cost and data value.
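The trade-off between recent and long-term data can be illustrated with a minimal Python sketch. It is not KaiwuDB's implementation: the fixed `hot_window` age threshold stands in for the AI-driven decision described in the talk, and the function names and the JSON-plus-zlib storage format are assumptions made only for illustration.

```python
import json
import zlib

def tier_points(points, now, hot_window):
    """Split (timestamp, value) points into an uncompressed hot list
    and a single compressed blob holding the older (cold) points."""
    hot = [p for p in points if now - p[0] <= hot_window]
    cold = [p for p in points if now - p[0] > hot_window]
    # Assumption: cold data is serialized as JSON and compressed with zlib.
    cold_blob = zlib.compress(json.dumps(cold).encode("utf-8"))
    return hot, cold_blob

def read_cold(cold_blob):
    """Decompress a cold blob back into a list of (timestamp, value) tuples."""
    decoded = json.loads(zlib.decompress(cold_blob).decode("utf-8"))
    return [tuple(p) for p in decoded]

# Hypothetical sensor readings: one point per time unit.
points = [(t, 20.0 + (t % 5)) for t in range(1000)]
hot, cold_blob = tier_points(points, now=999, hot_window=99)
```

Reading cold data pays the decompression cost, which is the performance consumption mentioned above; recent data stays directly accessible.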

  • Downsampling

Downsampling means reducing the frequency of data collection. When facing massive amounts of data, high-frequency collection can be reduced to a lower rate to cut storage costs. The selection method can be random or otherwise; the core goal is to reduce data storage and processing overhead while retaining representative trends and important information as much as possible.

In data management, AI technology can be brought in to perform feature extraction and information compression so that the valuable parts of the data are retained. For example, AI can analyze data behavior patterns based on application requirements and help users choose better downsampling strategies that keep representative data.
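As a rough illustration of downsampling, the Python sketch below collapses high-frequency points into one summary row per time bucket. The choice of mean, minimum and maximum as the retained statistics is an assumption made for the example; the talk does not specify which strategy KaiwuDB uses, and an AI-selected strategy could keep different features.

```python
from statistics import mean

def downsample(points, bucket_size):
    """Reduce (timestamp, value) points to one summary row per time bucket."""
    buckets = {}
    for ts, value in points:
        # Align each timestamp to the start of its bucket.
        start = ts // bucket_size * bucket_size
        buckets.setdefault(start, []).append(value)
    return [
        {"start": start, "mean": mean(vals), "min": min(vals), "max": max(vals)}
        for start, vals in sorted(buckets.items())
    ]

# Hypothetical input: one reading per second for a minute.
raw = [(t, float(t % 10)) for t in range(60)]
summary = downsample(raw, bucket_size=10)
```

Keeping the minimum and maximum alongside the mean preserves spikes that a plain average would hide, at the cost of two extra values per bucket.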

  • Intelligent precomputation

Intelligent precomputation means analyzing data behavior and query patterns, preparing data in advance and optimizing aggregation operations, thereby improving query performance. In time series scenarios, for example, data is aggregated and analyzed along the time dimension. One of the key technologies behind this is intelligent precomputation: an AI "brain" predicts what content users will run aggregate analysis on and calculates the results, the "knowledge", in advance.

This capability returns results quickly and greatly improves performance. In terms of life cycle management, AI can also be used to predict usage: if it predicts that users will no longer frequently access a certain type of data, that data can be moved automatically to cold storage to reduce resource usage.
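The idea of predicting which aggregates will be requested and computing them ahead of time can be sketched as follows. This is only an illustration: a simple query-frequency counter stands in for the AI prediction described above, and the class and method names are hypothetical.

```python
from collections import Counter

class PrecomputeCache:
    """Precompute averages for the series that are queried most often."""

    def __init__(self, data, top_n=2):
        self.data = data              # {series_name: [values]}
        self.top_n = top_n
        self.query_log = Counter()    # how often each series was queried
        self.cache = {}               # precomputed averages

    def query_avg(self, series):
        """Return (average, source), where source says how it was obtained."""
        self.query_log[series] += 1
        if series in self.cache:
            return self.cache[series], "precomputed"
        values = self.data[series]
        return sum(values) / len(values), "computed"

    def refresh(self):
        """Rebuild the cache for the top_n most frequently queried series."""
        self.cache = {}
        for series, _count in self.query_log.most_common(self.top_n):
            values = self.data[series]
            self.cache[series] = sum(values) / len(values)

store = PrecomputeCache({"temp": [20.0, 22.0], "rpm": [900.0, 1100.0]}, top_n=1)
for _ in range(3):
    store.query_avg("temp")
store.query_avg("rpm")
store.refresh()                       # "temp" is the hottest series
result = store.query_avg("temp")
```

The same query log could also drive the cold-storage decision: series that stop appearing in it are candidates for moving to cheaper storage.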

DB4AI—data is more active and users are less burdened

There are some common algorithms in the IoT field, such as time series prediction and image recognition, and these are topics we pay close attention to in AIoT. At the same time, we also need to solve the separation between the two major ecosystems of databases and AI: how to make models generated from the data in a database usable inside that database, while avoiding a lot of extra burden on data engineers and data scientists.

To this end, KaiwuDB provides native predictive analytics capabilities. We hope to provide a platform that tightly integrates databases and algorithms, for example by supporting model training, model inference and other capabilities through function calls inside the database. From the operational point of view of database developers and administrators, this amounts to using one more basic capability of the database and creates no additional burden. We can also open interfaces to data scientists so that they can put their trained models into the database. In this way, the various people who work with the database can be closely connected.
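To make "model inference through a function call in the database" concrete, here is a small sketch that uses SQLite as a stand-in, since the talk does not show KaiwuDB's own interface. The `predict` function, its coefficients and the `readings` table are all hypothetical; only the standard `sqlite3` module is used.

```python
import sqlite3

# Hypothetical "model": a linear fit assumed to have been trained elsewhere.
SLOPE, INTERCEPT = 2.0, 1.0

def predict(x):
    """Apply the pre-trained linear model to one input value."""
    return SLOPE * x + INTERCEPT

conn = sqlite3.connect(":memory:")
# Register the model as a SQL function so queries can call it directly.
conn.create_function("predict", 1, predict)

conn.execute("CREATE TABLE readings (ts INTEGER, value REAL)")
conn.executemany("INSERT INTO readings VALUES (?, ?)", [(1, 1.0), (2, 2.5)])

# Inference happens inside the query; the data never leaves the database.
rows = conn.execute(
    "SELECT ts, predict(value) FROM readings ORDER BY ts"
).fetchall()
```

From the point of view of someone writing SQL, the model is just another function, which is the low-burden experience the paragraph above describes.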

In addition, we offer a model life cycle management engine: ModelOps in Database. An AI model is itself time-sensitive; a model trained on last year's data may no longer be applicable this year. Who can discover this problem first? It should be the database, because when it finds that the data distribution has changed significantly, it can infer that the model's performance may also be at risk. This is also a key idea of KaiwuDB in DB4AI.
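A very simple way to flag a change in data distribution is to compare recent data against the data the model was trained on. The sketch below measures how far the recent mean has moved, in units of the training standard deviation. Both the metric and the threshold of 3.0 are assumptions chosen for illustration; the talk does not say how KaiwuDB detects drift, and a production system would more likely use a proper statistical test.

```python
from statistics import mean, stdev

def drift_score(train_values, recent_values):
    """Shift of the recent mean from the training mean,
    expressed in training standard deviations."""
    sigma = stdev(train_values)
    shift = abs(mean(recent_values) - mean(train_values))
    if sigma == 0:
        return 0.0 if shift == 0 else float("inf")
    return shift / sigma

def model_at_risk(train_values, recent_values, threshold=3.0):
    """Flag the model when the recent data has drifted past the threshold."""
    return drift_score(train_values, recent_values) > threshold

# Hypothetical data: what the model was trained on, and two recent windows.
train = [10.0, 11.0, 9.0, 10.5, 9.5]
stable_window = [10.2, 9.8, 10.1]
shifted_window = [15.0, 16.0, 15.5]
```

Here `model_at_risk(train, shifted_window)` flags the model for retraining, while `model_at_risk(train, stable_window)` does not.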

In closing

KaiwuDB is a multi-model database, and integration is at its core: a unified interface gives users the ability to manage and process data. "Large models bring us a very good opportunity to truly implement a completely different multi-model database", Wei Kewei said.

As far as databases are concerned, open source and innovation have always been inseparable. Looking back at the history of database development, innovation has been crucial, and open source is an important way to lead innovation. In the future, KaiwuDB will strive to provide partners with more open and intelligent database solutions. We will also have open source plans in the future, so stay tuned!

Origin my.oschina.net/u/5148943/blog/10140267