A detailed guide to memory tables in DolphinDB, a time-series database

The memory table is an important part of the DolphinDB database. It can be used directly to store data with high-speed reads and writes, and it can also cache the intermediate results of the calculation engines to speed up computation. This tutorial introduces the types and usage scenarios of DolphinDB memory tables, as well as the similarities and differences among them in data operations and table structure (schema) operations.


1. Memory table categories

According to different usage scenarios and functional characteristics, DolphinDB memory tables can be divided into the following four types:

  • Conventional memory table
  • Key-value memory table
  • Stream data table
  • MVCC memory table


1.1 Conventional memory table

The conventional memory table is the most basic table structure in DolphinDB and supports insert, delete, update, and query operations. The results returned by SQL queries are usually stored in conventional memory tables for further processing.

  • create

Use the table function to create a conventional memory table. The table function has two usages: the first generates a table from a specified schema (field names and field types), a table capacity (capacity), and an initial number of rows (size); the second generates a table from existing data (matrices, tables, arrays, and tuples).

The advantage of the first usage is that memory can be allocated for the table in advance. When the number of records in the table exceeds the capacity, the system automatically expands the table. To expand, the system first allocates a larger memory space (20% to 100% larger), then copies the old table to the new space, and finally releases the original memory. For large tables, the cost of expansion is high. Therefore, if we can predict the number of rows in advance, it is recommended to allocate a reasonable capacity when creating the memory table. If the initial number of rows is 0, the system generates an empty table. If the initial number of rows is not 0, the system generates a table with the specified number of rows, in which every column holds its default value. For example:

//Create an empty conventional memory table
t=table(100:0,`sym`id`val,[SYMBOL,INT,INT])

//Create a conventional memory table with 10 rows
t=table(100:10,`sym`id`val,[SYMBOL,INT,INT])
select * from t

sym id val
--- -- ---
    0  0  
    0  0  
    0  0  
    0  0  
    0  0  
    0  0  
    0  0  
    0  0  
    0  0  
    0  0  

The table function can also create a conventional memory table from existing data. The following example creates one from multiple arrays.

sym=`A`B`C`D`E
id=5 4 3 2 1
val=52 64 25 48 71
t=table(sym,id,val)

  • application

The conventional memory table is one of the most frequently used data structures in DolphinDB, second only to arrays. The query results of SQL statements and the intermediate results of distributed queries are stored in conventional memory tables. When system memory is insufficient, the table does not automatically spill data to disk; instead, an Out Of Memory exception is thrown. Therefore, when we perform queries and calculations, we must pay attention to the size of the intermediate and final results, and release intermediate results promptly once they are no longer needed. For the various ways of inserting into, deleting from, updating, and querying conventional memory tables, refer to another tutorial on loading and operating memory partition tables.
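
As a minimal sketch of releasing an intermediate result, the following assumes the undef function accepts the VAR object type for session variables (the document itself only shows it with SHARED); the names tmp and result are illustrative.

```dolphindb
sym=`A`B`C`D`E
id=5 4 3 2 1
val=52 64 25 48 71
t=table(sym,id,val)

// tmp is an intermediate result held in a conventional memory table
tmp=select * from t where val>50
result=select max(val) as maxVal from tmp

// release the intermediate table once it is no longer needed
undef(`tmp,VAR)
```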


1.2 Key-value memory table

The key-value memory table is a memory table in DolphinDB that supports primary keys. By specifying one or more fields in the table as the primary key, the records in the table can be uniquely determined. The key-value memory table supports operations such as addition, deletion, modification, and query, but the primary key value is not allowed to be updated. The key-value memory table records the row number corresponding to each key value through a hash table, so it has very high efficiency for key-value-based search and update.

  • create

Use the keyedTable function to create a key-value memory table. This function is very similar to the table function; the only difference is an additional parameter that specifies the names of the key columns.

//Create an empty key-value memory table whose primary key consists of the sym and id fields
t=keyedTable(`sym`id,1:0,`sym`id`val,[SYMBOL,INT,INT])

//Create a key-value memory table from vectors; the primary key consists of the sym and id fields
sym=`A`B`C`D`E
id=5 4 3 2 1
val=52 64 25 48 71
t=keyedTable(`sym`id,sym,id,val)

Note: When creating a key-value memory table by specifying the capacity and initial size, the initial size must be 0.

We can also use the keyedTable function to convert a conventional memory table to a key-value memory table. For example:

sym=`A`B`C`D`E
id=5 4 3 2 1
val=52 64 25 48 71
tmp=table(sym, id, val)
t=keyedTable(`sym`id, tmp)
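
To illustrate key-based search and update, here is a minimal sketch that filters on the full primary key and updates a non-key column. It reuses the sample data from above; the new value 100 is illustrative.

```dolphindb
sym=`A`B`C`D`E
id=5 4 3 2 1
val=52 64 25 48 71
t=keyedTable(`sym`id,sym,id,val)

// look up one record by its full primary key (sym, id)
select * from t where sym=`A and id=5

// update a non-key column of the record identified by the key;
// the key columns themselves (sym, id) cannot be updated
update t set val=100 where sym=`A and id=5
```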

  • Features of data insertion and update

When a new record is added to a key-value memory table, the system automatically checks its primary key value. If the primary key value does not exist in the table, a new record is added; if it duplicates the primary key value of an existing record, the existing record with that primary key value is updated. Please see the example below.

First, insert new records into an empty key-value memory table. The primary key values of the new records are APPL, IBM and GOOG.

t=keyedTable(`sym,1:0,`sym`datetime`price`qty,[SYMBOL,DATETIME,DOUBLE,DOUBLE]);
insert into t values(`APPL`IBM`GOOG,2018.06.08T12:30:00 2018.06.08T12:30:00 2018.06.08T12:30:00,50.3 45.6 58.0,5200 4800 7800);
t;

sym  datetime            price qty 
---- ------------------- ----- ----
APPL 2018.06.08T12:30:00 50.3  5200
IBM  2018.06.08T12:30:00 45.6  4800
GOOG 2018.06.08T12:30:00 58    7800

Insert another batch of records with the primary key values APPL, IBM and GOOG into the table.

insert into t values(`APPL`IBM`GOOG,2018.06.08T12:30:01 2018.06.08T12:30:01 2018.06.08T12:30:01,65.8 45.2 78.6,5800 8700 4600);
t;

sym  datetime            price qty 
---- ------------------- ----- ----
APPL 2018.06.08T12:30:01 65.8  5800
IBM  2018.06.08T12:30:01 45.2  8700
GOOG 2018.06.08T12:30:01 78.6  4600

As you can see, the number of records in the table has not increased, but the records corresponding to the primary key have been updated.

Next, insert another batch of records in which the primary key value MSFT appears more than once.
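
The statement for this step is not shown here. A minimal sketch, continuing with the table t from the previous step and using illustrative values that are not taken from the original, might look like this:

```dolphindb
// continues with the key-value memory table t created above;
// both new records carry the same primary key value MSFT (values are illustrative)
insert into t values(`MSFT`MSFT,2018.06.08T12:30:01 2018.06.08T12:30:01,45.7 56.9,3600 4500);

// count the records stored under the key MSFT
select count(*) from t where sym=`MSFT;
```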

As you can see, there is only one record whose primary key value is MSFT in the table.

  • Application scenario

(1) The key-value memory table is very efficient for single-row updates and queries, which makes it an ideal choice for data caching. Compared with Redis, the key-value memory table in DolphinDB is compatible with all SQL operations and can perform more complex calculations beyond key-based updates and queries.

(2) It can serve as the output table of the time series aggregation engine, so that the results in the output table are updated in real time. For details, refer to the tutorial Using DolphinDB to calculate K-line.

 


1.3 Stream data table

The stream data table, as its name implies, is a memory table designed for streaming data and the medium for publishing and subscribing to streaming data. It has a natural stream-table duality: publishing a message is equivalent to inserting a record into the stream data table, and subscribing to a message is equivalent to pushing the newly arrived data in the stream data table to the client application. Streaming data can be queried and processed with SQL statements.

  • create

Use the streamTable function to create a stream data table. The usage of streamTable is exactly the same as that of the table function.

//Create an empty stream data table
t=streamTable(1:0,`sym`id`val,[SYMBOL,INT,INT])

//Create a stream data table from vectors
sym=`A`B`C`D`E
id=5 4 3 2 1
val=52 64 25 48 71
t=streamTable(sym,id,val)

We can also use the streamTable function to convert a conventional memory table into a stream data table. For example:

sym=`A`B`C`D`E
id=5 4 3 2 1
val=52 64 25 48 71
tmp=table(sym, id, val)
t=streamTable(tmp)

A stream data table can also have a single key column; such a table is created with the function keyedStreamTable. Unlike the key-value memory table, however, the purpose of a keyed stream data table is to avoid duplicate messages in high-availability scenarios (multiple publishers writing at the same time). The key is usually the message ID.

  • Data operation characteristics

Because stream data does not change once it is generated, the stream data table does not support updating or deleting records; it supports only querying and appending records. Stream data is usually continuous, but memory is limited. To resolve this conflict, the stream data table provides a persistence mechanism that keeps the most recent data in memory and persists older data to disk. When a user subscribes to older data, it is read directly from disk. To enable persistence, use the function enableTableShareAndPersistence; refer to the streaming data tutorial for details.

  • Application scenario

In stream computing, a shared stream data table publishes the data. Subscribers use the subscribeTable function to subscribe to and consume the streaming data.
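
A minimal sketch of this publish/subscribe flow on a single node is shown below. It assumes that streaming publishing and subscription are enabled in the node configuration and that the subscribeTable parameter names used here (tableName, actionName, offset, handler, msgAsTable) match your DolphinDB version. The table names trades and received and the action name demoSub are hypothetical.

```dolphindb
// publisher: a shared stream data table
share streamTable(1:0,`sym`id`val,[SYMBOL,INT,INT]) as trades

// subscriber: a shared table that receives the pushed records
share table(1:0,`sym`id`val,[SYMBOL,INT,INT]) as received

// subscribe from the first record; each incoming batch is appended to received
subscribeTable(tableName=`trades,actionName=`demoSub,offset=0,handler=append!{received},msgAsTable=true)

// publishing a message is simply inserting a record into the stream data table
insert into trades values(`A,1,52)

// cancel the subscription when it is no longer needed
unsubscribeTable(tableName=`trades,actionName=`demoSub)
```

Because records are pushed asynchronously, received may not contain the new record immediately after the insert.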

 

1.4 MVCC memory table

The MVCC memory table stores multiple versions of the data. When multiple users read and write an MVCC memory table at the same time, they do not block each other. Data isolation in the MVCC memory table follows the snapshot isolation model: a user reads the data that existed before the read began, and even if the data is modified or deleted during the read, users who started reading earlier are not affected. This multi-version approach supports concurrent user access to memory tables. Note that the current implementation of the MVCC memory table is relatively simple: the entire table is locked when data is updated or deleted, and copy-on-write is used to make a copy of the data, so delete and update operations are not efficient. In subsequent versions, we will implement row-level MVCC memory tables.

  • create

Use the mvccTable function to create an MVCC memory table. For example:

//Create an empty MVCC memory table
t=mvccTable(1:0,`sym`id`val,[SYMBOL,INT,INT])

//Create an MVCC memory table from vectors
sym=`A`B`C`D`E
id=5 4 3 2 1
val=52 64 25 48 71
t=mvccTable(sym,id,val)

We can persist the data of an MVCC memory table to disk by specifying the persistence directory and the table name when creating it. For example:

t=mvccTable(1:0,`sym`id`val,[SYMBOL,INT,INT],"/home/user1/DolphinDB/mvcc","test")

After the system restarts, we can load the data from the disk into the memory through the loadMvccTable function.

loadMvccTable("/home/user1/DolphinDB/mvcc","test")

We can also use the mvccTable function to convert conventional memory tables to MVCC memory tables.

sym=`A`B`C`D`E
id=5 4 3 2 1
val=52 64 25 48 71
tmp=table(sym, id, val)
t=mvccTable(tmp)

  • Application scenario

The current MVCC memory table is suitable for read-heavy, write-light scenarios that require persistence. For example, a dynamic configuration system needs to persist its configuration items, and the items change infrequently: the workload is mostly inserts and lookups, which suits the MVCC memory table well.

 

2. Shared memory table

By default, a memory table in DolphinDB can be used only in the session that created it. It does not support concurrent operations by multiple users or sessions, and it is not visible to other sessions. To create a memory table that other users can access, with safe concurrent operations, you must share it. All 4 types of memory tables can be shared. In DolphinDB, we use the share command to share memory tables.

t=table(1..10 as id,rand(100,10) as val)
share t as st
//or share(t,`st)

The above code shares table t as table st.

Use the undef function to delete the shared table.

undef(`st,SHARED)

2.1 Ensure that it is visible to all sessions

The memory table is only visible in the current session, not in other sessions. After sharing, other sessions can access the memory table by accessing shared variables. For example, we share table t as table st in the current session.

t=table(1..10 as id,rand(100,10) as val)
share t as st

We can access the variable st in other sessions. For example, insert a record into the shared table st.

insert into st values(11,200)
select * from st

id val
-- ---
1  1  
2  53 
3  13 
4  40 
5  61 
6  92 
7  36 
8  33 
9  46 
10 26 
11 200

Switching to the original session, we can find that a record has also been added to table t.

select * from t

id val
-- ---
1  1  
2  53 
3  13 
4  40 
5  61 
6  92 
7  36 
8  33 
9  46 
10 26 
11 200

 

2.2 Ensure thread safety

With multiple threads, the data in a memory table can easily be corrupted. Sharing provides a protection mechanism that keeps the data safe, but it also affects system performance.

Conventional memory tables, stream data tables, and MVCC memory tables all use a multi-version model that allows multiple readers and one writer. Reads and writes do not block each other: you can read while another thread writes, and write while other threads read. Reads take no lock, so multiple threads can read at the same time, and reads use snapshot isolation. Writes must take a lock, so only one thread at a time can modify the memory table. Write operations include inserts, deletes, and updates. New records are always appended to the end of the memory table, which is efficient in both memory and CPU usage. Conventional memory tables and MVCC memory tables also support updates and deletes through copy-on-write: the data is first copied to form a new version, and the delete or update is then applied to the new version. Deletes and updates therefore consume considerably more memory and CPU. When deletes and updates are frequent and reads are time-consuming (so that old versions cannot be released quickly), OOM exceptions can easily occur.

The key-value memory table needs to maintain an internal index when writing, and it needs to locate data through the index when reading. Therefore, a shared key-value memory table takes a different approach: both reads and writes must acquire a lock. A writer and a reader, two writers, and two readers are all mutually exclusive. Avoid time-consuming queries or calculations on key-value memory tables; otherwise, other threads will wait for a long time.

 

3. Partitioned memory table

When a memory table holds a large amount of data, we can partition it. After partitioning, a large table consists of multiple sub-tables (tablets). The large table does not use a global lock; locks are managed independently by each sub-table, which can greatly increase read and write concurrency. DolphinDB supports value partitioning, range partitioning, hash partitioning, and list partitioning for memory tables. It does not support combined partitioning. In DolphinDB, we use the function createPartitionedTable to create a partitioned memory table.

  • Create a partitioned conventional memory table

t=table(1:0,`id`val,[INT,INT])
db=database("",RANGE,0 101 201 301)
pt=db.createPartitionedTable(t,`pt,`id)

  • Create a partitioned key-value memory table

kt=keyedTable(`id,1:0,`id`val,[INT,INT])
db=database("",RANGE,0 101 201 301)
pkt=db.createPartitionedTable(kt,`pkt,`id)

  • Create a partitioned stream data table

When creating a partitioned stream data table, you need to pass in multiple stream data tables as templates, and each stream data table corresponds to one partition. To write data, write directly to these stream data tables; to query data, query the partitioned table.

st1=streamTable(1:0,`id`val,[INT,INT]) 
st2=streamTable(1:0,`id`val,[INT,INT]) 
st3=streamTable(1:0,`id`val,[INT,INT]) 
db=database("",RANGE,1 101 201 301)
pst=db.createPartitionedTable([st1,st2,st3],`pst,`id)
st1.append!(table(1..100 as id,rand(100,100) as val)) 
st2.append!(table(101..200 as id,rand(100,100) as val)) 
st3.append!(table(201..300 as id,rand(100,100) as val))  
select * from pst

  • Create a partitioned MVCC memory table

Like creating a partitioned stream data table, to create a partitioned MVCC memory table, you need to pass in multiple MVCC memory tables as templates. Each table corresponds to a partition. When writing data, write directly to these tables; when querying data, you need to query the partition table.

mt1=mvccTable(1:0,`id`val,[INT,INT])
mt2=mvccTable(1:0,`id`val,[INT,INT])
mt3=mvccTable(1:0,`id`val,[INT,INT])
db=database("",RANGE,1 101 201 301)
pmt=db.createPartitionedTable([mt1,mt2,mt3],`pmt,`id)

mt1.append!(table(1..100 as id,rand(100,100) as val))
mt2.append!(table(101..200 as id,rand(100,100) as val))
mt3.append!(table(201..300 as id,rand(100,100) as val))

select * from pmt

Since the partition memory table does not use global locks, sub-tables cannot be dynamically added or deleted after creation.


3.1 Increase the concurrency of queries

Partitioning increases query concurrency in three ways: (1) A key-value memory table must also be locked when it is queried. In a partitioned table, each sub-table manages its own lock, which narrows the granularity of the lock and therefore increases read concurrency. (2) A partitioned table can process its sub-tables in parallel during batch calculations. (3) If the filter of a SQL query specifies the partition field, the system can narrow the range of partitions to scan and avoid a full table scan.
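
Point (3) can be sketched with a small range-partitioned conventional memory table. The names tDemo, dbDemo and ptDemo are illustrative; the creation pattern follows the examples earlier in this section.

```dolphindb
tDemo=table(1:0,`id`val,[INT,INT])
dbDemo=database("",RANGE,1 101 201 301)
ptDemo=dbDemo.createPartitionedTable(tDemo,`ptDemo,`id)
ptDemo.append!(table(1..300 as id,rand(100,300) as val))

// the filter on the partition column id falls entirely inside the partition [101,201),
// so the other two partitions do not need to be scanned
select count(*) from ptDemo where id>=101 and id<=200
```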

Taking the key-value memory table as an example, we compare the performance of concurrent queries with and without partitioning. First, create a simulation data set, which contains a total of 5 million rows of data.

n=5000000
id=shuffle(1..n)
qty=rand(1000,n)
price=rand(1000.0,n)
kt=keyedTable(`id,id,qty,price)
share kt as skt

id_range=cutPoints(1..n,20)
db=database("",RANGE,id_range)
pkt=db.createPartitionedTable(kt,`pkt,`id).append!(kt)
share pkt as spkt

We simulate 10 clients on another server that query the key-value memory table at the same time. Each client runs 100,000 queries, each returning one record, and we measure the total time each client takes for its 100,000 queries.

def queryKeyedTable(tableName,id){
	for(i in id){
		select * from objByName(tableName) where id=i
	}
}
conn=xdb("192.168.1.135",18102,"admin","123456")
n=5000000

jobid1=array(STRING,0)
for(i in 1..10){
	rid=rand(1..n,100000)
	s=conn(submitJob,"evalQueryUnPartitionTimer"+string(i),"",evalTimer,queryKeyedTable{`skt,rid})
	jobid1.append!(s)
}
time1=array(DOUBLE,0)
for(j in jobid1){
	time1.append!(conn(getJobReturn,j,true))
}

jobid2=array(STRING,0)
for(i in 1..10){
	rid=rand(1..n,100000)
	s=conn(submitJob,"evalQueryPartitionTimer"+string(i),"",evalTimer,queryKeyedTable{`spkt,rid})
	jobid2.append!(s)
}
time2=array(DOUBLE,0)
for(j in jobid2){
	time2.append!(conn(getJobReturn,j,true))
}

time1 is the time taken by each of the 10 clients to query the unpartitioned key-value memory table, and time2 is the time taken by each of the 10 clients to query the partitioned key-value memory table, both in milliseconds.

time1
[6719.266848,7160.349678,7271.465094,7346.452625,7371.821485,7363.87979,7357.024299,7332.747157,7298.920972,7255.876976]

time2
[2382.154581,2456.586709,2560.380315,2577.602019,2599.724927,2611.944367,2590.131679,2587.706832,2564.305815,2498.027042]

Each client takes about 2.4 to 2.6 seconds against the partitioned key-value memory table, compared with about 6.7 to 7.4 seconds against the unpartitioned one, so partitioning makes these concurrent queries nearly three times faster.

Queries on an unpartitioned memory table are guaranteed snapshot isolation, but queries on a partitioned memory table are not. As mentioned above, reads and writes on a partitioned memory table do not use a global lock. While one thread is querying, another thread may be in the middle of a write that spans several sub-tables, so the query may see only part of the written data.


3.2 Increase the concurrency of writing

Taking the partitioned conventional memory table as an example, we can write data to different partitions at the same time.

t=table(1:0,`id`val,[INT,INT])
db=database("",RANGE,1 101 201 301)
pt=db.createPartitionedTable(t,`pt,`id)

def writeData(mutable t,id,batchSize,n){
	for(i in 1..n){
		idv=take(id,batchSize)
		valv=rand(100,batchSize)
		tmp=table(idv,valv)
		t.append!(tmp)
	}
}

job1=submitJob("write1","",writeData,pt,1..100,1000,1000)
job2=submitJob("write2","",writeData,pt,101..200,1000,1000)
job3=submitJob("write3","",writeData,pt,201..300,1000,1000)

In the above code, 3 threads write to 3 different partitions of pt at the same time. Note that we must avoid writing to the same partition from multiple threads at the same time. For example, the following code may cause the system to crash.

job1=submitJob("write1","",writeData,pt,1..300,1000,1000)
job2=submitJob("write2","",writeData,pt,1..300,1000,1000)

The above code defines two write threads that write to the same partitions, which can corrupt memory. To ensure the safety and consistency of the data in each partition, we can share the partitioned memory table. Multiple threads can then write to the same partition at the same time.

share pt as spt
job1=submitJob("write1","",writeData,spt,1..300,1000,1000)
job2=submitJob("write2","",writeData,spt,1..300,1000,1000)


4. Data operation comparison


4.1 Add, delete, modify

The following table (not reproduced in this copy) summarizes the insert, delete, update, and query operations supported by the 4 types of memory tables in the shared and partitioned cases.

Description:

  • Conventional memory tables, key-value memory tables, and MVCC memory tables all support insert, delete, update, and query operations. Stream data tables support only inserts and queries, not deletes or updates.
  • For key-value memory tables, if the primary key is included in the query filter condition, the query performance will be significantly improved.
  • For partitioned memory tables, if the query filter conditions include partition columns, the system can narrow the range of partitions to be scanned, thereby improving query performance.
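
As a minimal sketch of the four operations on a conventional memory table (the data is illustrative):

```dolphindb
t=table(1..5 as id,10 20 30 40 50 as val)

insert into t values(6,60)           // add a record
update t set val=val+1 where id=1    // modify a record
delete from t where id=2             // delete a record
select * from t where id>3           // query: returns the rows with id 4, 5 and 6
```

The same update and delete statements are rejected on a stream data table, which only accepts inserts and queries.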


4.2 Concurrency

When there are no writes, all memory tables allow multiple threads to query at the same time. When there are writes, the concurrency of the 4 types of memory tables differs. The following table (not reproduced in this copy) summarizes the concurrent reads and writes supported by the 4 types of memory tables in the shared and partitioned cases.

Description:

  • The shared table allows concurrent reads and writes.
  • For partition tables that are not shared, multiple threads are not allowed to write to the same partition at the same time.


4.3 Persistence

  • Conventional memory tables and key-value memory tables do not support data persistence. Once the node restarts, all data in the memory will be lost.
  • Persistence can be enabled only for an empty stream data table. To persist a stream data table, first configure the persistenceDir directory for stream data, and then use enableTableShareAndPersistence to share the table and persist it to disk. For example, the following shares the stream data table t as st and persists it to disk.

t=streamTable(1:0,`id`val,[INT,INT])
enableTableShareAndPersistence(t,`st)

After persistence is enabled for a stream data table, the most recent records are still kept in memory. By default, the latest 100,000 records are retained in memory; this value can be adjusted as needed.

Stream data table persistence can be asynchronous or synchronous, and compressed or uncompressed. Asynchronous mode normally achieves higher throughput.

After the system restarts, execute the enableTableShareAndPersistence function again to load all the data in the disk into the memory.

  • The MVCC memory table supports persistence. When creating an MVCC memory table, we can specify the persistence path and table name. For example, create a persistent MVCC memory table.

t=mvccTable(1:0,`id`val,[INT,INT],"/home/user/DolphinDB/mvccTable","t")
t.append!(table(1..10 as id,rand(100,10) as val))

After the system restarts, we can use the loadMvccTable function to load the data from disk into memory. For example:

t=loadMvccTable("/home/user/DolphinDB/mvccTable","t")
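
Returning to stream data table persistence, the asynchronous and compression options and the number of records kept in memory can be set when persistence is enabled. The sketch below assumes that persistenceDir is already configured and that the parameter names asynWrite, compress and cacheSize match your DolphinDB version. The shared name stDemo and the cache size of 1,000,000 records are illustrative.

```dolphindb
tDemo=streamTable(1:0,`id`val,[INT,INT])

// asynchronous, compressed persistence; keep up to 1,000,000 recent records in memory
enableTableShareAndPersistence(table=tDemo,tableName=`stDemo,asynWrite=true,compress=true,cacheSize=1000000)
```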

5. Table structure operation comparison

The structure operations on a memory table include adding columns, deleting columns, modifying columns (content and data type), and adjusting the order of columns. The following table (not reproduced in this copy) summarizes the structure operations supported by the 4 types of memory tables in the shared and partitioned cases.

Description:

  • Partition tables and MVCC memory tables cannot add columns through the addColumn function.
  • The partition table can add columns through the update statement, but the stream data table is not allowed to be modified, so the stream data table cannot add columns through the update statement.
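
The structure operations listed at the start of this section can be sketched on an unshared conventional memory table as follows. This assumes the functions replaceColumn!, reorderColumns! and dropColumns! are available in your DolphinDB version; the column name val2 is illustrative.

```dolphindb
t=table(1..3 as id,10 20 30 as val)

update t set val2=val*2                 // add a column through an update statement
replaceColumn!(t,`val,double(t.val))    // change the data type of val from INT to DOUBLE
reorderColumns!(t,`val2`id`val)         // adjust the order of columns
dropColumns!(t,`val2)                   // delete a column

schema(t).colDefs                       // inspect the resulting column definitions
```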


6. Summary

DolphinDB supports 4 types of memory tables, and also introduces the concept of sharing and partitioning, which can basically meet the various needs of memory computing and stream computing.

Origin blog.csdn.net/qq_41996852/article/details/112858160