MMKV Principle Analysis in One Article

Foreword

Brothers, this article covers MMKV from top to bottom.
Some of you may ask: what is it?
Simply put, it is a key-value persistent storage component based on mmap memory mapping that lets you drop SP (SharedPreferences).
As usual, here is the structure of this article (P.S. I always go back and forth on this kind of component: should I cover usage first, or the principles first? Advice is welcome):
1. What is MMKV (definition, advantages)
2. Use of MMKV
3. Analysis of SharedPreferences (why not use SP)
4. Analysis of MMKV principles (why use MMKV)

Now let's go through them step by step!

What is MMKV

What is persistent storage

Simply put, persistent storage is to permanently save data on disk or other non-volatile storage media so that it can be reloaded and used after a program restart or device restart.

In mobile application development, persistent storage is very important because mobile devices have limited resources and app processes are short-lived: the system can kill an app at any time. With persistent storage, data survives even if the application is closed or the device is restarted. Common usage scenarios include storing user configuration, login information, application state, user-generated content, and more.

Common persistent storage

What kinds of persistent storage do we come across during development?

  • 1. File storage: write data to files with IO streams. Reads and writes can be done with the FileInputStream and FileOutputStream classes. This method suits simple data such as configuration files and logs.
  • 2. SharedPreferences: stores data as key-value pairs in a SharedPreferences file, which is an XML file in the app's private data directory. It is commonly used for application configuration parameters and user preferences.
  • 3. SQLite database: uses the SQLite database engine to store and manage structured data. SQLite is an embedded relational database that provides efficient storage and querying. Developers can use the SQLiteOpenHelper class provided by Android to create and manage databases.
  • 4. DataStore: DataStore is the persistent storage solution in the Android Jetpack library. It provides a coroutine-based API for writing data and exposes reads as a Flow. DataStore comes in two flavors: Preferences DataStore (key-value pairs) and Proto DataStore (type-safe objects defined with protocol buffers).
  • 5. MMKV: MMKV is built on mmap file mapping, which gives it fast reads and writes and low memory usage. MMKV is suitable for storing key-value data that is read and written frequently in Android applications.

MMKV definition

1. MMKV was developed by the WeChat team as a lightweight storage solution to replace SharedPreferences.
2. Like SharedPreferences, it is a lightweight key-value store.
3. It is a key-value storage component based on mmap memory mapping.

Advantages of MMKV

1. High performance: MMKV maps its file into memory with mmap, so reads and writes are plain memory accesses rather than file IO calls. MMKV is reported to be dozens of times faster than SharedPreferences, especially when data is written frequently.
2. Small storage volume: MMKV stores data in a binary file with a compact serialization format, avoiding the overhead of XML parsing and serialization. For the same data, the file can be less than half the size of the equivalent XML, depending on the content.
3. Cross-process sharing: MMKV supports data sharing between multiple processes, which is very useful for applications that need to pass data between processes. MMKV keeps cross-process reads and writes consistent through a shared file mapping and a file lock.
4. Simple, easy-to-use API: MMKV provides a concise API that makes data access convenient. You can store values of various types without cumbersome type conversions. MMKV also provides extras such as encryption.

Use of MMKV

Step 1: Import MMKV library

First, in your Android project, open the build.gradle file. Add the following code to dependencies:

implementation 'com.tencent:mmkv:1.2.10'

Then, hit the Sync button to add the library to your project. This step ensures that you can use the MMKV library in your code.

Step 2: Initialize MMKV

Add the following code to your application's entry point (usually the Application class), to initialize MMKV:

MMKV.initialize(this)

This call will ensure that MMKV is available for the entire application.

Step 3: Store and read data

Storing and reading data using MMKV is very simple. Here are a few examples of commonly used methods:

// Store data:
val mmkv = MMKV.defaultMMKV()
mmkv.encode("key", "value")
// The code above stores the string "value" under the key "key".

// Read data:
val value = mmkv.decodeString("key")
// The code above reads the value stored under "key" and assigns it to value.
// Note: MMKV can store many types of data,
// including String, Int, Float, Double, ByteArray, and more.
// Just use the matching encode and decode methods as needed.

// Delete data:
mmkv.remove("key")

Step 4: Customize the MMKV path (optional)

In addition to the default path, you can also specify a custom MMKV storage path during initialization. For example:

val mmkvPath = MMKV.initialize(this, "/sdcard/mymmkv")

The code above sets /sdcard/mymmkv as the root directory in which MMKV stores its files; initialize() returns that root directory path.

Step 5: Migrate SP data (give you a reason to use it)

// Get the SharedPreferences instance:
val sharedPreferences = getSharedPreferences("your_shared_preferences_name", Context.MODE_PRIVATE)

// Call importFromSharedPreferences() to migrate the data:
val mmkv = MMKV.defaultMMKV()
mmkv.importFromSharedPreferences(sharedPreferences)

// Optional: clear the old SharedPreferences
sharedPreferences.edit().clear().apply()

Here, the SharedPreferences instance is passed to importFromSharedPreferences() on the target MMKV instance to complete the migration.
This migrates the data in SharedPreferences to MMKV. Note that after the migration is complete, MMKV should be used to read and write data instead of SharedPreferences. This is simpler than iterating over the key-value pairs and copying them by hand.

SharedPreferences analysis (why not use SP)

Why use MMKV? The short answer: SP falls short in several respects.
Let's look at where SP falls short.

The process of SharedPreferences

The process is shown below:
[Figure: SharedPreferences flow]
1. App startup and initialization: the XML file is read with IO (SP data is stored as XML).
The XML file (usually located in the /data/data/<package name>/shared_prefs/ directory):

<?xml version='1.0' encoding='utf-8' standalone='yes' ?>
<map>
    <string name="name">user_name</string>
    <int name="age" value="28"/>
    <boolean name="xiaomeng" value="false"/>
</map>

2. The whole file is deserialized into an in-memory Map.
Core code (SharedPreferencesImpl.java):

Map map = null;
StructStat stat = null;
try {
    stat = Os.stat(mFile.getPath());
    if (mFile.canRead()) {
        BufferedInputStream str = null;
        try {
            str = new BufferedInputStream(
                    new FileInputStream(mFile), 16*1024);
            // Parse the XML file into a map
            map = XmlUtils.readMapXml(str);
        } catch (Exception e) {
            Log.w(TAG, "Cannot read " + mFile.getAbsolutePath(), e);
        } finally {
            IoUtils.closeQuietly(str);
        }
    }
} catch (ErrnoException e) {
    /* ignore */
}

3. Once everything has been loaded into the Map, values are obtained through Map.get(key).
The source code is as follows (SharedPreferencesImpl.java):

public String getString(String key, @Nullable String defValue) {
    synchronized (mLock) {
        // Block until SP has finished loading the XML into memory, then get
        awaitLoadedLocked();
        String v = (String)mMap.get(key);
        // Return the default value if the stored value is null
        return v != null ? v : defValue;
    }
}

Problems with SharedPreferences

1. get blocks (and can cause ANR)
What is the problem? From the process above we know that reading the file through IO is time-consuming, so SP does it on a worker thread.
As the data grows, it takes longer to parse the XML and load everything into the Map. Because loading runs on a worker thread, a get issued in the meantime naturally cannot find the value yet. So how does SP stop you from reading too early?
The source code is as follows:

@Nullable
public String getString(String key, @Nullable String defValue) {
    synchronized (mLock) {
        // Block until SP has finished loading the XML into memory, then get
        awaitLoadedLocked();
        String v = (String)mMap.get(key);
        // Return the default value if the stored value is null
        return v != null ? v : defValue;
    }
}

private void awaitLoadedLocked() {
    ...
    // mLoaded is set to true once SP has finished loading
    while (!mLoaded) {
        try {
            mLock.wait();
        } catch (InterruptedException unused) {
        }
    }
}

awaitLoadedLocked() blocks until the XML file has been read, at which point the loading thread calls mLock.notifyAll() to wake up the waiters.
What if we call get on the main thread while it is blocked? The main thread simply waits as well, and a main thread that waits too long is exactly an ANR.
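To make the blocking behaviour concrete, here is a minimal sketch in plain Java (no Android classes). LazyPrefs is a hypothetical class invented for illustration: a worker thread simulates the slow XML load with a sleep, and getString waits on a lock exactly as awaitLoadedLocked() does, so the caller is stalled for roughly as long as loading takes.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical miniature of the "load on a worker thread, block readers" pattern. */
final class LazyPrefs {
    private final Object lock = new Object();
    private final Map<String, String> map = new HashMap<>();
    private boolean loaded = false;

    /** Starts a worker thread that simulates slow file IO and XML parsing. */
    LazyPrefs(long loadMillis) {
        Thread loader = new Thread(() -> {
            try {
                Thread.sleep(loadMillis); // stands in for reading and parsing the file
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            synchronized (lock) {
                map.put("name", "user_name");
                loaded = true;
                lock.notifyAll(); // wake up every blocked reader
            }
        });
        loader.setDaemon(true);
        loader.start();
    }

    /** Blocks the calling thread until loading has finished, then reads the map. */
    String getString(String key, String defValue) {
        synchronized (lock) {
            boolean interrupted = false;
            while (!loaded) {
                try {
                    lock.wait();
                } catch (InterruptedException e) {
                    interrupted = true; // keep waiting, restore the flag afterwards
                }
            }
            if (interrupted) {
                Thread.currentThread().interrupt();
            }
            String v = map.get(key);
            return v != null ? v : defValue;
        }
    }

    /** Returns how long (in ms) a get issued right after construction was blocked. */
    static long measureBlockedMillis(long loadMillis) {
        LazyPrefs prefs = new LazyPrefs(loadMillis);
        long start = System.nanoTime();
        prefs.getString("name", null);
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        System.out.println("get() was blocked for about " + measureBlockedMillis(300) + " ms");
    }
}
```

If the caller in this sketch were the main thread of an app, those milliseconds would be main-thread stall time.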
2. Full rewrite on every update
What is this problem? Every update writes the entire in-memory map back to the file.
Brothers, what does a full rewrite mean? It means the whole file is saved again every time.
It is still file IO, so the larger the file grows, the more each save costs.
3. Both commit and apply can cause ANR

  • The commit() method writes synchronously, which is bound to be time-consuming, so it should not be called directly on the main thread.

The calling thread waits until the writeToFile task has finished. If that is the main thread and the file is large, the full rewrite is slow and an ANR occurs.

The source code is as follows:

public boolean commit() {
    // First apply the changes to the in-memory map
    MemoryCommitResult mcr = commitToMemory();
    // Queue the disk write
    SharedPreferencesImpl.this.enqueueDiskWrite(
        mcr, null /* sync write on this thread okay */);
    try {
        // Wait for the result of the synchronous write
        mcr.writtenToDiskLatch.await();
    } catch (InterruptedException e) {
        return false;
    } finally {
    }
    notifyListeners(mcr);
    return mcr.writeToDiskResult;
}

  • Everyone knows that apply writes asynchronously, but it can also cause ANR. Let's look at the source code of the apply method.

public void apply() {
    // First write the update to the in-memory cache
    final MemoryCommitResult mcr = commitToMemory();
    // Create an awaitCommit runnable to be added to QueuedWork
    final Runnable awaitCommit = new Runnable() {
            @Override
            public void run() {
                try {
                    // Wait for the write to complete
                    mcr.writtenToDiskLatch.await();
                } catch (InterruptedException ignored) {
                }
            }
        };
    // Add awaitCommit to QueuedWork
    QueuedWork.addFinisher(awaitCommit);
    Runnable postWriteRunnable = new Runnable() {
            @Override
            public void run() {
                awaitCommit.run();
                QueuedWork.removeFinisher(awaitCommit);
            }
        };
    // The actual SP persistence, executed asynchronously
    SharedPreferencesImpl.this.enqueueDiskWrite(mcr, postWriteRunnable);
    // The file has not been written yet, but the in-memory cache is already updated.
    // Listeners usually hold the same SharedPreferences object, so they can use the cached data.
    notifyListeners(mcr);
}

As you can see, the write really is performed on a worker thread, so why can apply also cause ANR?
Because some lifecycle methods of Activity and Service call QueuedWork.waitToFinish(), which waits for all pending writes on the worker thread to finish before continuing. When the main thread waits on a worker thread like this, ANR is easy to trigger.

public static void waitToFinish() {
    Runnable toFinish;
    // Wait for all pending tasks to complete
    while ((toFinish = sPendingWorkFinishers.poll()) != null) {
        toFinish.run();
    }
}

So the stack trace of an ANR caused by apply contains ActivityThread frames:

at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:202)
at android.app.SharedPreferencesImpl$EditorImpl$1.run(SharedPreferencesImpl.java:364)
at android.app.QueuedWork.waitToFinish(QueuedWork.java:88)
at android.app.ActivityThread.handleStopActivity(ActivityThread.java:3246)
at android.app.ActivityThread.access$1100(ActivityThread.java:141)
at android.app.ActivityThread$H.handleMessage(ActivityThread.java:1239)

Summary: Although apply writes on a worker thread, the Activity and Service lifecycle methods wait for all pending SharedPreferences writes to complete, so it can still cause jank and ANR.
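The interaction between apply and waitToFinish can be reproduced outside Android. The sketch below is a simplified stand-in, not the framework code: ApplyWaitDemo, applyAsync and the FINISHERS queue are hypothetical names, and the disk write is simulated with a sleep. It shows that the apply-like call returns immediately, while the later waitToFinish-like call stalls the caller for the duration of the write.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.CountDownLatch;

/** Hypothetical stand-in for the apply() + QueuedWork.waitToFinish() interaction. */
final class ApplyWaitDemo {
    // Plays the role of QueuedWork's list of pending finishers.
    private static final Queue<Runnable> FINISHERS = new ConcurrentLinkedQueue<>();

    /** Like apply(): returns at once; the slow write runs on a worker thread. */
    static void applyAsync(long writeMillis) {
        final CountDownLatch writtenToDisk = new CountDownLatch(1);
        final Runnable awaitCommit = () -> awaitQuietly(writtenToDisk);
        FINISHERS.add(awaitCommit);
        Thread writer = new Thread(() -> {
            sleepQuietly(writeMillis);      // stands in for the full file rewrite
            writtenToDisk.countDown();      // the write is done
            FINISHERS.remove(awaitCommit);  // nothing left to wait for
        });
        writer.setDaemon(true);
        writer.start();
    }

    /** Like waitToFinish(): the caller runs every pending finisher and so waits for every write. */
    static void waitToFinish() {
        Runnable finisher;
        while ((finisher = FINISHERS.poll()) != null) {
            finisher.run();
        }
    }

    /** Returns {time spent in applyAsync, time spent in waitToFinish} in milliseconds. */
    static long[] measureMillis(long writeMillis) {
        long t0 = System.nanoTime();
        applyAsync(writeMillis);
        long t1 = System.nanoTime();
        waitToFinish();
        long t2 = System.nanoTime();
        return new long[] {(t1 - t0) / 1_000_000, (t2 - t1) / 1_000_000};
    }

    private static void awaitQuietly(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    private static void sleepQuietly(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        long[] ms = measureMillis(300);
        System.out.println("apply took " + ms[0] + " ms, waitToFinish took " + ms[1] + " ms");
    }
}
```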

  • But why wait?

It has to wait. For example,
if you call apply to save data in the Activity's onPause method and the process goes away before the asynchronous write completes, that data is lost.
To avoid this, the Android system waits in the Activity and Service lifecycle methods for all pending writes to complete before continuing. This protects the integrity of the data and avoids loss or inconsistency.
4. No multi-process support
SP gives no guarantee when several processes read and write the same file, and MODE_MULTI_PROCESS has long been deprecated.

MMKV principle analysis

First of all, MMKV implements the SharedPreferences interface at its upper layer, and it also works on an in-memory map.
So how does MMKV solve the SharedPreferences problems above? Recall what SP does:
1. Read and write method: I/O
2. Data format: XML
3. Write method: full rewrite
Let's understand the principles of MMKV around these three issues of SharedPreferences.

mmap zero copy

The core of MMKV is mmap, so its advantages are borrowed from mmap. (FileChannel is a typical use of zero copy.)
First, some basics:
what is the memory we usually talk about?
In everyday work, the memory we talk about is virtual memory, which is divided into two parts: user space and kernel space.

  • User space is where user program code runs (the memory our app runs in).
  • Kernel space is where kernel code runs. It is shared by all processes, while the user spaces of different processes are isolated from each other.

1. First, how does traditional IO move data?

  • User-space buffer -> kernel-space page cache (CPU copy) -> disk (DMA copy: moves data between the kernel buffer and the device without involving the CPU)

[Figure: data path of traditional IO]
2. How does mmap move data?

  • First of all, zero copy only means that there is no CPU copy; the DMA copy is still there.
  • User space (mapped directly onto the kernel page cache) -> disk (DMA copy)

[Figure: data path of mmap]
Because there is no CPU copy, efficiency improves considerably.

  • Once such a mapping is established, the process can read and write this region of memory through pointers, and the system automatically writes dirty pages back to the corresponding file on disk. File operations are thus completed without system calls such as read and write.
  • In other words, operating on the memory is operating on the file.
  • Conversely, changes made to this region on the kernel side are directly visible in user space, which is what makes file sharing between different processes possible.
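The "operating on memory is operating on the file" idea can be tried from plain Java, where FileChannel.map exposes mmap as a MappedByteBuffer. The following is a minimal sketch, not MMKV code: MmapDemo and writeThroughMapping are hypothetical names, and a temporary file stands in for the storage file. Bytes are put into the mapped buffer without any write() call, and the file is then read back through the normal file API.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical demo: write a file through a shared memory mapping. */
final class MmapDemo {
    /** Writes text into a temp file via a mapping and returns what the file contains afterwards. */
    static String writeThroughMapping(String text) {
        byte[] payload = text.getBytes(StandardCharsets.UTF_8);
        try {
            Path file = Files.createTempFile("mmap-demo", ".bin");
            file.toFile().deleteOnExit();
            try (FileChannel channel = FileChannel.open(
                    file, StandardOpenOption.READ, StandardOpenOption.WRITE)) {
                // READ_WRITE is a shared mapping: changes are carried through to the file.
                MappedByteBuffer mapped =
                        channel.map(FileChannel.MapMode.READ_WRITE, 0, payload.length);
                mapped.put(payload); // a plain memory write, no write() call
                mapped.force();      // ask the OS to flush the dirty pages now
            }
            // Read the file back through the ordinary file API.
            return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(writeThroughMapping("hello mmap"));
    }
}
```

The force() call is only there to make the flush explicit; without it the operating system writes the dirty pages back on its own schedule.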

3. How is the mapping set up? Let's look at the mmap function first. I won't go too deep (the source code is in the C layer).
The function prototype of mmap:

void *mmap(void *start, size_t length, int prot, int flags, int fd, off_t offset);

  • start: the start address of the mapping region. Just pass NULL and let the system choose.
  • length: the length of the mapping region. MMKV passes in m_size, the aligned size of the file.
  • prot: the desired memory protection flags, which must not conflict with the file's open mode. MMKV sets readable and writable.
  • flags: specifies the type of mapping, the mapping options, and whether the mapped pages can be shared. MAP_SHARED means the mapping can be shared between processes. This is the key to MMKV working across processes.
  • fd: a valid file descriptor. MMKV uses the m_fd it opened earlier.
  • offset: the offset in the file where the mapping starts. Mapping from the beginning of the file is the easiest case to understand.

MMKV data storage: the .default file

As mentioned above, SharedPreferences stores data in an XML file.
MMKV saves its data in the specified directory in a file ending in .default.
Let's see what the .default file looks like inside:
.default (a binary file; each byte is an 8-bit binary number that can represent 256 different values, from 0 to 255)
Let's open it in hexadecimal:
[Figure: hex dump of the .default file]
Well... what is this?
Let me walk you through it. Compare the hex dump with the annotated version below.
[Figure: annotated hex dump of the .default file]
First byte, 0E (hexadecimal): the total length is 14 (decimal).
Second byte, 07 (hexadecimal): the key length is 7 (decimal).
The next 7 bytes, 61 62 63 64 65 66 67 (hexadecimal): the key is abcdefg.
Then 01 (hexadecimal): the value length is 1 (decimal).
The next 1 byte, 01: the value is 1.
So the first pair is <abcdefg, 1>.
By the same logic, the second key-value pair is <x, 1>.
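The byte walk-through above can be checked with a few lines of Java. This is a sketch of the layout exactly as described in this article (total length, then repeated key length, key bytes, value length, value bytes), under the simplifying assumption that every length fits in a single byte. RecordDecoder and SAMPLE are hypothetical names, and the real file format may contain details not covered here.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical decoder for the byte layout walked through in the article. */
final class RecordDecoder {
    /** The example bytes: 0E, then <abcdefg, 1>, then <x, 1>. */
    static final byte[] SAMPLE = {
        0x0E,
        0x07, 0x61, 0x62, 0x63, 0x64, 0x65, 0x66, 0x67, 0x01, 0x01,
        0x01, 0x78, 0x01, 0x01
    };

    /** Decodes [total][keyLen][key][valueLen][value]... assuming single-byte lengths. */
    static Map<String, byte[]> decode(byte[] file) {
        Map<String, byte[]> result = new LinkedHashMap<>();
        int total = file[0] & 0xFF;
        int pos = 1;
        int end = 1 + total;
        while (pos < end) {
            int keyLen = file[pos++] & 0xFF;
            String key = new String(file, pos, keyLen, StandardCharsets.UTF_8);
            pos += keyLen;
            int valueLen = file[pos++] & 0xFF;
            byte[] value = Arrays.copyOfRange(file, pos, pos + valueLen);
            pos += valueLen;
            result.put(key, value);
        }
        return result;
    }

    public static void main(String[] args) {
        for (Map.Entry<String, byte[]> e : decode(SAMPLE).entrySet()) {
            System.out.println(e.getKey() + " -> " + Arrays.toString(e.getValue()));
        }
    }
}
```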

Question 1: What if a length exceeds 255? Does it take more than one byte?
Answer: Lengths are stored with a variable-length encoding that takes 1 to 5 bytes (each byte carries 7 bits of the value, so anything above 127 already needs a second byte). This is the protocol buffers encoding (a serialization and deserialization format like JSON and XML, but more compact).
Question 2: It is so difficult to read, so why use it?
First, it takes up less space: the data is compact and all of it is payload.
Second, it supports incremental updates. Let's look at those next.
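To see how a variable-length integer grows from 1 to 5 bytes, here is a minimal sketch of the protocol buffers varint scheme for 32-bit values: each byte holds 7 bits of the number, and the high bit says whether another byte follows. The class name Varint is hypothetical, and this is an independent illustration rather than MMKV's own encoder.

```java
import java.io.ByteArrayOutputStream;

/** Hypothetical helper: protocol-buffers-style varint for 32-bit values. */
final class Varint {
    /** Encodes value into 1 to 5 bytes, least significant 7-bit group first. */
    static byte[] encode(int value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80); // low 7 bits plus "more bytes follow" flag
            value >>>= 7;
        }
        out.write(value); // last group, high bit clear
        return out.toByteArray();
    }

    /** Decodes a varint starting at the first byte of the array. */
    static int decode(byte[] bytes) {
        int result = 0;
        int shift = 0;
        for (byte b : bytes) {
            result |= (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                break; // high bit clear: this was the last byte
            }
            shift += 7;
        }
        return result;
    }

    public static void main(String[] args) {
        for (int n : new int[] {14, 127, 128, 300, Integer.MAX_VALUE}) {
            System.out.println(n + " takes " + encode(n).length + " byte(s)");
        }
    }
}
```

With this scheme the length 14 from the example stays a single byte (0E), while 300 becomes the two bytes AC 02.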

Incremental update

Incremental updates are simple.
Just append the new record to the end. Why does that work?
For example:
to change <x, 1> (01 78 01 01) to <x, 2> (01 78 01 02),
we only need to append the new record at the end.
[Figure: appending the updated record]
Then why does appending count as modifying?
Think about it: where does the data we read end up?
In a HashMap, big brother. And what is the characteristic of a HashMap? A duplicate key overwrites the earlier value, so the latest record wins!
I have to say the author is really smart.
Then some students ask
the first question: what if the file gets too large?
Well, besides incremental updates, MMKV also has full rewrites,
and it uses them to solve this problem in two cases:
1. Too many duplicate keys have piled up

  • Deduplication: use the Map to drop the duplicates and do a full rewrite.

2. There really is more data to store

  • Expansion: first grow the file, then write the full data into the enlarged file.
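The two write paths described above (append for normal updates, a full rewrite with deduplication or expansion when space runs out) can be sketched in a few dozen lines. Everything here is an assumption made for illustration: AppendOnlyStore is a hypothetical class, a heap ByteBuffer stands in for the mmap-ed file, lengths are single bytes, and the capacity simply doubles. MMKV's actual thresholds and growth policy are not shown.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical append-only store; a ByteBuffer stands in for the mmap-ed file. */
final class AppendOnlyStore {
    private ByteBuffer file;
    private final Map<String, String> map = new HashMap<>(); // last write wins
    private int fullRewrites = 0;

    AppendOnlyStore(int capacity) {
        if (capacity <= 0) throw new IllegalArgumentException("capacity must be positive");
        file = ByteBuffer.allocate(capacity);
    }

    /** Incremental update: append the record; do a full rewrite when space runs out. */
    void put(String key, String value) {
        byte[] record = encode(key, value);
        map.put(key, value);
        if (record.length <= file.remaining()) {
            file.put(record);
        } else {
            fullRewrite();
        }
    }

    /** Full rewrite: write each key once (deduplication), growing the buffer if needed. */
    private void fullRewrite() {
        int needed = 0;
        for (Map.Entry<String, String> e : map.entrySet()) {
            needed += encode(e.getKey(), e.getValue()).length;
        }
        int capacity = file.capacity();
        while (capacity < needed) {
            capacity *= 2; // expansion
        }
        ByteBuffer fresh = ByteBuffer.allocate(capacity);
        for (Map.Entry<String, String> e : map.entrySet()) {
            fresh.put(encode(e.getKey(), e.getValue()));
        }
        file = fresh;
        fullRewrites++;
    }

    /** Replays the raw bytes into a new map, as a fresh process would on startup. */
    Map<String, String> reload() {
        Map<String, String> result = new HashMap<>();
        ByteBuffer in = file.duplicate();
        in.flip(); // read from 0 up to the current write position
        while (in.hasRemaining()) {
            String key = readString(in);
            String value = readString(in);
            result.put(key, value); // a later record overwrites an earlier one
        }
        return result;
    }

    int usedBytes() { return file.position(); }
    int capacity() { return file.capacity(); }
    int fullRewrites() { return fullRewrites; }

    private static byte[] encode(String key, String value) {
        byte[] k = key.getBytes(StandardCharsets.UTF_8);
        byte[] v = value.getBytes(StandardCharsets.UTF_8);
        if (k.length > 127 || v.length > 127) {
            throw new IllegalArgumentException("sketch supports single-byte lengths only");
        }
        ByteBuffer out = ByteBuffer.allocate(2 + k.length + v.length);
        out.put((byte) k.length).put(k).put((byte) v.length).put(v);
        return out.array();
    }

    private static String readString(ByteBuffer in) {
        byte[] bytes = new byte[in.get() & 0xFF];
        in.get(bytes);
        return new String(bytes, StandardCharsets.UTF_8);
    }
}
```

With a 16-byte store, writing the key x five times triggers one deduplicating rewrite that leaves a single 4-byte record, and a later 22-byte record forces the capacity to double to 32.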

MMKV cross process

MMKV is a key-value store that works across processes. The principle is that every process maps the same file with a shared (MAP_SHARED) mapping, and access is coordinated with a file lock. Based on the design notes published by the MMKV authors, the steps are roughly as follows:

  1. Create the shared mapping: when process A initializes an MMKV instance, the file is mmap-ed into the address space of process A.
  2. Write data: when process A stores a value, the data is serialized and appended to the mapped memory. MMKV keeps key-value pairs in an in-memory hash map plus the append-only file described above; it does not use a B+Tree.
  3. Let other processes detect the update: process A does not push a notification. It updates shared bookkeeping information (such as the write position and a sequence number), and other processes compare it with their cached copy to find out whether the content has changed.
  4. Map the file in other processes: when process B opens the same MMKV instance, the same file is mapped into the address space of process B.
  5. Read data: process B reads directly from the mapped memory. If it detects that another process has changed the content, it reloads the data into its own in-memory map and deserializes it into a usable form.
  6. Locking on update: in a multi-process environment, several processes may try to modify the same MMKV instance at the same time. To keep the data consistent, MMKV serializes access with a file lock, so that only one process writes at a time.
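As a rough illustration of combining a shared file mapping with a lock, here is a sketch that uses Java's FileChannel.lock() around an update of a mapped counter. It is an analogy in plain Java, not MMKV's native implementation: LockedCounter, increment and demo are hypothetical names, and Java file locks only coordinate between processes (they are held on behalf of the whole JVM, not individual threads).

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical demo: update a value in a shared mapping under an exclusive file lock. */
final class LockedCounter {
    /** Increments a 4-byte counter at the start of the file and returns the new value. */
    static int increment(Path file) {
        try (FileChannel channel = FileChannel.open(file,
                     StandardOpenOption.CREATE, StandardOpenOption.READ, StandardOpenOption.WRITE);
             FileLock lock = channel.lock()) { // blocks while another process holds the lock
            MappedByteBuffer shared =
                    channel.map(FileChannel.MapMode.READ_WRITE, 0, Integer.BYTES);
            int next = shared.getInt(0) + 1;   // read the current value from the mapping
            shared.putInt(0, next);            // write it back; other mappings of the file see it
            shared.force();
            return next;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Runs two increments against a fresh temp file and returns the final value. */
    static int demo() {
        try {
            Path file = Files.createTempFile("locked-counter", ".bin");
            file.toFile().deleteOnExit();
            increment(file);
            return increment(file);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("counter = " + demo());
    }
}
```

Running increment from two separate processes against the same path would make the second one wait at channel.lock() until the first has released the lock.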

Disadvantages of MMKV

Nothing is perfect!
1. Writes are synchronous, and for large strings MMKV is not as good as SP and DataStore. Why not?
Its writes really are fast, but note that they are synchronous! What do we mean by an app's jank time?
It is the time the main thread is stalled. So however short a write is, it still takes up main-thread time, whereas SP and DataStore write on worker threads.
Don't be fooled by the raw timings!

2. Another shortcoming: the data still has to reach the disk, brothers. Any idea what could go wrong?
That's right: a power failure or an unexpected shutdown. What happens to data that has not been written out yet?
Nothing can be done; it is lost. (A backup or cache can be added at the upper layer.)
SP and DataStore, by contrast, have a backup mechanism.

Summary

To sum up, when should MMKV be used?
Let's look at which third-party open-source frameworks use it.
Common ones: Xlog, xCrash, and Flutter MMKV Logger.
As you can see, it is mostly log libraries. This follows from the advantages described above, which make frequent reads and writes cheap.

So we can use MMKV for this kind of frequent read/write workload, for example logs or location data that must be uploaded to the server at intervals.
Also, because MMKV supports cross-process sharing, it is worth considering whenever configuration data must be persisted and shared across processes.

Origin blog.csdn.net/weixin_45112340/article/details/131798459