Analysis of the principle behind Tencent's data persistence solution MMKV

When it comes to data persistence on Android, there are many options. SharedPreferences (SP for short) is the one most commonly used in projects. Although SP is simple to use, it has two notable flaws:

  • Writes are slow; frequent writes on the main thread can cause jank or even ANR;
  • Cross-process access is not supported.

To work around these shortcomings, other technical solutions are often used. For cross-process access, for example, data can be stored in SQLite and exposed through a ContentProvider. But that approach is still slow to respond and can easily trigger ANR, and even if the data is accessed on a worker thread there are still synchronization problems. Not until MMKV appeared did both problems seem to be solved at once.

So, to start the article, let's use a small demo to compare the storage efficiency of SharedPreferences and MMKV and see the concrete numbers.

object LocalStorageUtil {

    private const val TAG = "LocalStorageUtil"

    fun testSP(context: Context) {

        val sp = context.getSharedPreferences("spfile", Context.MODE_PRIVATE)
        // record the start time
        val currentTime = System.currentTimeMillis()
        for (index in 0..1000) {
            sp.edit().putInt("$index", index).apply()
        }
        Log.d(TAG, "testSP: cost ${System.currentTimeMillis() - currentTime}")
    }

    fun testMMKV() {
        val mmkv = MMKV.defaultMMKV()
        // record the start time
        val currentTime = System.currentTimeMillis()
        for (index in 0..1000) {
            // MMKV writes synchronously; no apply()/commit() is needed
            mmkv.putInt("$index", index)
        }
        Log.d(TAG, "testMMKV: cost ${System.currentTimeMillis() - currentTime}")
    }
}

Take a look at the time taken:

D/LocalStorageUtil: testSP: cost 182
D/LocalStorageUtil: testMMKV: cost 15

We can see that MMKV stores data roughly 10 times faster than SP, and that is with only 1,000 consecutive writes; as the amount of data grows, MMKV's advantage becomes even more obvious. So let's first analyze the SharedPreferences source code, which will help in understanding the MMKV source code.

1 SharedPreferences source code analysis

/**
 * Retrieve and hold the contents of the preferences file 'name', returning
 * a SharedPreferences through which you can retrieve and modify its
 * values.  Only one instance of the SharedPreferences object is returned
 * to any callers for the same name, meaning they will see each other's
 * edits as soon as they are made.
 *
 * <p>This method is thread-safe.
 *
 * <p>If the preferences directory does not already exist, it will be created when this method
 * is called.
 *
 * <p>If a preferences file by this name does not exist, it will be created when you retrieve an
 * editor ({@link SharedPreferences#edit()}) and then commit changes ({@link
 * SharedPreferences.Editor#commit()} or {@link SharedPreferences.Editor#apply()}).
 *
 * @param name Desired preferences file.
 * @param mode Operating mode.
 *
 * @return The single {@link SharedPreferences} instance that can be used
 *         to retrieve and modify the preference values.
 *
 * @see #MODE_PRIVATE
 */
public abstract SharedPreferences getSharedPreferences(String name, @PreferencesMode int mode);

Before using SP, we first obtain a SharedPreferences instance by calling the getSharedPreferences method. The return value is declared as the SharedPreferences interface; the concrete implementation class is SharedPreferencesImpl.

1.1 SharedPreferencesImpl class analysis

When the SharedPreferences instance is first obtained through Context, a file name is passed in.

ContextImpl # getSharedPreferences

@Override
public SharedPreferences getSharedPreferences(String name, int mode) {
    // At least one application in the world actually passes in a null
    // name.  This happened to work because when we generated the file name
    // we would stringify it to "null.xml".  Nice.
    if (mPackageInfo.getApplicationInfo().targetSdkVersion <
            Build.VERSION_CODES.KITKAT) {
        if (name == null) {
            name = "null";
        }
    }

    File file;
    synchronized (ContextImpl.class) {
        if (mSharedPrefsPaths == null) {
            mSharedPrefsPaths = new ArrayMap<>();
        }
        file = mSharedPrefsPaths.get(name);
        if (file == null) {
            file = getSharedPreferencesPath(name);
            mSharedPrefsPaths.put(name, file);
        }
    }
    return getSharedPreferences(file, mode);
}

Given the file name, the method first checks whether a File for this name already exists in mSharedPrefsPaths. mSharedPrefsPaths is a Map that maps the file name to the concrete File. If there is no entry yet, getSharedPreferencesPath is called to build the File object (this constructs the path; the file itself is created later, on the first commit) and the result is stored in the mSharedPrefsPaths collection.

@Override
public File getSharedPreferencesPath(String name) {
    return makeFilename(getPreferencesDir(), name + ".xml");
}

Finally, another getSharedPreferences overload is called; it takes the File pointing to the .xml file and builds the SharedPreferencesImpl instance, caching it per file.

public SharedPreferences getSharedPreferences(File file, int mode) {
    SharedPreferencesImpl sp;
    synchronized (ContextImpl.class) {
        final ArrayMap<File, SharedPreferencesImpl> cache = getSharedPreferencesCacheLocked();
        sp = cache.get(file);
        if (sp == null) {
            checkMode(mode);
            if (getApplicationInfo().targetSdkVersion >= android.os.Build.VERSION_CODES.O) {
                if (isCredentialProtectedStorage()
                        && !getSystemService(UserManager.class)
                                .isUserUnlockingOrUnlocked(UserHandle.myUserId())) {
                    throw new IllegalStateException("SharedPreferences in credential encrypted "
                            + "storage are not available until after user is unlocked");
                }
            }
            sp = new SharedPreferencesImpl(file, mode);
            cache.put(file, sp);
            return sp;
        }
    }
    if ((mode & Context.MODE_MULTI_PROCESS) != 0 ||
        getApplicationInfo().targetSdkVersion < android.os.Build.VERSION_CODES.HONEYCOMB) {
        // If somebody else (some other process) changed the prefs
        // file behind our back, we reload it.  This has been the
        // historical (if undocumented) behavior.
        sp.startReloadIfChangedUnexpectedly();
    }
    return sp;
}

Constructor of SharedPreferencesImpl

SharedPreferencesImpl(File file, int mode) {
    mFile = file;
    mBackupFile = makeBackupFile(file);
    mMode = mode;
    mLoaded = false;
    mMap = null;
    mThrowable = null;
    startLoadFromDisk();
}

As the constructor shows, every time a SharedPreferencesImpl is created, startLoadFromDisk is called to read the file from disk. Let's look at its implementation.

private void startLoadFromDisk() {
    synchronized (mLock) {
        mLoaded = false;
    }
    new Thread("SharedPreferencesImpl-load") {
        public void run() {
            loadFromDisk();
        }
    }.start();
}

From the source code we can see that a thread named SharedPreferencesImpl-load is started, via new Thread, to load the file from disk. If SharedPreferencesImpl objects were created repeatedly, a new thread would be created each time, wasting system resources.

SharedPreferencesImpl # loadFromDisk

private void loadFromDisk() {
    // ......
    
    // Debugging
    if (mFile.exists() && !mFile.canRead()) {
        Log.w(TAG, "Attempt to read preferences file " + mFile + " without permission");
    }

    Map<String, Object> map = null;
    StructStat stat = null;
    Throwable thrown = null;
    try {
        stat = Os.stat(mFile.getPath());
        if (mFile.canRead()) {
            BufferedInputStream str = null;
            try {
                str = new BufferedInputStream(
                        new FileInputStream(mFile), 16 * 1024);
                map = (Map<String, Object>) XmlUtils.readMapXml(str);
            } catch (Exception e) {
                Log.w(TAG, "Cannot read " + mFile.getAbsolutePath(), e);
            } finally {
                IoUtils.closeQuietly(str);
            }
        }
    } catch (ErrnoException e) {
        // An errno exception means the stat failed. Treat as empty/non-existing by
        // ignoring.
    } catch (Throwable t) {
        thrown = t;
    }
   
    synchronized (mLock) {
        mLoaded = true;
        
    // ...... 

}

In this method, the data is read from the file through a BufferedInputStream (traditional IO) and parsed into a Map. Looking at the on-disk format confirms that it is simply a key-value structure.

<int name="801" value="801" />
<int name="802" value="802" />
<int name="803" value="803" />
<int name="804" value="804" />
<int name="805" value="805" />
<int name="806" value="806" />
<int name="807" value="807" />
<int name="808" value="808" />
<int name="809" value="809" />
<int name="1000" value="1000" />

That completes the initialization. There is a synchronization issue to pay attention to here: loading the disk data is asynchronous, so there is a flag, mLoaded, which is set to false in startLoadFromDisk and only set to true once the disk data has finished loading; reads must wait on it.

So there are a few time-consuming points to note here:

  • When loading data from disk, the entire data set is read in; for example, if 10,000 entries were stored before, all of them are read out, so the IO read is time-consuming;
  • After the data is read, parsing the XML DOM nodes also takes time.
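The mLoaded handshake described above can be sketched in plain Kotlin (the class and names here are illustrative, not the framework's): a reader blocks on a monitor until the background load flips the flag and calls notifyAll, which mirrors the awaitLoadedLocked pattern inside SharedPreferencesImpl.

```kotlin
import kotlin.concurrent.thread

// Illustrative sketch of the mLoaded handshake used by SharedPreferencesImpl.
class DiskBackedMap {
    private val lock = Object()
    private var loaded = false
    private var map: Map<String, Int> = emptyMap()

    fun startLoadFromDisk() {
        synchronized(lock) { loaded = false }
        thread(name = "sketch-load") {
            val data = mapOf("a" to 1) // stands in for XmlUtils.readMapXml
            synchronized(lock) {
                map = data
                loaded = true
                lock.notifyAll() // wake any getInt() waiting below
            }
        }
    }

    fun getInt(key: String, def: Int): Int = synchronized(lock) {
        while (!loaded) lock.wait() // same idea as awaitLoadedLocked()
        map[key] ?: def
    }
}
```

Any get issued right after construction blocks until the load thread finishes, which is exactly why the first read after getSharedPreferences can stall the calling thread.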

1.2 SharedPreferences read and write analysis

We have covered the initialization process; next come the read and write operations. First, the write path:

sp.edit().putInt("$index", index).apply()

As the demo at the beginning of the article shows, an Editor object is first obtained through SharedPreferences. The Editor is actually obtained from SharedPreferencesImpl, and the corresponding implementation class is EditorImpl.

SharedPreferencesImpl # EditorImpl

public final class EditorImpl implements Editor {
    private final Object mEditorLock = new Object();

    @GuardedBy("mEditorLock")
    private final Map<String, Object> mModified = new HashMap<>();

    @GuardedBy("mEditorLock")
    private boolean mClear = false;

    // ......
    
    @Override
    public Editor putInt(String key, int value) {
        synchronized (mEditorLock) {
            mModified.put(key, value);
            return this;
        }
    }
    // ......
}

When putInt is called, the value is stored in the mModified HashMap; it is then written to the file by calling apply or commit, which differ as follows.

EditorImpl # apply

@Override
public void apply() {
    final long startTime = System.currentTimeMillis();

    final MemoryCommitResult mcr = commitToMemory();
    final Runnable awaitCommit = new Runnable() {
            @Override
            public void run() {
                try {
                    mcr.writtenToDiskLatch.await();
                } catch (InterruptedException ignored) {
                }

                if (DEBUG && mcr.wasWritten) {
                    Log.d(TAG, mFile.getName() + ":" + mcr.memoryStateGeneration
                            + " applied after " + (System.currentTimeMillis() - startTime)
                            + " ms");
                }
            }
        };

    QueuedWork.addFinisher(awaitCommit);

    Runnable postWriteRunnable = new Runnable() {
            @Override
            public void run() {
                awaitCommit.run();
                QueuedWork.removeFinisher(awaitCommit);
            }
        };

    SharedPreferencesImpl.this.enqueueDiskWrite(mcr, postWriteRunnable);

    // Okay to notify the listeners before it's hit disk
    // because the listeners should always get the same
    // SharedPreferences instance back, which has the
    // changes reflected in memory.
    notifyListeners(mcr);
}

From the source code we can see that apply writes to disk asynchronously: enqueueDiskWrite is called with a Runnable, so the main thread is not blocked, but there is also no result indicating whether the write succeeded.

EditorImpl # commit

public boolean commit() {
    long startTime = 0;

    if (DEBUG) {
        startTime = System.currentTimeMillis();
    }

    MemoryCommitResult mcr = commitToMemory();

    SharedPreferencesImpl.this.enqueueDiskWrite(
        mcr, null /* sync write on this thread okay */);
    try {
        mcr.writtenToDiskLatch.await();
    } catch (InterruptedException e) {
        return false;
    } finally {
        if (DEBUG) {
            Log.d(TAG, mFile.getName() + ":" + mcr.memoryStateGeneration
                    + " committed after " + (System.currentTimeMillis() - startTime)
                    + " ms");
        }
    }
    notifyListeners(mcr);
    return mcr.writeToDiskResult;
}

The commit method writes the data to disk synchronously: the calling thread blocks on writtenToDiskLatch until the write completes, and a success/failure result is returned. With that, the appropriate scenario for each of the two methods should be clear.
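The contrast between the two can be sketched with java.util.concurrent primitives (a simplification, not the framework code): commit blocks on a latch that the writer thread counts down, which is the role mcr.writtenToDiskLatch plays above, while apply merely enqueues the write and returns.

```kotlin
import java.util.concurrent.CountDownLatch
import java.util.concurrent.Executors

// Simplified sketch of apply() vs commit(), with a latch standing in for writtenToDiskLatch.
class SketchEditor {
    private val writer = Executors.newSingleThreadExecutor()
    var lastWrite: String? = null
        private set

    // apply(): enqueue the disk write and return immediately, with no result.
    fun apply(value: String) {
        writer.execute { lastWrite = value }
    }

    // commit(): enqueue the write, then block until it lands; report success.
    fun commit(value: String): Boolean {
        val writtenToDiskLatch = CountDownLatch(1)
        var ok = false
        writer.execute {
            lastWrite = value
            ok = true
            writtenToDiskLatch.countDown()
        }
        writtenToDiskLatch.await() // this is why commit can block the caller
        return ok
    }

    fun shutdown() = writer.shutdown()
}
```

This is why commit on the main thread is risky for large writes, while apply defers the cost to a worker (though the framework may still wait for pending applies at certain lifecycle points).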

Because SharedPreferences still performs its reads and writes through traditional IO, this is a time-consuming point: traditional IO involves communication between the application layer and the kernel.

The application layer only issues the read or write request; the real work happens in kernel space, and traditional IO copies the data twice (between user space and kernel space), which is relatively expensive. Replacing it with a zero-copy technique is therefore an excellent optimization strategy, and that is exactly what MMKV does. Readers familiar with Binder communication and mmap will recognize the idea; for everyone else, this article walks through the principle.

2 mmap principle and use

As mentioned above, to optimize traditional IO storage we want to avoid the user-space/kernel-space context switching involved in file reads and writes. mmap provides zero-copy access to files, which is certainly faster than traditional disk IO. So first let's look at how to use the mmap function; this involves some C++ and JNI knowledge.

2.1 Use of mmap

First define a method, writeBymmap, which writes the file in the native layer by calling the mmap function.

class NativeLib {

    /**
     * A native method that is implemented by the 'nativelib' native library,
     * which is packaged with this application.
     */
    external fun stringFromJNI(): String
    
    external fun writeBymmap(fileName: String)

    companion object {
        // Used to load the 'nativelib' library on application startup.
        init {
            System.loadLibrary("nativelib")
        }
    }
}

To use the mmap function, we need to understand the meaning of each parameter.

void* mmap(void* __addr, size_t __size, int __prot, int __flags, int __fd, off_t __offset);

  • __addr: the desired start address of the mapping. It is usually set to null, letting the system choose; on success, mmap returns the mapped address;
  • __size: the length of the file region to map into memory;
  • __prot: memory protection flags, generally one or more of the following four -> PROT_EXEC: the mapped area can be executed; PROT_READ: the mapped area can be read; PROT_WRITE: the mapped area can be written; PROT_NONE: the mapped area cannot be accessed;
  • __flags: whether the mapping can be shared with other processes. MAP_PRIVATE means only the current process sees its changes; MAP_SHARED means other processes can also access this mapped memory;
  • __fd: the descriptor of the file to map, obtained through the open function; close should be called once storage is complete;
  • __offset: the offset into the file, usually set to 0.
extern "C"
JNIEXPORT void JNICALL
Java_com_lay_nativelib_NativeLib_writeBymmap(JNIEnv *env, jobject thiz, jstring file_name) {

    std::string file = env->GetStringUTFChars(file_name, nullptr);
    // obtain the file descriptor
    int fd = open(file.c_str(), O_RDWR | O_CREAT, S_IRWXU);
    // set the file size
    ftruncate(fd, 4 * 1024);
    // call mmap; it returns the virtual memory address mapped to physical memory
    int8_t *ptr = static_cast<int8_t *>(mmap(0, 4 * 1024, PROT_READ | PROT_WRITE, MAP_SHARED, fd,
                                             0));

    // the content to write into the file
    std::string data("Here is the content to be written to the file");
    // user space can operate on this virtual memory address directly
    memcpy(ptr, data.data(), data.size());
    // unmap and close the descriptor when done
    munmap(ptr, 4 * 1024);
    close(fd);
}

Calling the mmap function returns the virtual address of the physical memory that is mapped to the disk file, as shown in the figure below:

In kernel space there is a region of physical memory mapped to the disk file, and user space obtains a virtual address for that physical memory by calling mmap. Subsequent writes then only need to operate on that virtual memory from user space for the data to reach the disk, without any context switching between user space and kernel space, which improves efficiency.
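For JVM code, the same technique is available without JNI through java.nio: FileChannel.map is backed by mmap and returns a MappedByteBuffer, so writes go through memory into the page cache without the extra user-space copy. A minimal sketch mirroring the C++ example (the function names here are ours, purely for illustration):

```kotlin
import java.io.RandomAccessFile
import java.nio.channels.FileChannel

// JVM analogue of the C++ mmap example: map 4 KB of a file and write through memory.
fun writeByMappedBuffer(path: String, data: ByteArray) {
    RandomAccessFile(path, "rw").use { raf ->
        raf.setLength(4 * 1024L) // like ftruncate(fd, 4 * 1024)
        val buf = raf.channel.map(FileChannel.MapMode.READ_WRITE, 0, 4 * 1024L)
        buf.put(data) // like memcpy(ptr, data.data(), data.size())
    }
}

fun readByMappedBuffer(path: String, size: Int): ByteArray {
    RandomAccessFile(path, "r").use { raf ->
        val buf = raf.channel.map(FileChannel.MapMode.READ_ONLY, 0, size.toLong())
        val out = ByteArray(size)
        buf.get(out) // copy from the mapped region into a local buffer
        return out
    }
}
```

This is only an analogue for illustration; MMKV itself calls mmap from native code as shown above.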

To test it, NativeLib's writeBymmap method is called in a loop and the data is written to the file.

fun testMmap(fileName: String) {

    // record the start time
    val currentTime = System.currentTimeMillis()
    for (index in 0..1000) {
        NativeLib().writeBymmap(fileName)
    }
    Log.d(TAG, "testMmap: cost ${System.currentTimeMillis() - currentTime}")
}

Timing it the same way, the final result is:

D/LocalStorageUtil: testSP: cost 166
D/LocalStorageUtil: testMmap: cost 16

We can see the efficiency is basically on par with MMKV. However, the mmap-based file writing we implemented above has a flaw: even if we only want to write 1 byte of data, a whole 4 KB ends up being mapped and written, which wastes memory.
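That 4 KB floor comes from page granularity: mmap maps whole pages, so even a 1-byte payload occupies a full page (4096 bytes is the typical page size, though not guaranteed on every device). A quick illustrative calculation:

```kotlin
// Round a requested size up to whole 4 KB pages, as a page-granular mapping effectively does.
const val PAGE_SIZE = 4096L

fun pageAligned(size: Long): Long =
    ((size + PAGE_SIZE - 1) / PAGE_SIZE) * PAGE_SIZE

fun wastedBytes(payload: Long): Long = pageAligned(payload) - payload
```

MMKV accepts the same trade-off, growing its file in page-sized steps and trading a little space for speed.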

2.2 Reading and writing data across processes

The SharedPreferences storage method cannot support cross-process reading and writing; it only works within a single process. With mmap, achieving cross-process data access is actually very simple, see the figure below:

Because the file lives on the device's storage, other processes could also read it directly from disk, but that would not avoid the switching between kernel mode and user mode. Instead, as the figure shows, after process A writes the data through its mapping, process B can map the same file and copy the data into a local buffer through its own virtual memory address, completing the cross-process read.

extern "C"
JNIEXPORT jstring JNICALL
Java_com_lay_nativelib_NativeLib_getDataFromDisk(JNIEnv *env, jobject thiz, jstring file_name) {
    std::string file = env->GetStringUTFChars(file_name, nullptr);
    // obtain the file descriptor
    int fd = open(file.c_str(), O_RDWR | O_CREAT, S_IRWXU);
    // set the file size
    ftruncate(fd, 4 * 1024);
    // call mmap; it returns the virtual memory address mapped to physical memory
    int8_t *ptr = static_cast<int8_t *>(mmap(0, 4 * 1024, PROT_READ | PROT_WRITE, MAP_SHARED, fd,
                                             0));
    // a buffer is needed to hold the data (one extra zeroed byte for the terminating '\0')
    char *buffer = static_cast<char *>(calloc(1, 101));
    // copy from the mapped physical memory into the buffer
    memcpy(buffer, ptr, 100);
    // cancel the mapping
    munmap(ptr, 4 * 1024);
    close(fd);
    // char* to jstring (NewStringUTF requires a NUL-terminated string)
    jstring result = env->NewStringUTF(buffer);
    free(buffer);
    return result;
}

The specific call is:

NativeLib().getDataFromDisk("/data/data/com.tal.pad.appmarket/files/NewTextFile.txt").also {
    Log.d("MainActivity", "getDataFromDisk: $it")
}

D/MainActivity: getDataFromDisk: Here is the content to be written to the file

To summarize: after obtaining the virtual address of the physical-memory mapping through mmap, only one copy (the memcpy) is needed to read or write the file, and cross-process access is supported. This is the core principle of MMKV.

The picture above, copied from the official website, compares the write efficiency of SharedPreferences and MMKV. The reason MMKV can improve write efficiency by dozens of times is mmap's memory mapping, which avoids switching between kernel mode and user mode and thus breaks through the traditional IO bottleneck (the double copy). Starting from the next article, we will hand-write a minimal MMKV-style framework to gain a deeper understanding of MMKV and mmap.


Origin blog.csdn.net/weixin_61845324/article/details/132986684