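[Visual Leak Detector] Core Source Code Analysis (VLD 2.5.1)

Notes

These are study notes compiled while using the VLD memory leak detection tool during development. This article analyzes how the VLD 2.5.1 source code detects memory leaks. For the other articles in this series, see the "Memory Leak Detection Tools" index.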


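1. Getting the source code

Version 1.0 and earlier use the old detection approach: they register a custom AllocHook function through _CrtSetAllocHook to monitor the program's memory allocation events (see my other post, "Core Source Code Analysis (VLD 1.0)"). The drawback is that, being limited by _CrtSetAllocHook, this approach can only detect leaks produced by new or malloc. Starting with version 1.9, VLD switched to a new approach: it modifies the Import Address Table (IAT) to replace the original memory functions with VLD's own functions, which lets it detect more kinds of leaks. The library and source of vld 1.9d can be downloaded from CodeProject-Visual-Leak-Detector and Baidu Netdisk-vld-1.9d-setup. Use this version's installer with caution: it wipes the existing Path system variable and leaves only VLD's entry. The source of vld 1.9h is available at Github-dmoulding-vld. The source of vld 2.5.1 is available at Github-KindDragon-vld, and this is currently the latest version (for other download options, see "VLD 2.5.1 Source Code Download").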


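This article focuses on the vld 2.5.1 source code. The following resources may help in understanding how its detection works:

2. Source file overview

The following 26 files are the core files of the VLD source code: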

```
vld-master\src
   callstack.cpp
   callstack.h
   criticalsection.h
   crtmfcpatch.h
   dbghelp.h
   dllspatches.cpp
   loaderlock.h
   map.h
   ntapi.cpp
   ntapi.h
   resource.h
   set.h
   stdafx.cpp
   stdafx.h
   tree.h
   utility.cpp
   utility.h
   vld.cpp
   vld.h
   vldallocator.h
   vldapi.cpp
   vldheap.cpp
   vldheap.h
   vldint.h
   vld_def.h
   vld_hooks.cpp
```

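Of these, 17 are .h files and 9 are .cpp files. Their purposes are summarized below.

The following 5 files define the data structures used inside VLD: set is similar to STL set, map is similar to STL map, tree is a red-black tree, and callstack is similar to STL vector.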
```
callstack.cpp
callstack.h
map.h
set.h
tree.h
```
The following 3 files define the memory management functions that VLD uses internally.
```
vldallocator.h
vldheap.cpp
vldheap.h
```
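The following 3 files define VLD's patched memory management functions, which are used outside VLD. Tracing further shows that the functions defined in vld_hooks.cpp are also called inside VLD.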

```
crtmfcpatch.h
dllspatches.cpp
vld_hooks.cpp
```

The following 11 files define general-purpose functions, variables, macros, and so on.

```
criticalsection.h
dbghelp.h
loaderlock.h
ntapi.cpp
ntapi.h
resource.h
stdafx.cpp
stdafx.h
utility.cpp
utility.h
vldapi.cpp
```

The following 2 files define the methods of the VisualLeakDetector class. Most of the internal implementations behind the external API live here.
```
vld.cpp
vldint.h
```
The following 2 files are VLD's public include files. They declare VLD's API and define some configuration macros.
```
vld.h
vld_def.h
```

3. Source code analysis

vld 2.5.1 defines a custom entry point function for vld.dll. The core code is shown below (see vld.cpp, lines 76~307).

#define _DECL_DLLMAIN  // for _CRT_INIT
#include <process.h>   // for _CRT_INIT
#pragma comment(linker, "/entry:DllEntryPoint")

__declspec(noinline)
BOOL WINAPI DllEntryPoint(HINSTANCE hinstDLL, DWORD fdwReason, LPVOID lpReserved)
{
    // Patch/Restore ntdll address that calls the dll entry point
    if (fdwReason == DLL_PROCESS_ATTACH) {
        NtDllPatch((PBYTE)_ReturnAddress(), patch);
    }

    if (fdwReason == DLL_PROCESS_ATTACH || fdwReason == DLL_THREAD_ATTACH)
        if (!_CRT_INIT(hinstDLL, fdwReason, lpReserved))
            return(FALSE);

    if (fdwReason == DLL_PROCESS_DETACH || fdwReason == DLL_THREAD_DETACH)
        if (!_CRT_INIT(hinstDLL, fdwReason, lpReserved))
            return(FALSE);

    if (fdwReason == DLL_PROCESS_DETACH) {
        NtDllRestore(patch);
    }
    return(TRUE);
}

It also defines a very important global variable (see vld.cpp, lines 60~61):

// The one and only VisualLeakDetector object instance.
__declspec(dllexport) VisualLeakDetector g_vld;

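The entry point function shows that:

(1) When vld.dll is loaded, two things happen: NtDllPatch() runs first, then the VisualLeakDetector constructor runs inside _CRT_INIT(), because g_vld is a global object that is constructed during CRT initialization.

(2) When vld.dll is unloaded, two things happen as well: the VisualLeakDetector destructor runs first (inside _CRT_INIT()), then NtDllRestore() runs.

3.1 Patching LdrpCallInitRoutine with an inline hook

This is the first thing done when vld.dll is loaded, and it happens in NtDllPatch(). Every DLL load or unload goes through the default LdrpCallInitRoutine() function. To run leak detection on newly loaded DLLs (when the ForceIncludeModules configuration list contains that DLL), VLD uses inline hooking in NtDllPatch() to patch the default LdrpCallInitRoutine(). The core code is shown below (see vld.cpp, lines 171~258).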

BOOL NtDllPatch(const PBYTE pReturnAddress, NTDLL_LDR_PATCH &NtDllPatch)
{
    if (NtDllPatch.bState == FALSE) {
        ...
        BYTE ptr[] = { 0xFF, 0x75, 0x08 };                                   // push [ebp][08h]
        BYTE mov[] = { 0x90, 0xB8, '?', '?', '?', '?' };                     // mov eax, 0x00000000
        BYTE call[] = { 0xFF, 0xD0 };                                        // call eax
        ...
        BYTE jmp[] = { 0xE9, '?', '?', '?', '?' };                           // jmp 0x00000000
        ...
        if (...) {
            ...
            if (VirtualProtect(NtDllPatch.pDetourAddress, NtDllPatch.nDetourSize, PAGE_EXECUTE_READWRITE, &dwProtect)) {
                memset(NtDllPatch.pDetourAddress, 0x90, NtDllPatch.nDetourSize);
                ...
                
                // Push EntryPoint as last parameter
                memcpy(&NtDllPatch.pDetourAddress[0], &ptr, _countof(ptr));
                
                // Copy original param instructions
                memcpy(&NtDllPatch.pDetourAddress[_countof(ptr)], NtDllPatch.pPatchAddress, nParamSize);
                
                // Move LdrpCallInitRoutine to eax/rax
                *(PSIZE_T)(&mov[2]) = (SIZE_T)LdrpCallInitRoutine;
                memcpy(&NtDllPatch.pDetourAddress[_countof(ptr) + nParamSize], &mov, _countof(mov));

                // Jump to original function
                *(DWORD*)(&jmp[1]) = (DWORD)(pReturnAddress - _countof(call) - (NtDllPatch.pDetourAddress + NtDllPatch.nDetourSize));
                memcpy(&NtDllPatch.pDetourAddress[_countof(ptr) + nParamSize + _countof(mov)], &jmp, _countof(jmp));

                VirtualProtect(NtDllPatch.pDetourAddress, NtDllPatch.nDetourSize, dwProtect, &dwProtect);

                if (VirtualProtect(NtDllPatch.pPatchAddress, NtDllPatch.nPatchSize, PAGE_EXECUTE_READWRITE, &dwProtect)) {
                    memset(NtDllPatch.pPatchAddress, 0x90, NtDllPatch.nPatchSize);

                    // Jump to detour address
                    *(DWORD*)(&jmp[1]) = (DWORD)(NtDllPatch.pDetourAddress - (pReturnAddress - _countof(call)));
                    memcpy(pReturnAddress - _countof(call) - _countof(jmp), &jmp, _countof(jmp));

                    // Call LdrpCallInitRoutine from eax/rax
                    memcpy(pReturnAddress - _countof(call), &call, _countof(call));

                    VirtualProtect(NtDllPatch.pPatchAddress, NtDllPatch.nPatchSize, dwProtect, &dwProtect);

                    NtDllPatch.bState = TRUE;
                }
            }
        }
    }
    return NtDllPatch.bState;
}
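
The jmp byte arrays above are filled in with a 32-bit relative displacement. As a minimal, portable sketch of that arithmetic (not VLD's code; the helper name encodeJmpRel32 and the use of plain 32-bit integers as addresses are assumptions made for illustration), an x86 near jump can be encoded like this:

```cpp
#include <array>
#include <cstdint>

// Encode a 5-byte x86 near jump: opcode 0xE9 followed by a rel32 operand.
// `from` is the address where the jump instruction itself is written and `to`
// is the jump target. The operand is measured from the END of the jump
// instruction, which is why 5 (the instruction length) is added to `from`.
// Hypothetical helper for illustration; addresses are plain 32-bit integers.
std::array<std::uint8_t, 5> encodeJmpRel32(std::uint32_t from, std::uint32_t to)
{
    std::array<std::uint8_t, 5> bytes{};
    bytes[0] = 0xE9;

    // Unsigned arithmetic wraps modulo 2^32, which yields the correct
    // two's-complement displacement for backward jumps as well.
    const std::uint32_t rel = to - (from + 5u);

    // x86 stores the operand in little-endian byte order.
    for (int i = 0; i < 4; ++i) {
        bytes[1 + i] = static_cast<std::uint8_t>((rel >> (8 * i)) & 0xFFu);
    }
    return bytes;
}
```

For example, a jump written at 0x1000 that targets 0x2000 encodes as E9 FB 0F 00 00. Real patching code additionally has to make the target page writable first, which is what the VirtualProtect calls in NtDllPatch() are for.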

The replacement LdrpCallInitRoutine() function used for the patch is shown below (see vld.cpp, lines 89~99).

typedef BOOLEAN(NTAPI *PDLL_INIT_ROUTINE)(IN PVOID DllHandle, IN ULONG Reason, IN PCONTEXT Context OPTIONAL);
BOOLEAN WINAPI LdrpCallInitRoutine(IN PVOID BaseAddress, IN ULONG Reason, IN PVOID Context, IN PDLL_INIT_ROUTINE EntryPoint)
{
    LoaderLock ll;

    if (Reason == DLL_PROCESS_ATTACH) {
        g_vld.RefreshModules();
    }

    return EntryPoint(BaseAddress, Reason, (PCONTEXT)Context);
}

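Once the default LdrpCallInitRoutine() has been patched, every DLL loaded later in the program's run automatically triggers g_vld.RefreshModules(), which refreshes the list of modules covered by leak detection. The external API VLDRefreshModules() is also a thin wrapper around g_vld.RefreshModules() (see vldapi.cpp, lines 95~98). The flow of g_vld.RefreshModules() can be summarized as follows:

(1) Call the dbghelp function EnumerateLoadedModulesW64 to get all modules (DLLs and the EXE) currently loaded in the process. v2.5.1 uses dbghelp.dll version 6.11.1.404.

(2) Iterate over the loaded modules and make sure their symbol information is available, using the dbghelp functions SymGetModuleInfoW64, SymUnloadModule64, and SymLoadModuleExW. At the same time, use IAT hooking to replace the memory functions in these modules so that all memory operations are monitored.

(3) Save the state and information of all currently loaded modules into g_vld's m_loadedModules member, a data structure similar to STL set whose underlying implementation is a red-black tree.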

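3.2 Replacing memory functions with IAT hooks

This is the second thing done when vld.dll is loaded, and it happens in the VisualLeakDetector constructor (see vld.cpp, lines 337~518). The outline of the constructor is shown below.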

// Constructor - Initializes private data, loads configuration options, and
//   attaches Visual Leak Detector to all other modules loaded into the current
//   process.
//
VisualLeakDetector::VisualLeakDetector ()
{
    _set_error_mode(_OUT_TO_STDERR);

    // Initialize configuration options and related private data.
    _wcsnset_s(m_forcedModuleList, MAXMODULELISTLENGTH, '\0', _TRUNCATE);
    m_maxDataDump    = 0xffffffff;
    m_maxTraceFrames = 0xffffffff;
    m_options        = 0x0;
    ...

    // Load configuration options.
    configure();
    if (m_options & VLD_OPT_VLDOFF) {
        Report(L"Visual Leak Detector is turned off.\n");
        return;
    }
    ...

    // Initialize global variables.
    g_currentProcess    = GetCurrentProcess();
    g_currentThread     = GetCurrentThread();
    g_processHeap       = GetProcessHeap();
    ...

    // Initialize remaining private data.
    m_heapMap         = new HeapMap;
    m_heapMap->reserve(HEAP_MAP_RESERVE);
    m_iMalloc         = NULL;
    ...

    // Initialize the symbol handler. We use it for obtaining source file/line
    // number information and function names for the memory leak report.
    LPWSTR symbolpath = buildSymbolSearchPath();
    ...
    if (!g_DbgHelp.SymInitializeW(g_currentProcess, symbolpath, FALSE)) {
        Report(L"WARNING: Visual Leak Detector: The symbol handler failed to initialize (error=%lu).\n"
            L"    File and function names will probably not be available in call stacks.\n", GetLastError());
    }
    delete [] symbolpath;
    ...

    // Attach Visual Leak Detector to every module loaded in the process.
    ...
    g_LoadedModules.EnumerateLoadedModulesW64(g_currentProcess, addLoadedModule, newmodules);
    attachToLoadedModules(newmodules);
    ModuleSet* oldmodules = m_loadedModules;
    m_loadedModules = newmodules;
    delete oldmodules;
    ...

    Report(L"Visual Leak Detector Version " VLDVERSION L" installed.\n");
    if (m_status & VLD_STATUS_FORCE_REPORT_TO_FILE) {
        // The report is being forced to a file. Let the human know why.
        Report(L"NOTE: Visual Leak Detector: Unicode-encoded reporting has been enabled, but the\n"
            L"  debugger is the only selected report destination. The debugger cannot display\n"
            L"  Unicode characters, so the report will also be sent to a file. If no file has\n"
            L"  been specified, the default file name is \"" VLD_DEFAULT_REPORT_FILE_NAME L"\".\n");
    }
    reportConfig();
}

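The key part is the block under the comment "Attach Visual Leak Detector to every module loaded in the process" (vld.cpp, lines 494~502). Its flow is the same as that of g_vld.RefreshModules(). The outline of attachToLoadedModules is shown below (see vld.cpp, lines 769~906):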

VOID VisualLeakDetector::attachToLoadedModules (ModuleSet *newmodules)
{
    ...

    // Iterate through the supplied set, until all modules have been attached.
    for (ModuleSet::Iterator newit = newmodules->begin(); newit != newmodules->end(); ++newit)
    {
        ...
        DWORD64 modulebase = (DWORD64) (*newit).addrLow;
        ...
            
        // increase reference count to module
        HMODULE modulelocal = NULL;
        if (!GetModuleHandleEx(GET_MODULE_HANDLE_EX_FLAG_FROM_ADDRESS, (LPCTSTR) modulebase, &modulelocal))
            continue;
        ...

        // Attach to the module.
        PatchModule(modulelocal, m_patchTable, _countof(m_patchTable));
        ...
    }
}

m_patchTable holds the table of memory functions to be IAT-hooked (see dllspatches.cpp). An overview follows.

struct moduleentry_t
{
    LPCSTR          exportModuleName; // The name of the module exporting the patched API.
    BOOL            reportLeaks;      // Patch module to report leaks from it
    UINT_PTR        moduleBase;       // The base address of the exporting module (filled in at runtime when the modules are loaded).
    patchentry_t*   patchTable;
};

moduleentry_t VisualLeakDetector::m_patchTable [] = {
    // Win32 heap APIs.
    "kernel32.dll", FALSE,  0x0, m_kernelbasePatch, // we patch this record on Win7 and higher
    "kernel32.dll", FALSE,  0x0, m_kernel32Patch,

    // MFC new operators (exported by ordinal).
    "mfc42.dll",    TRUE,   0x0, mfc42Patch,
    "mfc42d.dll",   TRUE,   0x0, mfc42dPatch,
    "mfc42u.dll",   TRUE,   0x0, mfc42uPatch,
    "mfc42ud.dll",  TRUE,   0x0, mfc42udPatch,
    ...
    "mfc140.dll",   TRUE,   0x0, mfc140Patch,
    "mfc140d.dll",  TRUE,   0x0, mfc140dPatch,
    "mfc140u.dll",  TRUE,   0x0, mfc140uPatch,
    "mfc140ud.dll", TRUE,   0x0, mfc140udPatch,

    // CRT new operators and heap APIs.
    "msvcrt.dll",   FALSE,  0x0, msvcrtPatch,
    "msvcrtd.dll",  FALSE,  0x0, msvcrtdPatch,
    "msvcr70.dll",  FALSE,  0x0, msvcr70Patch,
    "msvcr70d.dll", FALSE,  0x0, msvcr70dPatch,
    ...
    "msvcr120.dll", FALSE,  0x0, msvcr120Patch,
    "msvcr120d.dll",FALSE,  0x0, msvcr120dPatch,
    "ucrtbase.dll", FALSE,  0x0, ucrtbasePatch,
    "ucrtbased.dll",FALSE,  0x0, ucrtbasedPatch,

    // NT APIs.
    "ntdll.dll",    FALSE,  0x0, m_ntdllPatch,

    // COM heap APIs.
    "ole32.dll",    FALSE,  0x0, m_ole32Patch
};

// This structure allows us to build a table of APIs which should be patched
// through to replacement functions provided by VLD.
struct patchentry_t
{
    LPCSTR  importName;       // The name (or ordinal) of the imported API being patched.
    LPVOID* original;         // Pointer to the original function.
    LPCVOID replacement;      // Pointer to the function to which the imported API should be patched through to.
};

static patchentry_t ucrtbasedPatch[] = {
    "_calloc_dbg",        &UCRTd::data.pcrtd__calloc_dbg,      UCRTd::crtd__calloc_dbg,
    "_malloc_dbg",        &UCRTd::data.pcrtd__malloc_dbg,      UCRTd::crtd__malloc_dbg,
    "_realloc_dbg",       &UCRTd::data.pcrtd__realloc_dbg,     UCRTd::crtd__realloc_dbg,
    ...
};

Tracing further, the outline of PatchModule is shown below (see utility.cpp, lines 628~672). For each entry in m_patchTable, this function calls PatchImport:

BOOL PatchModule (HMODULE importmodule, moduleentry_t patchtable [], UINT tablesize)
{
    moduleentry_t *entry;
    UINT          index;
    BOOL          patched = FALSE;
    ...

    // Loop through the import patch table, individually patching each import
    // listed in the table.
    ...
    for (index = 0; index < tablesize; index++) {
        entry = &patchtable[index];
        if (PatchImport(importmodule, entry)) {
            patched = TRUE;
        }
    }

    return patched;
}

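Tracing further leads to PatchImport (see utility.cpp, lines 459~626), the core function of the IAT hook. This is where the IAT (Import Address Table) is modified so that the original memory functions are replaced by VLD's own. The core code for modifying the IAT is shown below: thunk->u1.Function is the IAT slot that holds the original function's address, replacement is the address of VLD's function, and VirtualProtect changes the protection attributes of the corresponding memory region.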

DWORD protect;
if (VirtualProtect(&thunk->u1.Function, sizeof(thunk->u1.Function), PAGE_EXECUTE_READWRITE, &protect)) {
    thunk->u1.Function = (DWORD_PTR)replacement;
    if (VirtualProtect(&thunk->u1.Function, sizeof(thunk->u1.Function), protect, &protect)) {
        ...
    }
}
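
The idea behind this slot swap can be illustrated without touching a real PE image. The following is a minimal, portable sketch in which an ordinary function pointer stands in for one IAT entry; all names here (g_importSlot, patchSlot, restoreSlot, hookedAlloc) are hypothetical and are not taken from VLD:

```cpp
#include <cstddef>
#include <cstdlib>

using AllocFn = void* (*)(std::size_t);

// The "real" allocator that the importing module originally resolves to.
void* realAlloc(std::size_t size) { return std::malloc(size); }

// Stand-in for a single IAT slot: callers always go through this pointer,
// just as calls to an imported function go through the import address table.
AllocFn g_importSlot = realAlloc;

AllocFn     g_originalAlloc = nullptr; // saved so the hook can forward calls
std::size_t g_hookedCalls   = 0;       // how many calls the hook observed

// Replacement function: record the call, then forward to the original.
void* hookedAlloc(std::size_t size)
{
    ++g_hookedCalls;
    return g_originalAlloc(size);
}

// Swap the slot to `replacement`, remembering the previous target.
// Returns false when the slot is already patched.
bool patchSlot(AllocFn& slot, AllocFn replacement, AllocFn& original)
{
    if (slot == replacement) {
        return false;
    }
    original = slot;
    slot = replacement;
    return true;
}

// Undo the patch by writing the saved pointer back into the slot.
void restoreSlot(AllocFn& slot, AllocFn original) { slot = original; }
```

In a real process the IAT pages are normally read-only, so the write has to be bracketed by VirtualProtect calls as in the VLD code above; the sketch skips that step because its slot is an ordinary writable variable.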

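Besides IAT-hooking the currently loaded modules, the VisualLeakDetector constructor also does the following:

Initializes a set of global NT API function pointers, global variables, and private members.
Initializes VLD's configuration by calling configure() and reportConfig().
Initializes the symbol search path by calling buildSymbolSearchPath().

3.3 Capturing the call stack on every allocation

Taking the CRT's new operator as an example, VLD replaces it with the following function (see crtmfcpatch.h, lines 887~906):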

// crtd_scalar_new - Calls to the CRT's scalar new operator from msvcrXXd.dll
//   are patched through to this function.
//
//  - size (IN): The size, in bytes, of the memory block to be allocated.
//
//  Return Value:
//
//    Returns the value returned by the CRT scalar new operator.
//
template<int CRTVersion, bool debug>
void* CrtPatch<CRTVersion, debug>::crtd_scalar_new (size_t size)
{
    PRINT_HOOKED_FUNCTION();
    new_t pcrtxxd_scalar_new = (new_t)data.pcrtd_scalar_new;
    assert(pcrtxxd_scalar_new);

    CAPTURE_CONTEXT();
    CaptureContext cc((void*)pcrtxxd_scalar_new, context_, debug, (CRTVersion >= 140));
    return pcrtxxd_scalar_new(size);
}

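The CAPTURE_CONTEXT() macro is defined as follows. It captures the register context of the current allocation (frame pointer, stack pointer, instruction pointer) together with the return address, in preparation for obtaining the call stack later (see utility.h, lines 74~97).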

// Capture current context
#if defined(_M_IX86)
#define CAPTURE_CONTEXT()                                                       \
    context_t context_;                                                         \
    {CONTEXT _ctx;                                                              \
    RtlCaptureContext(&_ctx);                                                   \
    context_.Ebp = _ctx.Ebp; context_.Esp = _ctx.Esp; context_.Eip = _ctx.Eip;  \
    context_.fp = (UINT_PTR)_ReturnAddress();}
#define GET_RETURN_ADDRESS(context)  (context.fp)
#elif defined(_M_X64)
#define CAPTURE_CONTEXT()                                                       \
    context_t context_;                                                         \
    {CONTEXT _ctx;                                                              \
    RtlCaptureContext(&_ctx);                                                   \
    context_.Rbp = _ctx.Rbp; context_.Rsp = _ctx.Rsp; context_.Rip = _ctx.Rip;  \
    context_.fp = (UINT_PTR)_ReturnAddress();}
#define GET_RETURN_ADDRESS(context)  (context.fp)
#else
// If you want to retarget Visual Leak Detector to another processor
// architecture then you'll need to provide an architecture-specific macro to
// obtain the frame pointer (or other address) which can be used to obtain the
// return address and stack pointer of the calling frame.
#error "Visual Leak Detector is not supported on this architecture."
#endif // _M_IX86 || _M_X64

The constructor and destructor of CaptureContext are shown below (see vld.cpp, lines 2903~2956):

CaptureContext::CaptureContext(void* func, context_t& context, BOOL debug, BOOL ucrt) : m_context(context) {
    context.func = reinterpret_cast<UINT_PTR>(func);
    m_tls = g_vld.getTls();

    if (debug) {
        m_tls->flags |= VLD_TLS_DEBUGCRTALLOC;
    }

    if (ucrt) {
        m_tls->flags |= VLD_TLS_UCRT;
    }

    m_bFirst = (GET_RETURN_ADDRESS(m_tls->context) == NULL);
    if (m_bFirst) {
        // This is the first call to enter VLD for the current allocation.
        // Record the current frame pointer.
        m_tls->context = m_context;
    }
}

CaptureContext::~CaptureContext() {
    if (!m_bFirst)
        return;

    if ((m_tls->blockWithoutGuard) && (!IsExcludedModule())) {
        blockinfo_t* pblockInfo = NULL;
        if (m_tls->newBlockWithoutGuard == NULL) {
            g_vld.mapBlock(m_tls->heap,
                m_tls->blockWithoutGuard,
                m_tls->size,
                (m_tls->flags & VLD_TLS_DEBUGCRTALLOC) != 0,
                (m_tls->flags & VLD_TLS_UCRT) != 0,
                m_tls->threadId,
                pblockInfo);
        }
        else {
            g_vld.remapBlock(m_tls->heap,
                m_tls->blockWithoutGuard,
                m_tls->newBlockWithoutGuard,
                m_tls->size,
                (m_tls->flags & VLD_TLS_DEBUGCRTALLOC) != 0,
                (m_tls->flags & VLD_TLS_UCRT) != 0,
                m_tls->threadId,
                pblockInfo, m_tls->context);
        }

        CallStack* callstack = CallStack::Create();
        callstack->getStackTrace(g_vld.m_maxTraceFrames, m_tls->context);
        pblockInfo->callStack.reset(callstack);
    }

    // Reset thread local flags and variables for the next allocation.
    Reset();
}
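
The m_bFirst logic above means that only the outermost hooked call records an allocation, even when one patched function calls another. A minimal, portable sketch of that RAII pattern follows; it is a simplified stand-in rather than VLD's code, and ScopedCapture, TlsState, g_recorded, and the two allocation functions are hypothetical names:

```cpp
#include <vector>

// Per-thread state, loosely modeled on the role of VLD's TLS structure.
struct TlsState {
    const void* callSite = nullptr; // null means "no allocation in progress"
};
thread_local TlsState t_state;

// Call sites recorded by outermost guards (stands in for mapBlock()).
std::vector<const void*> g_recorded;

class ScopedCapture {
public:
    explicit ScopedCapture(const void* callSite)
        : m_first(t_state.callSite == nullptr)
    {
        // Only the first guard on this thread stores its call site.
        if (m_first) {
            t_state.callSite = callSite;
        }
    }

    ~ScopedCapture()
    {
        // Nested guards do nothing; the outermost one records and resets.
        if (!m_first) {
            return;
        }
        g_recorded.push_back(t_state.callSite);
        t_state = TlsState{};
    }

    ScopedCapture(const ScopedCapture&) = delete;
    ScopedCapture& operator=(const ScopedCapture&) = delete;

private:
    bool m_first;
};

// Two fake "hooked" functions; their call sites are identified by tag addresses.
const char kOuterSite = 'o';
const char kInnerSite = 'i';

void innerAlloc() { ScopedCapture guard(&kInnerSite); }

void outerAlloc()
{
    ScopedCapture guard(&kOuterSite);
    innerAlloc(); // nested hook: must not produce a second record
}
```

Calling outerAlloc() records a single entry attributed to the outer call site, while calling innerAlloc() on its own records the inner one.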

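In the CaptureContext destructor, g_vld.mapBlock() or g_vld.remapBlock() stores the information about this allocation in m_heapMap, a data structure similar to STL map whose underlying implementation is a red-black tree (see vldint.h). It records the thread ID, allocation serial number, allocation size, owning heap, and so on.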

// Data is collected for every block allocated from any heap in the process.
// The data is stored in this structure and these structures are stored in
// a BlockMap which maps each of these structures to its corresponding memory
// block.
struct blockinfo_t {
    std::unique_ptr<CallStack> callStack;
    DWORD      threadId;
    SIZE_T     serialNumber;
    SIZE_T     size;
    bool       reported;
    bool       debugCrtAlloc;
    bool       ucrt;
};

// BlockMaps map memory blocks (via their addresses) to blockinfo_t structures.
typedef Map<LPCVOID, blockinfo_t*> BlockMap;

// Information about each heap in the process is kept in this map. Primarily
// this is used for mapping heaps to all of the blocks allocated from those
// heaps.
struct heapinfo_t {
    BlockMap blockMap;   // Map of all blocks allocated from this heap.
    UINT32   flags;      // Heap status flags
};

// HeapMaps map heaps (via their handles) to BlockMaps.
typedef Map<HANDLE, heapinfo_t*> HeapMap;

class VisualLeakDetector : public IMalloc
{
    ...
private:
    ...
    HeapMap             *m_heapMap; // Map of all active heaps in the process.
    ...
};
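
To make the two-level layout concrete, here is a minimal sketch that uses std::map in place of VLD's own Map template. The Tracker class and its member names are assumptions made for illustration, and the record is reduced to three fields:

```cpp
#include <cstddef>
#include <cstdint>
#include <map>

// Reduced version of the per-block record.
struct BlockInfo {
    std::uint32_t threadId;
    std::size_t   serialNumber;
    std::size_t   size;
};

using BlockMap = std::map<const void*, BlockInfo>; // block address -> info
using HeapMap  = std::map<const void*, BlockMap>;  // heap handle   -> blocks

// Hypothetical bookkeeping class: blocks are added on allocation and removed
// on free, so whatever is still present later is a leak candidate.
class Tracker {
public:
    void mapBlock(const void* heap, const void* block,
                  std::size_t size, std::uint32_t threadId)
    {
        m_heaps[heap][block] = BlockInfo{threadId, ++m_serial, size};
    }

    void unmapBlock(const void* heap, const void* block)
    {
        auto heapIt = m_heaps.find(heap);
        if (heapIt != m_heaps.end()) {
            heapIt->second.erase(block);
        }
    }

    // Number of blocks still tracked across all heaps.
    std::size_t liveBlocks() const
    {
        std::size_t total = 0;
        for (const auto& heapEntry : m_heaps) {
            total += heapEntry.second.size();
        }
        return total;
    }

private:
    HeapMap     m_heaps;
    std::size_t m_serial = 0;
};
```

Keying the outer map by heap keeps per-heap work cheap: destroying a heap only needs to drop a single outer entry.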

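In addition, the CaptureContext destructor calls getStackTrace() to obtain the call stack (a series of instruction addresses). Depending on the user's configuration, there are two ways to get the stack: fast mode and safe mode (see the StackWalkMethod option). The source (callstack.cpp, lines 605~771) shows that fast mode uses RtlCaptureStackBackTrace to walk back up the stack, which is fast but may miss frames, while safe mode uses StackWalk64 to trace the stack, which is slower but more thorough.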

VOID FastCallStack::getStackTrace (UINT32 maxdepth, const context_t& context)
{
    ...
    maxframes = RtlCaptureStackBackTrace(0, maxframes, reinterpret_cast<PVOID*>(myFrames), &BackTraceHash);
    ...
}

VOID SafeCallStack::getStackTrace (UINT32 maxdepth, const context_t& context)
{
    ...
    // Walk the stack.
    while (count < maxdepth) {
        count++;
        ...
        if (!g_DbgHelp.StackWalk64(architecture, g_currentProcess, g_currentThread, &frame, &currentContext, NULL,
            SymFunctionTableAccess64, SymGetModuleBase64, NULL, locker)) {
                // Couldn't trace back through any more frames.
                break;
        }
        if (frame.AddrFrame.Offset == 0) {
            // End of stack.
            break;
        }

        // Push this frame's program counter onto the CallStack.
        push_back((UINT_PTR)frame.AddrPC.Offset);
    }
}

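3.4 Generating the leak report

Unlike the old v1.0, the new version can output a leak report immediately at runtime by calling the external API VLDReportLeaks() or VLDReportThreadLeaks(), without waiting for the program to exit. These are thin wrappers around g_vld.ReportLeaks() and g_vld.ReportThreadLeaks() respectively (see vldapi.cpp, lines 65~73). The corresponding functions are shown below (see vld.cpp, lines 2394~2434).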

SIZE_T VisualLeakDetector::ReportLeaks( )
{
    if (m_options & VLD_OPT_VLDOFF) {
        // VLD has been turned off.
        return 0;
    }

    // Generate a memory leak report for each heap in the process.
    SIZE_T leaksCount = 0;
    CriticalSectionLocker<> cs(g_heapMapLock);
    bool firstLeak = true;
    Set<blockinfo_t*> aggregatedLeaks;
    for (HeapMap::Iterator heapit = m_heapMap->begin(); heapit != m_heapMap->end(); ++heapit) {
        HANDLE heap = (*heapit).first;
        UNREFERENCED_PARAMETER(heap);
        heapinfo_t* heapinfo = (*heapit).second;
        leaksCount += reportLeaks(heapinfo, firstLeak, aggregatedLeaks);
    }
    return leaksCount;
}

SIZE_T VisualLeakDetector::ReportThreadLeaks( DWORD threadId )
{
    if (m_options & VLD_OPT_VLDOFF) {
        // VLD has been turned off.
        return 0;
    }

    // Generate a memory leak report for each heap in the process.
    SIZE_T leaksCount = 0;
    CriticalSectionLocker<> cs(g_heapMapLock);
    bool firstLeak = true;
    Set<blockinfo_t*> aggregatedLeaks;
    for (HeapMap::Iterator heapit = m_heapMap->begin(); heapit != m_heapMap->end(); ++heapit) {
        HANDLE heap = (*heapit).first;
        UNREFERENCED_PARAMETER(heap);
        heapinfo_t* heapinfo = (*heapit).second;
        leaksCount += reportLeaks(heapinfo, firstLeak, aggregatedLeaks, threadId);
    }
    return leaksCount;
}

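The source above shows that the report is produced by iterating over m_heapMap heap by heap. The two functions differ only in the fourth argument passed to reportLeaks(): ReportLeaks() uses the default value threadId = (DWORD)-1, while ReportThreadLeaks() passes the target thread's threadId. Tracing further leads to reportLeaks(), whose core code is shown below (see vld.cpp, lines 1824~1932).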

SIZE_T VisualLeakDetector::reportLeaks (heapinfo_t* heapinfo, bool &firstLeak, Set<blockinfo_t*> &aggregatedLeaks, DWORD threadId)
{
    BlockMap* blockmap   = &heapinfo->blockMap;
    SIZE_T leaksFound = 0;

    for (BlockMap::Iterator blockit = blockmap->begin(); blockit != blockmap->end(); ++blockit)
    {
        // Found a block which is still in the BlockMap. We've identified a
        // potential memory leak.
        LPCVOID block = (*blockit).first;
        blockinfo_t* info = (*blockit).second;
        if (info->reported)
            continue;

        if (threadId != ((DWORD)-1) && info->threadId != threadId)
            continue;

        ...

        // It looks like a real memory leak.
        if (firstLeak) { // A confusing way to only display this message once
            Report(L"WARNING: Visual Leak Detector detected memory leaks!\n");
            firstLeak = false;
        }
        SIZE_T blockLeaksCount = 1;
        Report(L"---------- Block %Iu at " ADDRESSFORMAT L": %Iu bytes ----------\n", info->serialNumber, address, size);
		
        ...

        DWORD callstackCRC = 0;
        if (info->callStack)
            callstackCRC = CalculateCRC32(info->size, info->callStack->getHashValue());
        Report(L"  Leak Hash: 0x%08X, Count: %Iu, Total %Iu bytes\n", callstackCRC, blockLeaksCount, size * blockLeaksCount);
        leaksFound += blockLeaksCount;

        // Dump the call stack.
        if (blockLeaksCount == 1)
            Report(L"  Call Stack (TID %u):\n", info->threadId);
        else
            Report(L"  Call Stack:\n");
        if (info->callStack)
            info->callStack->dump(m_options & VLD_OPT_TRACE_INTERNAL_FRAMES);

        // Dump the data in the user data section of the memory block.
        if (m_maxDataDump != 0) {
            Report(L"  Data:\n");
            if (m_options & VLD_OPT_UNICODE_REPORT) {
                DumpMemoryW(address, (m_maxDataDump < size) ? m_maxDataDump : size);
            }
            else {
                DumpMemoryA(address, (m_maxDataDump < size) ? m_maxDataDump : size);
            }
        }
        Report(L"\n\n");
    }

    return leaksFound;
}

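Inside reportLeaks(), the BlockMap of each heap is iterated in turn (it is also a data structure similar to STL map). It stores the information for every memory block allocated on that heap: the block address is the key (first) and the corresponding allocation info structure is the value (second).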
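
The two skip conditions at the top of the loop (already reported, or owned by a different thread) can be sketched in isolation. This is a simplified illustration with std::map and hypothetical names (Block, Blocks, kAnyThread, countLeaks); it only counts candidates and prints nothing:

```cpp
#include <cstddef>
#include <cstdint>
#include <map>

// Sentinel meaning "do not filter by thread", mirroring (DWORD)-1.
constexpr std::uint32_t kAnyThread = static_cast<std::uint32_t>(-1);

struct Block {
    std::uint32_t threadId;
    std::size_t   size;
    bool          reported;
};

using Blocks = std::map<const void*, Block>; // block address -> info

// Count the blocks that would be reported. With the default argument every
// thread is included; with a concrete id only that thread's blocks count.
std::size_t countLeaks(const Blocks& blocks, std::uint32_t threadId = kAnyThread)
{
    std::size_t found = 0;
    for (const auto& entry : blocks) {
        const Block& info = entry.second;
        if (info.reported) {
            continue; // already reported earlier
        }
        if (threadId != kAnyThread && info.threadId != threadId) {
            continue; // belongs to another thread
        }
        ++found;
    }
    return found;
}
```

Using a sentinel as the default argument lets one function serve both the whole-process report and the per-thread report.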

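(1) Computing the Leak Hash: the call and function definition below (see utility.cpp, lines 1085~1145) show that this value is determined by the size of the leaked block and its call stack. Further tracing suggests that it may also depend on how the stack was obtained (fast or safe), because the two methods yield different startValue values (that is, different initial values for the CRC computation).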

DWORD CalculateCRC32(UINT_PTR p, UINT startValue)
{
    register DWORD hash = startValue;
    hash = (hash >> 8) ^ crctab[(hash & 0xff) ^ ((p >>  0) & 0xff)];
    hash = (hash >> 8) ^ crctab[(hash & 0xff) ^ ((p >>  8) & 0xff)];
    hash = (hash >> 8) ^ crctab[(hash & 0xff) ^ ((p >> 16) & 0xff)];
    hash = (hash >> 8) ^ crctab[(hash & 0xff) ^ ((p >> 24) & 0xff)];
#ifdef WIN64
    hash = (hash >> 8) ^ crctab[(hash & 0xff) ^ ((p >> 32) & 0xff)];
    hash = (hash >> 8) ^ crctab[(hash & 0xff) ^ ((p >> 40) & 0xff)];
    hash = (hash >> 8) ^ crctab[(hash & 0xff) ^ ((p >> 48) & 0xff)];
    hash = (hash >> 8) ^ crctab[(hash & 0xff) ^ ((p >> 56) & 0xff)];
#endif
    return hash;
}

callstackCRC = CalculateCRC32(info->size, info->callStack->getHashValue());
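
The function above folds the bytes of one integer into a running hash, one table lookup per byte. A self-contained sketch of the same byte-wise update follows. It builds its own lookup table from the common reflected CRC-32 polynomial 0xEDB88320; whether VLD's crctab holds exactly these values is an assumption, and makeCrcTable and foldValue are hypothetical names:

```cpp
#include <array>
#include <cstdint>

// Build the 256-entry table for the reflected CRC-32 polynomial 0xEDB88320.
// Assumption: this is the widely used table; VLD's crctab is not shown here.
std::array<std::uint32_t, 256> makeCrcTable()
{
    std::array<std::uint32_t, 256> table{};
    for (std::uint32_t i = 0; i < 256; ++i) {
        std::uint32_t c = i;
        for (int bit = 0; bit < 8; ++bit) {
            c = (c & 1u) ? (0xEDB88320u ^ (c >> 1)) : (c >> 1);
        }
        table[i] = c;
    }
    return table;
}

// Fold the 8 bytes of `value` (least significant byte first) into `start`,
// one table lookup per byte, the same shape as the 64-bit path above.
std::uint32_t foldValue(std::uint64_t value, std::uint32_t start,
                        const std::array<std::uint32_t, 256>& table)
{
    std::uint32_t hash = start;
    for (int byte = 0; byte < 8; ++byte) {
        const std::uint32_t input =
            static_cast<std::uint32_t>((value >> (8 * byte)) & 0xFFu);
        hash = (hash >> 8) ^ table[(hash & 0xFFu) ^ input];
    }
    return hash;
}
```

Because the start value is part of the input, the same block size folded into two different call stack hashes gives two different results, which is consistent with the hash identifying a leak by size and call stack together.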

(2) Symbolizing the Call Stack: the line below calls dump().

info->callStack->dump(m_options & VLD_OPT_TRACE_INTERNAL_FRAMES);

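Inside dump(), resolve() is called to resolve the call stack, converting the series of instruction addresses into file names, function names, line numbers, and so on (see callstack.cpp, lines 345~468). Its core code is shown below.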

int CallStack::resolve(BOOL showInternalFrames)
{
    ...

    // Iterate through each frame in the call stack.
    for (UINT32 frame = 0; frame < m_size; frame++)
    {
        // Try to get the source file and line number associated with
        // this program counter address.
        SIZE_T programCounter = (*this)[frame];
        if (GetCallingModule(programCounter) == g_vld.m_vldBase)
            continue;

        DWORD64 displacement64;
        BYTE symbolBuffer[sizeof(SYMBOL_INFO) + MAX_SYMBOL_NAME_SIZE];
        LPCWSTR functionName = getFunctionName(programCounter, displacement64, (SYMBOL_INFO*)&symbolBuffer, locker);

        ...

        BOOL foundline = g_DbgHelp.SymGetLineFromAddrW64(g_currentProcess, programCounter, &displacement, &sourceInfo, locker);

        ...

        if (!foundline)
            displacement = (DWORD)displacement64;
        NumChars = resolveFunction( programCounter, foundline ? &sourceInfo : NULL,
            displacement, functionName, stack_line, _countof( stack_line ));

        ...
    } // end for loop

    m_status |= CALLSTACK_STATUS_NOTSTARTUPCRT;
    return unresolvedFunctionsCount;
}

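SymGetLineFromAddrW64 is used to get the source file name and line number, and getFunctionName() calls SymFromAddrW to get the function name; both match what v1.0 does. resolveFunction() uses GetModuleFileName to get the module name and formats the stack information string.

(3) Formatting the Data section: DumpMemoryW() or DumpMemoryA() converts the data in memory into hexadecimal plus ASCII or Unicode characters (see utility.cpp, lines 48~190). The core code in DumpMemoryA() related to character conversion is shown below: each byte is read with a cast, ((PBYTE)address)[byteIndex], and the return value of isgraph() decides whether the character can be displayed.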

VOID DumpMemoryA (LPCVOID address, SIZE_T size)
{
    // Each line of output is 16 bytes.
    SIZE_T dumpLen;
    if ((size % 16) == 0) {
        // No padding needed.
        dumpLen = size;
    }
    else {
        // We'll need to pad the last line out to 16 bytes.
        dumpLen = size + (16 - (size % 16));
    }

    ...
    WCHAR  ascDump [18] = {0};
    ...
    for (SIZE_T byteIndex = 0; byteIndex < dumpLen; byteIndex++) {
        SIZE_T wordIndex = byteIndex % 16;
        ...
        SIZE_T ascIndex = wordIndex + wordIndex / 8;  
        if (byteIndex < size) {
            BYTE byte = ((PBYTE)address)[byteIndex];
            ...
            if (isgraph(byte)) {
                ascDump[ascIndex] = (WCHAR)byte;
            }
            else {
                ascDump[ascIndex] = L'.';
            }
        }
        ...
    }
}

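The core code in DumpMemoryW() related to character conversion is shown below. WORD is an alias for unsigned short: two adjacent bytes in memory are first cast to a single WORD, which is then assigned directly to one element of the WCHAR array.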

VOID DumpMemoryW (LPCVOID address, SIZE_T size)
{
    // Each line of output is 16 bytes.
    SIZE_T dumpLen;
    if ((size % 16) == 0) {
        // No padding needed.
        dumpLen = size;
    }
    else {
        // We'll need to pad the last line out to 16 bytes.
        dumpLen = size + (16 - (size % 16));
    }

    ...
    WCHAR  unidump [18] = {0};
    ...
    for (SIZE_T byteIndex = 0; byteIndex < dumpLen; byteIndex++) {
        ...
        SIZE_T uniIndex = ((byteIndex / 2) % 8) + ((byteIndex / 2) % 8) / 8; 
        if (byteIndex < size) {
            ...
            if (((byteIndex % 2) == 0) && ((byteIndex + 1) < dumpLen)) {
                // On every even byte, print one character.
                WORD   word = ((PWORD)address)[byteIndex / 2];
                if ((word == 0x0000) || (word == 0x0020)) {
                    unidump[uniIndex] = L'.';
                }
                else {
                    unidump[uniIndex] = word;
                }
            }
        }
        ...
    }
}
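
Both dump functions share two small ideas: rounding the dump length up to a multiple of 16, and replacing non-displayable data with a dot. A minimal portable sketch of the ASCII side follows; paddedLength and printableColumn are hypothetical helpers and are not VLD functions:

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Round `size` up to the next multiple of 16, the per-line width of the dump.
std::size_t paddedLength(std::size_t size)
{
    return (size % 16 == 0) ? size : size + (16 - (size % 16));
}

// Build the character column for `size` bytes starting at `address`:
// bytes with a visible glyph are kept, everything else becomes '.'.
// std::isgraph is false for spaces and control characters in the "C" locale.
std::string printableColumn(const void* address, std::size_t size)
{
    const unsigned char* bytes = static_cast<const unsigned char*>(address);
    std::string column;
    column.reserve(size);
    for (std::size_t i = 0; i < size; ++i) {
        column += std::isgraph(bytes[i]) ? static_cast<char>(bytes[i]) : '.';
    }
    return column;
}
```

Reading through unsigned char matters here: passing a negative char value to std::isgraph is undefined behavior, which the BYTE cast in the original code also avoids.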

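(4) Outputting the leak report: after Report() (see utility.cpp, lines 747~774) finishes formatting the string, it calls Print() to output the report (see utility.cpp, lines 687~745). Print() tries to call the user-defined ReportHook() function; if none is set, CallReportHook() returns 0 by default.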

VOID Print (LPWSTR messagew)
{
    if (NULL == messagew)
        return;

    int hook_retval=0;
    if (!CallReportHook(0, messagew, &hook_retval))
    {
        if (s_reportEncoding == unicode) {
            if (s_reportFile != NULL) {
                // Send the report to the previously specified file.
                fwrite(messagew, sizeof(WCHAR), wcslen(messagew), s_reportFile);
            }

            if ( s_reportToStdOut )
                fputws(messagew, stdout);
        }
        else {
            const size_t MAXMESSAGELENGTH = 5119;
            size_t  count = 0;
            CHAR    messagea [MAXMESSAGELENGTH + 1];
            if (wcstombs_s(&count, messagea, MAXMESSAGELENGTH + 1, messagew, _TRUNCATE) != 0) {
                // Failed to convert the Unicode message to ASCII.
                assert(FALSE);
                return;
            }
            messagea[MAXMESSAGELENGTH] = '\0';

            if (s_reportFile != NULL) {
                // Send the report to the previously specified file.
                fwrite(messagea, sizeof(CHAR), strlen(messagea), s_reportFile);
            }

            if ( s_reportToStdOut )
                fputs(messagea, stdout);
        }

        if (s_reportToDebugger)
            OutputDebugStringW(messagew);
    }
    else if (hook_retval == 1)
        __debugbreak();

    if (s_reportToDebugger && (s_reportDelay)) {
        Sleep(10); // Workaround the Visual Studio 6 bug where debug strings are sometimes lost if they're sent too fast.
    }
}
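
The control flow of Print() boils down to "offer the message to the hook first, fall back to the default sinks only when the hook declines". A minimal sketch of that pattern follows, with std::function standing in for the hook mechanism; Reporter, ReportHook, and defaultSink are hypothetical names, and the breakpoint is modeled as a boolean return value:

```cpp
#include <functional>
#include <string>
#include <vector>

// A hook returns true when it has handled the message itself. It may also
// set `returnValue` to 1 to request a debugger break.
using ReportHook = std::function<bool(const std::string& message, int& returnValue)>;

struct Reporter {
    ReportHook hook;                      // empty when no hook is installed
    std::vector<std::string> defaultSink; // stands in for file/stdout/debugger

    // Returns true when a debugger break would be triggered.
    bool print(const std::string& message)
    {
        int hookReturn = 0;
        // With no hook installed the message counts as "not handled",
        // matching CallReportHook() returning 0 by default.
        const bool handled = hook ? hook(message, hookReturn) : false;
        if (!handled) {
            defaultSink.push_back(message);
            return false;
        }
        return hookReturn == 1;
    }
};
```

A hook that returns false therefore only observes the message, while a hook that returns true suppresses the default output entirely.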

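3.5 Work done at program exit

When vld.dll is unloaded, two things happen: the VisualLeakDetector destructor runs first (inside _CRT_INIT()), then NtDllRestore() runs. First, the VisualLeakDetector destructor (see vld.cpp, lines 610~722), whose outline is shown below.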

VisualLeakDetector::~VisualLeakDetector ()
{
    ...

    if (m_status & VLD_STATUS_INSTALLED) {
        // Detach Visual Leak Detector from all previously attached modules.
        ...
        g_LoadedModules.EnumerateLoadedModulesW64(g_currentProcess, detachFromModule, NULL);
        ...

        BOOL threadsactive = waitForAllVLDThreads();

        if (m_status & VLD_STATUS_NEVER_ENABLED) {
            // Visual Leak Detector started with leak detection disabled and
            // it was never enabled at runtime. A lot of good that does.
            Report(L"WARNING: Visual Leak Detector: Memory leak detection was never enabled.\n");
        }
        else {
            // Generate a memory leak report for each heap in the process.
            SIZE_T leaks_count = ReportLeaks();

            // Show a summary.
            if (leaks_count == 0) {
                Report(L"No memory leaks detected.\n");
            }
            else {
                Report(L"Visual Leak Detector detected %Iu memory leak", leaks_count);
                Report((leaks_count > 1) ? L"s (%Iu bytes).\n" : L" (%Iu bytes).\n", m_curAlloc);
                Report(L"Largest number used: %Iu bytes.\n", m_maxAlloc);
                Report(L"Total allocations: %Iu bytes.\n", m_totalAlloc);
            }
        }

        // Free resources used by the symbol handler.
        DbgTrace(L"dbghelp32.dll %i: SymCleanup\n", GetCurrentThreadId());
        if (!g_DbgHelp.SymCleanup(g_currentProcess)) {
            Report(L"WARNING: Visual Leak Detector: The symbol handler failed to deallocate resources (error=%lu).\n",
                GetLastError());
        }

        ...
        
        if (threadsactive) {
            Report(L"WARNING: Visual Leak Detector: Some threads appear to have not terminated normally.\n"
                L"  This could cause inaccurate leak detection results, including false positives.\n");
        }
        Report(L"Visual Leak Detector is now exiting.\n");

        ...

        checkInternalMemoryLeaks();
    }
    else {
        ...
    }
    ...
}

The destructor does the following:

(1) Restore the IAT so that every replaced function points back to its original. The call chain is EnumerateLoadedModulesW64 -> detachFromModule -> RestoreModule -> RestoreImport. See RestoreImport at lines 776~895 of utility.cpp; the key statement is iate->u1.Function = (DWORD_PTR)original.
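
The patch-and-restore idea can be sketched without any Windows headers. The block below is only a simulation under stated assumptions: ImportEntry, patchImport, restoreImport, realAlloc, and hookedAlloc are hypothetical names, and a plain array stands in for the real import address table. The one line it shares with the source is the restore step, which writes the saved original address back into the slot.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Signature shared by the "imported" function and its replacement.
using AllocFn = void* (*)(std::size_t);

// Hypothetical stand-in for one import address table entry.
struct ImportEntry {
    const char* name;      // imported symbol name
    AllocFn     function;  // address that callers jump through
};

int g_hookCalls = 0;  // counts calls that were routed through the hook

// The "real" function the table points at before patching.
void* realAlloc(std::size_t size) { return std::malloc(size); }

// Replacement that records the call, then forwards to the real function.
void* hookedAlloc(std::size_t size) {
    ++g_hookCalls;
    return realAlloc(size);
}

// Install `replacement` in the named slot and return the previous address
// so it can be restored later. Returns nullptr if the name is not found.
AllocFn patchImport(ImportEntry* table, std::size_t count,
                    const char* name, AllocFn replacement) {
    for (std::size_t i = 0; i < count; ++i) {
        if (std::strcmp(table[i].name, name) == 0) {
            AllocFn original = table[i].function;
            table[i].function = replacement;
            return original;
        }
    }
    return nullptr;
}

// Restore step: write the original address back into the named slot.
bool restoreImport(ImportEntry* table, std::size_t count,
                   const char* name, AllocFn original) {
    for (std::size_t i = 0; i < count; ++i) {
        if (std::strcmp(table[i].name, name) == 0) {
            table[i].function = original;
            return true;
        }
    }
    return false;
}
```

In a real process the table lives in read-only pages of the loaded module, so the write has to be bracketed by page-protection changes; the simulation skips that part.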

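(2) Wait for other threads to exit. This calls waitForAllVLDThreads() (vld.cpp lines 520~565), shown below. Each thread that is still running is waited on for up to 10 seconds, and the function gives up after the ninth timeout, so the process may stall for tens of seconds (at most about 90). This is why the report sometimes appears long after the program has been closed.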

bool VisualLeakDetector::waitForAllVLDThreads()
{
    bool threadsactive = false;
    DWORD dwCurProcessID = GetCurrentProcessId();
    int waitcount = 0;

    // See if any threads that have ever entered VLD's code are still active.
    CriticalSectionLocker<> cs(m_tlsLock);
    for (TlsMap::Iterator tlsit = m_tlsMap->begin(); tlsit != m_tlsMap->end(); ++tlsit) {
        if ((*tlsit).second->threadId == GetCurrentThreadId()) {
            // Don't wait for the current thread to exit.
            continue;
        }

        HANDLE thread = OpenThread(SYNCHRONIZE | THREAD_QUERY_INFORMATION, FALSE, (*tlsit).second->threadId);
        if (thread == NULL) {
            // Couldn't query this thread. We'll assume that it exited.
            continue; // XXX should we check GetLastError()?
        }
        if (GetProcessIdOfThread(thread) != dwCurProcessID) {
            //The thread ID has been recycled.
            CloseHandle(thread);
            continue;
        }
        if (WaitForSingleObject(thread, 10000) == WAIT_TIMEOUT) { // 10 seconds
            // There is still at least one other thread running. The CRT
            // will stomp it dead when it cleans up, which is not a
            // graceful way for a thread to go down. Warn about this,
            // and wait until the thread has exited so that we know it
            // can't still be off running somewhere in VLD's code.
            //
            // Since we've been waiting a while, let the human know we are
            // still here and alive.
            waitcount++;
            threadsactive = true;
            if (waitcount >= 9) // 90 sec.
            {
                CloseHandle(thread);
                return threadsactive;
            }
            Report(L"Visual Leak Detector: Waiting for threads to terminate...\n");
        }
        CloseHandle(thread);
    }
    return threadsactive;
}
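
The bounded-wait logic above can be isolated from the Windows calls. In this sketch WaitFn is a hypothetical stand-in for WaitForSingleObject on one thread handle, and WaitResult and waitForWorkers are illustrative names that do not appear in VLD. It mirrors two properties of the original loop: each worker is waited on once, and the loop stops after a fixed number of timeouts.

```cpp
#include <chrono>
#include <functional>
#include <vector>

// Hypothetical stand-in for waiting on one thread handle: returns true if
// the worker finished within `timeout`, false if the wait timed out.
using WaitFn = std::function<bool(std::chrono::milliseconds)>;

struct WaitResult {
    bool stillActive;  // at least one wait timed out
    int  timeouts;     // how many waits timed out
};

// Wait on each worker once for at most `perWait`; stop early once
// `maxTimeouts` waits have timed out, so the total delay stays bounded
// by roughly perWait * maxTimeouts.
WaitResult waitForWorkers(const std::vector<WaitFn>& workers,
                          std::chrono::milliseconds perWait,
                          int maxTimeouts) {
    WaitResult result{false, 0};
    for (const WaitFn& waitOn : workers) {
        if (waitOn(perWait)) {
            continue;  // this worker finished in time
        }
        result.stillActive = true;
        if (++result.timeouts >= maxTimeouts) {
            break;  // give up rather than stall shutdown any longer
        }
    }
    return result;
}
```

With perWait set to 10 seconds and maxTimeouts set to 9, the worst case matches the roughly 90-second ceiling of the original function.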

(3) Generate the leak report. This calls ReportLeaks(); its implementation is covered earlier in this post.

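(4) Print the summary. leaks_count is the number of leaked blocks found in this run, m_curAlloc is their total size in bytes, m_maxAlloc is the peak value that m_curAlloc reached during the run (that is, max(m_curAlloc)), and m_totalAlloc is the cumulative size of all heap allocations made during the run.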

Report(L"Visual Leak Detector detected %Iu memory leak", leaks_count);
Report((leaks_count > 1) ? L"s (%Iu bytes).\n" : L" (%Iu bytes).\n", m_curAlloc);
Report(L"Largest number used: %Iu bytes.\n", m_maxAlloc);
Report(L"Total allocations: %Iu bytes.\n", m_totalAlloc);
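
How three such counters relate to one another is easy to show in isolation. The sketch below is an assumption about the bookkeeping, not a copy of VLD's code: AllocStats, onAlloc, and onFree are hypothetical names, and only the roles of the three counters come from the description above.

```cpp
#include <cstddef>

// Hypothetical bookkeeping for the three summary counters.
struct AllocStats {
    std::size_t cur = 0;    // bytes currently outstanding (role of m_curAlloc)
    std::size_t max = 0;    // highest value cur has reached (role of m_maxAlloc)
    std::size_t total = 0;  // cumulative bytes ever allocated (role of m_totalAlloc)

    // Record an allocation of `size` bytes.
    void onAlloc(std::size_t size) {
        cur += size;
        total += size;
        if (cur > max) {
            max = cur;
        }
    }

    // Record a free; the caller passes the size recorded at allocation time.
    void onFree(std::size_t size) {
        cur -= size;
    }
};
```

Whatever remains in cur when the process exits is what the summary line reports as leaked bytes, while max and total never decrease.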

(5) Release resources. The memory held by internal member variables is freed, and SymCleanup releases the symbol handler's resources.

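(6) Self-check for leaks. This calls checkInternalMemoryLeaks() (vld.cpp lines 567~608), which walks a doubly linked list maintained by VLD to determine whether VLD itself leaked memory. The list is structured much like the system's own memory-management doubly linked list; see my other post, Core Source Code Analysis (VLD 1.0).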

After the destructor finishes, NtDllRestore() runs (vld.cpp lines 261~279) and undoes the change made to the default LdrpCallInitRoutine().

BOOL NtDllRestore(NTDLL_LDR_PATCH &NtDllPatch)
{
    // Restore patched bytes
    BOOL bResult = FALSE;
    if (NtDllPatch.bState && NtDllPatch.nPatchSize && &NtDllPatch.pBackup[0]) {
        DWORD dwProtect = 0;
        if (VirtualProtect(NtDllPatch.pPatchAddress, NtDllPatch.nPatchSize, PAGE_EXECUTE_READWRITE, &dwProtect)) {
            memcpy(NtDllPatch.pPatchAddress, NtDllPatch.pBackup, NtDllPatch.nPatchSize);
            VirtualProtect(NtDllPatch.pPatchAddress, NtDllPatch.nPatchSize, dwProtect, &dwProtect);

            if (VirtualProtect(NtDllPatch.pDetourAddress, NtDllPatch.nDetourSize, PAGE_EXECUTE_READWRITE, &dwProtect)) {
                memset(NtDllPatch.pDetourAddress, 0x00, NtDllPatch.nDetourSize);
                VirtualProtect(NtDllPatch.pDetourAddress, NtDllPatch.nDetourSize, dwProtect, &dwProtect);
                bResult = TRUE;
            }
        }
    }
    return bResult;
}
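
The backup-then-restore pattern used by NtDllRestore() can be tried on an ordinary writable buffer. This is a simplified sketch under stated assumptions: BytePatch, applyPatch, and restorePatch are hypothetical names, the 16-byte backup size is arbitrary, and the page-protection changes (VirtualProtect in the original) are left out because the buffer here is already writable.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical record of one in-place byte patch.
struct BytePatch {
    unsigned char* address = nullptr;  // where the patch was written
    unsigned char  backup[16] = {};    // original bytes saved before patching
    std::size_t    size = 0;           // number of patched bytes
    bool           applied = false;    // whether the patch is currently in place
};

// Save the original bytes, then overwrite them with `newBytes`.
bool applyPatch(BytePatch& patch, unsigned char* target,
                const unsigned char* newBytes, std::size_t size) {
    if (patch.applied || size == 0 || size > sizeof(patch.backup)) {
        return false;
    }
    std::memcpy(patch.backup, target, size);  // back up before touching anything
    std::memcpy(target, newBytes, size);
    patch.address = target;
    patch.size = size;
    patch.applied = true;
    return true;
}

// Copy the saved bytes back over the patched location.
bool restorePatch(BytePatch& patch) {
    if (!patch.applied || patch.size == 0) {
        return false;
    }
    std::memcpy(patch.address, patch.backup, patch.size);
    patch.applied = false;
    return true;
}
```

The original function also zeroes the detour region after restoring the patched bytes; that step is omitted here.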

4. Other Questions

4.1 How to Tell Where an Allocation Came From

VLD 2.5.1 takes the following approach:

As in Core Source Code Analysis (VLD 1.0), the nBlockUse member of the _CrtMemBlockHeader structure is used to decide whether a block was allocated by the CRT. See resolveStacks() (vld.cpp lines 2861~2862), getLeaksCount() (vld.cpp lines 1739~1740), and reportLeaks() (vld.cpp lines 1854~1855).
Function names in the call stack are used to decide whether a block was allocated by CRT startup code. See isCrtStartupFunction() at lines 513~554 of callstack.cpp.
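
The nBlockUse test boils down to masking out the block type and comparing it with the CRT's own type code. The sketch below writes the constants out by hand so that it compiles without the MSVC header <crtdbg.h>; they are meant to mirror _FREE_BLOCK through _CLIENT_BLOCK and the _BLOCK_TYPE macro, and should be checked against your CRT headers. The names kCrtBlock, blockType, blockSubtype, and isCrtInternalBlock are hypothetical.

```cpp
// Block-use codes, written out by hand to mirror the debug CRT's constants.
constexpr int kFreeBlock   = 0;
constexpr int kNormalBlock = 1;
constexpr int kCrtBlock    = 2;  // blocks allocated by the CRT for its own use
constexpr int kIgnoreBlock = 3;
constexpr int kClientBlock = 4;

// The low 16 bits hold the block type; the high 16 bits hold a subtype.
constexpr int blockType(int blockUse) { return blockUse & 0xFFFF; }
constexpr int blockSubtype(int blockUse) { return (blockUse >> 16) & 0xFFFF; }

// True when the header's nBlockUse field marks a CRT-internal block,
// which a leak report would normally skip.
constexpr bool isCrtInternalBlock(int nBlockUse) {
    return blockType(nBlockUse) == kCrtBlock;
}
```

Masking before comparing matters because a block can carry a subtype in the upper bits and still be of the CRT type.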

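VLD defines its own vldblockheader_t, modeled on _CrtMemBlockHeader, to record each of VLD's internal allocations (vldheap.h lines 88~99). It then overloads its internal new/delete (see vldheap.cpp), derives a custom allocator from std::allocator (see vldallocator.h), and creates a private heap, g_vldHeap, for VLD alone. As a result, every internal allocation lands on g_vldHeap and carries this custom header, and the headers together form a doubly linked list of VLD's internal allocations. A global pointer, g_vldBlockList, points to the head of the list, so walking the list from that pointer yields all of VLD's internal allocation records.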

// Memory block header structure used internally by VLD. All internally
// allocated blocks are allocated from VLD's private heap and have this header
// prepended to them.
struct vldblockheader_t
{
    struct vldblockheader_t *next;          // Pointer to the next block in the list of internally allocated blocks.
    struct vldblockheader_t *prev;          // Pointer to the preceding block in the list of internally allocated blocks.
    const char              *file;          // Name of the file where this block was allocated.
    int                      line;          // Line number within the above file where this block was allocated.
    size_t                   size;          // The size of this memory block, not including this header.
    size_t                   serialNumber;  // Each block is assigned a unique serial number, starting from zero.
};
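
A header-prepending allocator of this kind fits in a few dozen lines. The sketch below is not VLD's implementation: it uses malloc instead of a private heap, drops the file and line fields, and is not thread safe. BlockHeader, trackedAlloc, trackedFree, countOutstanding, and g_blockList are hypothetical names, with g_blockList playing the role of g_vldBlockList.

```cpp
#include <cstddef>
#include <cstdlib>

// Simplified header placed in front of every tracked block. The alignas
// keeps the user data that follows the header suitably aligned.
struct alignas(std::max_align_t) BlockHeader {
    BlockHeader* next;          // next block in the list
    BlockHeader* prev;          // previous block in the list
    std::size_t  size;          // user-visible size, header excluded
    std::size_t  serialNumber;  // unique per allocation, starting at zero
};

BlockHeader* g_blockList = nullptr;  // head of the list of live blocks
std::size_t  g_nextSerial = 0;

// Allocate header + payload, link the header at the head of the list,
// and hand back a pointer to the payload.
void* trackedAlloc(std::size_t size) {
    void* raw = std::malloc(sizeof(BlockHeader) + size);
    if (raw == nullptr) {
        return nullptr;
    }
    BlockHeader* header = static_cast<BlockHeader*>(raw);
    header->size = size;
    header->serialNumber = g_nextSerial++;
    header->prev = nullptr;
    header->next = g_blockList;
    if (g_blockList != nullptr) {
        g_blockList->prev = header;
    }
    g_blockList = header;
    return header + 1;  // payload starts right after the header
}

// Step back from the payload to its header, unlink it, and free it.
void trackedFree(void* block) {
    if (block == nullptr) {
        return;
    }
    BlockHeader* header = static_cast<BlockHeader*>(block) - 1;
    if (header->prev != nullptr) {
        header->prev->next = header->next;
    } else {
        g_blockList = header->next;
    }
    if (header->next != nullptr) {
        header->next->prev = header->prev;
    }
    std::free(header);
}

// Walk the list: anything still linked has not been freed yet.
std::size_t countOutstanding(std::size_t* totalBytes = nullptr) {
    std::size_t blocks = 0;
    std::size_t bytes = 0;
    for (const BlockHeader* h = g_blockList; h != nullptr; h = h->next) {
        ++blocks;
        bytes += h->size;
    }
    if (totalBytes != nullptr) {
        *totalBytes = bytes;
    }
    return blocks;
}
```

A self-check at shutdown then amounts to calling countOutstanding() and reporting every header that is still linked.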

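4.2 How Multithreaded Detection Is Implemented

As in Core Source Code Analysis (VLD 1.0), v2.5.1 also relies on thread local storage (TLS); see Microsoft - Using Thread Local Storage. The global object g_vld has two member variables, m_tlsIndex and m_tlsMap, defined in vldint.h as follows.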

// Thread local storage structure. Every thread in the process gets its own copy
// of this structure. Thread specific information, such as the current leak
// detection status (enabled or disabled) and the address that initiated the
// current allocation is stored here.
struct tls_t {
    context_t   context;           // Address of return address at the first call that entered VLD's code for the current allocation.
    UINT32      flags;             // Thread-local status flags:
#define VLD_TLS_DEBUGCRTALLOC 0x1  //   If set, the current allocation is a CRT allocation.
#define VLD_TLS_DISABLED      0x2  //   If set, memory leak detection is disabled for the current thread.
#define VLD_TLS_ENABLED       0x4  //   If set, memory leak detection is enabled for the current thread.
#define VLD_TLS_UCRT          0x8  //   If set, the current allocation is a UCRT allocation.
    UINT32      oldFlags;          // Thread-local status old flags
    DWORD       threadId;          // Thread ID of the thread that owns this TLS structure.
    HANDLE      heap;
    LPVOID      blockWithoutGuard; // Store pointer to block.
    LPVOID      newBlockWithoutGuard;
    SIZE_T      size;
};

// The TlsSet allows VLD to keep track of all thread local storage structures
// allocated in the process.
typedef Map<DWORD,tls_t*> TlsMap;

class VisualLeakDetector : public IMalloc
{
    ...
private:
    ...
    DWORD  m_tlsIndex; // Thread-local storage index.
    ...
    TlsMap *m_tlsMap;  // Set of all thread-local storage structures for the process.
    ...
};

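m_tlsIndex receives the index returned by TlsAlloc(). Once initialization succeeds (vld.cpp lines 337~518), any thread in the process can use this index to store and read its own thread-local value; threads do not affect one another, and what one thread reads is independent of the others. v2.5.1 uses the slot to hold a pointer to a tls_t structure. Of its members, flags, oldFlags, and threadId relate to per-thread detection control; the rest serve as scratch variables for each memory operation.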

m_tlsIndex        = TlsAlloc();
...
if (m_tlsIndex == TLS_OUT_OF_INDEXES) {
    Report(L"ERROR: Visual Leak Detector could not be installed because thread local"
        L"  storage could not be allocated.");
    return;
}

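TlsMap is a container similar to STL map: the thread ID is the key (first) and the corresponding tls_t* is the value (second). It manages the memory of each thread's tls_t structure. Every allocation goes through enabled() (vld.cpp lines 1210~1239) and getTls() (vld.cpp lines 1287~1325), and both run on the thread that performs the allocation.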

BOOL VisualLeakDetector::enabled ()
{
    if (!(m_status & VLD_STATUS_INSTALLED)) {
        // Memory leak detection is not yet enabled because VLD is still
        // initializing.
        return FALSE;
    }

    tls_t* tls = getTls();
    if (!(tls->flags & VLD_TLS_DISABLED) && !(tls->flags & VLD_TLS_ENABLED)) {
        // The enabled/disabled state for the current thread has not been
        // initialized yet. Use the default state.
        if (m_options & VLD_OPT_START_DISABLED) {
            tls->flags |= VLD_TLS_DISABLED;
        }
        else {
            tls->flags |= VLD_TLS_ENABLED;
        }
    }

    return ((tls->flags & VLD_TLS_ENABLED) != 0);
}

tls_t* VisualLeakDetector::getTls ()
{
    // Get the pointer to this thread's thread local storage structure.
    tls_t* tls = (tls_t*)TlsGetValue(m_tlsIndex);
    assert(GetLastError() == ERROR_SUCCESS);

    if (tls == NULL) {
        DWORD threadId = GetCurrentThreadId();

        CriticalSectionLocker<> cs(m_tlsLock);
        TlsMap::Iterator it = m_tlsMap->find(threadId);
        if (it == m_tlsMap->end()) {
            // This thread's thread local storage structure has not been allocated.
            tls = new tls_t;

            // Add this thread's TLS to the TlsSet.
            m_tlsMap->insert(threadId, tls);
        } else {
            // Already had a thread with this ID
            tls = (*it).second;
        }

        ZeroMemory(&tls->context, sizeof(tls->context));
        tls->flags = 0x0;
        tls->oldFlags = 0x0;
        tls->threadId = threadId;
        tls->blockWithoutGuard = NULL;
        TlsSetValue(m_tlsIndex, tls);
    }

    return tls;
}

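On first entry, the current thread is given a tls_t structure and its members are initialized. If the user set VLD_OPT_START_DISABLED, the thread starts with tls->flags |= VLD_TLS_DISABLED, meaning VLD is off for that thread; otherwise it starts with tls->flags |= VLD_TLS_ENABLED, meaning VLD is on for that thread.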
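
The same lazy per-thread initialization can be sketched in portable C++, assuming a C++11 thread_local pointer in place of the TlsAlloc/TlsGetValue slot. ThreadState, getState, enabled, trackedThreadCount, and the g_ globals are hypothetical names; only the two flag values and the default-state rule are taken from the code above.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <memory>
#include <mutex>
#include <thread>

constexpr std::uint32_t kDisabled = 0x2;  // mirrors VLD_TLS_DISABLED
constexpr std::uint32_t kEnabled  = 0x4;  // mirrors VLD_TLS_ENABLED

// Per-thread state, reduced to the two fields this sketch needs.
struct ThreadState {
    std::uint32_t   flags = 0;
    std::thread::id threadId;
};

std::mutex g_registryLock;  // guards g_registry
std::map<std::thread::id, std::unique_ptr<ThreadState>> g_registry;
bool g_startDisabled = false;  // role of VLD_OPT_START_DISABLED

// Return the calling thread's state, creating and registering it on first use.
ThreadState* getState() {
    thread_local ThreadState* cached = nullptr;  // one slot per thread
    if (cached == nullptr) {
        std::lock_guard<std::mutex> lock(g_registryLock);
        const std::thread::id id = std::this_thread::get_id();
        std::unique_ptr<ThreadState>& slot = g_registry[id];
        if (!slot) {
            slot.reset(new ThreadState);  // first time this thread ID is seen
        }
        slot->flags = 0;
        slot->threadId = id;
        cached = slot.get();
    }
    return cached;
}

// Apply the default state on first use, then report whether detection is
// enabled for the calling thread.
bool enabled() {
    ThreadState* state = getState();
    if ((state->flags & (kDisabled | kEnabled)) == 0) {
        state->flags |= g_startDisabled ? kDisabled : kEnabled;
    }
    return (state->flags & kEnabled) != 0;
}

// Number of threads that have entered getState() so far.
std::size_t trackedThreadCount() {
    std::lock_guard<std::mutex> lock(g_registryLock);
    return g_registry.size();
}
```

The registry is what lets shutdown code enumerate every thread that ever entered the detector, while the thread_local pointer keeps the common path free of locking.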

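4.3 How Double-Clicking an Output Line Jumps to the Source Line

This is simple to implement: the output line only needs to begin with a string of the form "file path(line number)". VLD's call stack output already has this form, so the jump works automatically; see CppBlog - How to double-click in the VS Output window to locate code. Here is a small demonstration: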

#include <Windows.h>

int main()
{
    // Use the W variant explicitly so the wide literal compiles regardless of the UNICODE setting.
    OutputDebugStringW(L" e:\\Cworkspace\\VSDemo\\testDoubleClick\\testDoubleClick\\main.cpp (3).\n");

    return 0;
}
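
Building such a line from its parts is a one-liner. The helper below is a hypothetical convenience, not something VLD provides; it follows the "path (line)" layout of the demonstration above, and the resulting string would be passed to OutputDebugStringW on Windows.

```cpp
#include <string>

// Build a line that starts with "path (line)" followed by free-form text.
// makeClickableLine is an illustrative name, not part of VLD.
std::wstring makeClickableLine(const std::wstring& file, unsigned line,
                               const std::wstring& text) {
    return file + L" (" + std::to_wstring(line) + L"): " + text + L"\n";
}
```

Keeping the path and line number at the very start of the line is the important part; the text after the colon is free-form.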