Analysis and solution to the problem that ASAN cannot find the symbolizer

AddressSanitizer (ASAN for short) is a very convenient tool for detecting and analyzing C/C++ memory problems. The WebRTC project integrates ASAN, and it can be turned on or off for the entire project with a single option, is_asan. The option defaults to false. Writing the line is_asan = true in the args.gn file turns ASAN on for the entire project; writing is_asan = false, or not setting is_asan at all, turns it off.

The Linux debug build of the OpenRTCClient project has ASAN enabled. If everything is configured properly, then when a memory problem occurs while a C/C++ application is running, ASAN calls a symbolizer to convert the memory addresses in the stacks related to the problem (such as the allocation stack and the release stack) into file names, line numbers, and symbol names. We can set the environment variable ASAN_SYMBOLIZER_PATH to point to the llvm-symbolizer of our choice, e.g. export ASAN_SYMBOLIZER_PATH=/usr/bin/llvm-symbolizer-11, to tell ASAN which tool to use when it needs to symbolize memory addresses. When ASAN_SYMBOLIZER_PATH is not set, ASAN searches each directory listed in the PATH environment variable for an executable named llvm-symbolizer. If ASAN_SYMBOLIZER_PATH does not point to a suitable llvm-symbolizer and no executable named llvm-symbolizer can be found in any PATH directory, ASAN can only print the raw memory addresses.
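The lookup order described above can be sketched in a few lines of C++. This is only an illustration under simplifying assumptions: FindInPath and ResolveSymbolizer are hypothetical names, not functions from compiler-rt, and the real runtime performs more checks than shown here.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <unistd.h>

// Hypothetical helper: search each directory of a PATH-style string for an
// executable called `name`. Returns an empty string when nothing is found.
std::string FindInPath(const std::string &name, const std::string &path_env) {
  std::istringstream dirs(path_env);
  std::string dir;
  while (std::getline(dirs, dir, ':')) {
    if (dir.empty()) continue;
    const std::string candidate = dir + "/" + name;
    if (access(candidate.c_str(), X_OK) == 0) return candidate;
  }
  return "";
}

// Hypothetical helper mimicking the order described in the text.
// `explicit_path` plays the role of ASAN_SYMBOLIZER_PATH (nullptr if unset).
std::string ResolveSymbolizer(const char *explicit_path,
                              const std::string &path_env) {
  // An explicitly configured path is taken as-is, even if it is wrong.
  if (explicit_path) return explicit_path;
  // Otherwise fall back to searching the PATH directories.
  return FindInPath("llvm-symbolizer", path_env);
}
```

An empty result corresponds to the case where ASAN has no symbolizer and prints raw addresses only.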

A memory address symbolization failure

The loop_connect sample application in the OpenRTCClient project was compiled, and the ASAN_SYMBOLIZER_PATH environment variable was set before running it. While loop_connect was running, a memory problem occurred, but the memory addresses were still not symbolized. The ASAN output was as follows:

=================================================================
==51148==ERROR: AddressSanitizer: heap-use-after-free on address 0x61200014eb40 at pc 0x5639128a0a85 bp 0x7ffcfdbb6b30 sp 0x7ffcfdbb6b28
READ of size 8 at 0x61200014eb40 thread T0
==51148==WARNING: invalid path to external symbolizer!
==51148==WARNING: Failed to use and restart external symbolizer!
    #0 0x5639128a0a84  (~/OpenRTCClient/build/linux/x64/debug/loop_connect+0x32fda84) (BuildId: 542ad276a9f6ad54)
    #1 0x563915cdc29d  (~/OpenRTCClient/build/linux/x64/debug/loop_connect+0x673929d) (BuildId: 542ad276a9f6ad54)
    #2 0x563910cd2bc1  (~/OpenRTCClient/build/linux/x64/debug/loop_connect+0x172fbc1) (BuildId: 542ad276a9f6ad54)
    #3 0x563910cd2c08  (~/OpenRTCClient/build/linux/x64/debug/loop_connect+0x172fc08) (BuildId: 542ad276a9f6ad54)
    #4 0x563910cd52f6  (~/OpenRTCClient/build/linux/x64/debug/loop_connect+0x17322f6) (BuildId: 542ad276a9f6ad54)
    #5 0x563910cd3b40  (~/OpenRTCClient/build/linux/x64/debug/loop_connect+0x1730b40) (BuildId: 542ad276a9f6ad54)
    #6 0x563910ccf40d  (~/OpenRTCClient/build/linux/x64/debug/loop_connect+0x172c40d) (BuildId: 542ad276a9f6ad54)
    #7 0x563910ccbad9  (~/OpenRTCClient/build/linux/x64/debug/loop_connect+0x1728ad9) (BuildId: 542ad276a9f6ad54)
    #8 0x7efd969cc0b2  (/lib/x86_64-linux-gnu/libc.so.6+0x240b2) (BuildId: 9fdb74e7b217d06c93172a8243f8547f947ee6d1)

0x61200014eb40 is located 0 bytes inside of 320-byte region [0x61200014eb40,0x61200014ec80)
freed by thread T0 here:
    #0 0x563910ca3887  (~/OpenRTCClient/build/linux/x64/debug/loop_connect+0x1700887) (BuildId: 542ad276a9f6ad54)
    #1 0x5639122c1791  (~/OpenRTCClient/build/linux/x64/debug/loop_connect+0x2d1e791) (BuildId: 542ad276a9f6ad54)
    #2 0x563910cbbc76  (~/OpenRTCClient/build/linux/x64/debug/loop_connect+0x1718c76) (BuildId: 542ad276a9f6ad54)
    #3 0x563910cbbb1f  (~/OpenRTCClient/build/linux/x64/debug/loop_connect+0x1718b1f) (BuildId: 542ad276a9f6ad54)
    #4 0x563910cbdbfa  (~/OpenRTCClient/build/linux/x64/debug/loop_connect+0x171abfa) (BuildId: 542ad276a9f6ad54)
    #5 0x563910cb74c0  (~/OpenRTCClient/build/linux/x64/debug/loop_connect+0x17144c0) (BuildId: 542ad276a9f6ad54)
    #6 0x563910cb1384  (~/OpenRTCClient/build/linux/x64/debug/loop_connect+0x170e384) (BuildId: 542ad276a9f6ad54)
    #7 0x563910ccd4c4  (~/OpenRTCClient/build/linux/x64/debug/loop_connect+0x172a4c4) (BuildId: 542ad276a9f6ad54)
    #8 0x563910ccd42c  (~/OpenRTCClient/build/linux/x64/debug/loop_connect+0x172a42c) (BuildId: 542ad276a9f6ad54)
    #9 0x563910ccd105  (~/OpenRTCClient/build/linux/x64/debug/loop_connect+0x172a105) (BuildId: 542ad276a9f6ad54)
    #10 0x563910cbc8ee  (~/OpenRTCClient/build/linux/x64/debug/loop_connect+0x17198ee) (BuildId: 542ad276a9f6ad54)
    #11 0x563910cbc6e5  (~/OpenRTCClient/build/linux/x64/debug/loop_connect+0x17196e5) (BuildId: 542ad276a9f6ad54)
    #12 0x563910ccd858  (~/OpenRTCClient/build/linux/x64/debug/loop_connect+0x172a858) (BuildId: 542ad276a9f6ad54)
    #13 0x563910ccbc84  (~/OpenRTCClient/build/linux/x64/debug/loop_connect+0x1728c84) (BuildId: 542ad276a9f6ad54)
    #14 0x563910ccad26  (~/OpenRTCClient/build/linux/x64/debug/loop_connect+0x1727d26) (BuildId: 542ad276a9f6ad54)

ASAN reports that the path to the external symbolizer is invalid, and symbolization of the memory addresses fails.

Implementation of ASAN

AddressSanitizer is part of compiler-rt, a subproject of the LLVM project. The llvm-project code can be downloaded from GitHub; the compiler-rt code is located in the llvm-project/compiler-rt directory. Generally speaking, building compiler-rt requires building LLVM/Clang. We can build compiler-rt together with llvm and clang, or build it separately.

To build compiler-rt together with llvm and clang, add compiler-rt to the value of the -DLLVM_ENABLE_RUNTIMES= option passed to cmake.

To build it separately, first build LLVM on its own to obtain the llvm-config executable, then run the following commands:

$ cd llvm-project
$ git checkout -t origin/release/14.x
$ mkdir build-compiler-rt
$ cd build-compiler-rt
$ cmake ../compiler-rt -DLLVM_CONFIG_PATH=/path/to/llvm-config
$ make

(The llvm in the WebRTC code base on which the OpenRTCClient project is based has been updated to llvm-14, so we also switch to the llvm-14 release branch for this build.)

The binary library files generated by compilation are mainly located in llvm-project/build-compiler-rt/lib/linux/, such as:

llvm-project/build-compiler-rt$ ls lib/linux/
clang_rt.crtbegin-x86_64.o                    libclang_rt.hwasan_aliases-x86_64.so       libclang_rt.scudo-x86_64.a
clang_rt.crtend-x86_64.o                      libclang_rt.hwasan_cxx-x86_64.a            libclang_rt.scudo-x86_64.so
libclang_rt.asan_cxx-x86_64.a                 libclang_rt.hwasan_cxx-x86_64.a.syms       libclang_rt.tsan_cxx-x86_64.a
libclang_rt.asan_cxx-x86_64.a.syms            libclang_rt.hwasan-x86_64.a                libclang_rt.tsan_cxx-x86_64.a.syms
libclang_rt.asan-preinit-x86_64.a             libclang_rt.hwasan-x86_64.a.syms           libclang_rt.tsan-x86_64.a
libclang_rt.asan_static-x86_64.a              libclang_rt.hwasan-x86_64.so               libclang_rt.tsan-x86_64.a.syms
libclang_rt.asan-x86_64.a                     libclang_rt.lsan-x86_64.a                  libclang_rt.tsan-x86_64.so
libclang_rt.asan-x86_64.a.syms                libclang_rt.msan_cxx-x86_64.a              libclang_rt.ubsan_minimal-x86_64.a
libclang_rt.asan-x86_64.so                    libclang_rt.msan_cxx-x86_64.a.syms         libclang_rt.ubsan_minimal-x86_64.a.syms
libclang_rt.builtins-x86_64.a                 libclang_rt.msan-x86_64.a                  libclang_rt.ubsan_minimal-x86_64.so
libclang_rt.cfi_diag-x86_64.a                 libclang_rt.msan-x86_64.a.syms             libclang_rt.ubsan_standalone_cxx-x86_64.a
libclang_rt.cfi-x86_64.a                      libclang_rt.orc-x86_64.a                   libclang_rt.ubsan_standalone_cxx-x86_64.a.syms
libclang_rt.dd-x86_64.a                       libclang_rt.profile-x86_64.a               libclang_rt.ubsan_standalone-x86_64.a
libclang_rt.dfsan-x86_64.a                    libclang_rt.safestack-x86_64.a             libclang_rt.ubsan_standalone-x86_64.a.syms
libclang_rt.dfsan-x86_64.a.syms               libclang_rt.scudo_cxx_minimal-x86_64.a     libclang_rt.ubsan_standalone-x86_64.so
libclang_rt.dyndd-x86_64.so                   libclang_rt.scudo_cxx-x86_64.a             libclang_rt.xray-basic-x86_64.a
libclang_rt.gwp_asan-x86_64.a                 libclang_rt.scudo_minimal-x86_64.a         libclang_rt.xray-fdr-x86_64.a
libclang_rt.hwasan_aliases_cxx-x86_64.a       libclang_rt.scudo_minimal-x86_64.so        libclang_rt.xray-profiling-x86_64.a
libclang_rt.hwasan_aliases_cxx-x86_64.a.syms  libclang_rt.scudo_standalone_cxx-x86_64.a  libclang_rt.xray-x86_64.a
libclang_rt.hwasan_aliases-x86_64.a           libclang_rt.scudo_standalone-x86_64.a
libclang_rt.hwasan_aliases-x86_64.a.syms      libclang_rt.scudo_standalone-x86_64.so

Turning on AddressSanitizer at the compiler/linker level means adding the special parameter -fsanitize=address to the compiler and linker command lines. For example, the actual command executed to link the OpenRTCClient sample application loop_connect is as follows:

python3 "../../../../webrtc/build/toolchain/gcc_link_wrapper.py" --output="./loop_connect" -- ../../../../build_system/llvm-build/linux/linux/Release+Asserts/bin/clang++ -fuse-ld=lld -Wl,--fatal-warnings -Wl,--build-id -fPIC -Wl,-z,noexecstack -Wl,-z,relro -Wl,-z,now -Wl,--color-diagnostics -Wl,--no-call-graph-profile-sort -m64 -no-canonical-prefixes -Wl,--gdb-index -rdynamic --sysroot=../../../../build_system/sysroot/linux/debian_sid_amd64-sysroot -fsanitize=address -pie -Wl,--disable-new-dtags -Wl,-u_sanitizer_options_link_helper -fsanitize=address -o "./loop_connect" -Wl,--start-group @"./loop_connect.rsp"  -Wl,--end-group  -lX11 -lXcomposite -lXext -lXrender -latomic -ldl -lpthread -lrt -lgmodule-2.0 -lgthread-2.0 -lgtk-3 -lgdk-3 -lpangocairo-1.0 -lpango-1.0 -lharfbuzz -latk-1.0 -lcairo-gobject -lcairo -lgdk_pixbuf-2.0 -lgio-2.0 -lgobject-2.0 -lglib-2.0 -lm -lz

When the linker sees the -fsanitize=address parameter, it links against the libclang_rt.asan* libraries seen earlier, which are built from compiler-rt, choosing the one that matches the target architecture. For the OpenRTCClient sample application loop_connect, the --sysroot parameter is passed when linking the executable, so the library files and header files required for compiling and linking are looked up under the path specified by --sysroot. Specifically, linking loop_connect picks up the libclang_rt.asan* library for the target architecture from the OpenRTCClient/build_system/llvm-build/linux/linux/Release+Asserts/lib/clang/14.0.0/lib/linux or OpenRTCClient/build_system/llvm-build/linux/linux/Release+Asserts/lib/clang/14.0.0/lib/x86_64-unknown-linux-gnu directory.

In order to debug AddressSanitizer, we need the linker to link against the compiler-rt libraries we built ourselves. To do this, rename OpenRTCClient/build_system/llvm-build/linux/linux/Release+Asserts/lib/clang/14.0.0/lib/linux and OpenRTCClient/build_system/llvm-build/linux/linux/Release+Asserts/lib/clang/14.0.0/lib/x86_64-unknown-linux-gnu to any other names, and create, in the OpenRTCClient/build_system/llvm-build/linux/linux/Release+Asserts/lib/clang/14.0.0/lib/ directory, a symbolic link named linux that points to llvm-project/build-compiler-rt/lib/linux, the directory holding our compiler-rt build output. This way, if we modify the compiler-rt code, rebuild compiler-rt, and then relink loop_connect, our modified compiler-rt code is linked in.

Analysis of why AddressSanitizer cannot find the symbolizer

Searching the compiler-rt code for the constant string "WARNING: invalid path to external symbolizer!" taken from the message AddressSanitizer printed, we find it in llvm-project/compiler-rt/lib/sanitizer_common/sanitizer_symbolizer_posix_libcdep.cpp. The relevant code is as follows:

bool SymbolizerProcess::StartSymbolizerSubprocess() {
  if (!FileExists(path_)) {
    if (!reported_invalid_path_) {
      Report("WARNING: invalid path to external symbolizer!\n");
      reported_invalid_path_ = true;
    }
    return false;
  }

  const char *argv[kArgVMax];
  GetArgV(path_, argv);
  pid_t pid;

We can modify the code here to print path_, the symbolizer path that AddressSanitizer actually sees. It turns out that path_ is /media/data/multimedia/OpenRTCClient/build/linux/x64/debug//../../third_party/llvm-build/Release+Asserts/bin/llvm-symbolizer. This value appears to have nothing to do with the path we configured through the ASAN_SYMBOLIZER_PATH environment variable.
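The kind of modification meant here is simply to include the offending path in the warning. The standalone sketch below imitates that idea outside of compiler-rt; SymbolizerProbe and FileExistsSketch are hypothetical stand-ins written for illustration, and the real SymbolizerProcess goes on to spawn the symbolizer subprocess where this sketch just returns true.

```cpp
#include <cassert>
#include <cstdio>
#include <sys/stat.h>

// Stand-in for the runtime's file check: true only for regular files.
static bool FileExistsSketch(const char *path) {
  struct stat st;
  return stat(path, &st) == 0 && S_ISREG(st.st_mode);
}

// Hypothetical, stripped-down imitation of the check at the top of
// SymbolizerProcess::StartSymbolizerSubprocess(), extended to print the path.
class SymbolizerProbe {
 public:
  explicit SymbolizerProbe(const char *path) : path_(path) {}

  bool Start() {
    if (!FileExistsSketch(path_)) {
      if (!reported_invalid_path_) {
        // The only change compared with the original message: show path_.
        std::fprintf(stderr,
                     "WARNING: invalid path to external symbolizer: %s\n",
                     path_);
        reported_invalid_path_ = true;
      }
      return false;
    }
    return true;  // The real code would start the subprocess here.
  }

  bool reported_invalid_path() const { return reported_invalid_path_; }

 private:
  const char *path_;
  bool reported_invalid_path_ = false;
};
```

Calling Start() twice on a bad path prints the warning only once, matching the one-time reporting in the original code.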

The value of path_ is passed in through the constructor of the SymbolizerProcess class. The code is as follows:

SymbolizerProcess::SymbolizerProcess(const char *path, bool use_posix_spawn)
    : path_(path),
      input_fd_(kInvalidFd),
      output_fd_(kInvalidFd),
      times_restarted_(0),
      failed_to_start_(false),
      reported_invalid_path_(false),
      use_posix_spawn_(use_posix_spawn) {
  CHECK(path_);
  CHECK_NE(path_[0], '\0');
}

Running the executable under GDB and setting a breakpoint in the constructor of the SymbolizerProcess class, we can see the following call stack:

#0  __sanitizer::SymbolizerProcess::SymbolizerProcess(char const*, bool)
    (use_posix_spawn=false, path=0x7ffff3403000 "/media/data/multimedia/OpenRTCClient/build/linux/x64/debug//../../third_party/llvm-build/Release+Asserts/bin/llvm-symbolizer", this=0x7ffff7fab000) at ~llvm-project/compiler-rt/lib/sanitizer_common/sanitizer_symbolizer_libcdep.cpp:456
#1  __sanitizer::LLVMSymbolizerProcess::LLVMSymbolizerProcess(char const*)
    (path=0x7ffff3403000 "/media/data/multimedia/OpenRTCClient/build/linux/x64/debug//../../third_party/llvm-build/Release+Asserts/bin/llvm-symbolizer", this=0x7ffff7fab000) at ~llvm-project/compiler-rt/lib/sanitizer_common/sanitizer_symbolizer_libcdep.cpp:240
#2  __sanitizer::LLVMSymbolizer::LLVMSymbolizer(char const*, __sanitizer::LowLevelAllocator*)
    (this=0x7ffff7fb4000, path=0x7ffff3403000 "/media/data/multimedia/OpenRTCClient/build/linux/x64/debug//../../third_party/llvm-build/Release+Asserts/bin/llvm-symbolizer", allocator=<optimized out>) at ~llvm-project/compiler-rt/lib/sanitizer_common/sanitizer_symbolizer_libcdep.cpp:292
#3  0x0000555556c45532 in __sanitizer::ChooseExternalSymbolizer (allocator=<optimized out>)
    at ~llvm-project/compiler-rt/lib/sanitizer_common/sanitizer_common.h:1075
#4  __sanitizer::ChooseSymbolizerTools (allocator=<optimized out>, list=<synthetic pointer>)
    at ~llvm-project/compiler-rt/lib/sanitizer_common/sanitizer_symbolizer_posix_libcdep.cpp:487
#5  __sanitizer::Symbolizer::PlatformInit() ()
    at ~llvm-project/compiler-rt/lib/sanitizer_common/sanitizer_symbolizer_posix_libcdep.cpp:500
#6  0x0000555556c42455 in __sanitizer::Symbolizer::GetOrInit() ()
    at ~llvm-project/compiler-rt/lib/sanitizer_common/sanitizer_symbolizer_libcdep.cpp:24
#7  0x0000555556c457ad in __sanitizer::Symbolizer::LateInitialize() ()
    at ~llvm-project/compiler-rt/lib/sanitizer_common/sanitizer_symbolizer_posix_libcdep.cpp:505
#8  0x0000555556c199fd in __asan::AsanInitInternal() () at ~llvm-project/compiler-rt/lib/asan/asan_rtl.cpp:495
#9  0x00007ffff7fe0ce6 in  () at /lib64/ld-linux-x86-64.so.2
#10 0x00007ffff7fd013a in  () at /lib64/ld-linux-x86-64.so.2
#11 0x0000000000000001 in  ()

In the function __sanitizer::ChooseExternalSymbolizer(), defined in the file llvm-project/compiler-rt/lib/sanitizer_common/sanitizer_symbolizer_posix_libcdep.cpp, we can see where the path_ of the SymbolizerProcess object comes from:

static SymbolizerTool *ChooseExternalSymbolizer(LowLevelAllocator *allocator) {
  const char *path = common_flags()->external_symbolizer_path;

  if (path && internal_strchr(path, '%')) {
    char *new_path = (char *)InternalAlloc(kMaxPathLength);
    SubstituteForFlagValue(path, new_path, kMaxPathLength);
    path = new_path;
  }

  const char *binary_name = path ? StripModuleName(path) : "";
  static const char kLLVMSymbolizerPrefix[] = "llvm-symbolizer";
  if (path && path[0] == '\0') {
    VReport(2, "External symbolizer is explicitly disabled.\n");
    return nullptr;
  } else if (!internal_strncmp(binary_name, kLLVMSymbolizerPrefix,
                               internal_strlen(kLLVMSymbolizerPrefix))) {
    VReport(2, "Using llvm-symbolizer at user-specified path: %s\n", path);
    return new(*allocator) LLVMSymbolizer(path, allocator);
  } else if (!internal_strcmp(binary_name, "atos")) {
#if SANITIZER_MAC
    VReport(2, "Using atos at user-specified path: %s\n", path);
    return new(*allocator) AtosSymbolizer(path, allocator);
#else  // SANITIZER_MAC
    Report("ERROR: Using `atos` is only supported on Darwin.\n");
    Die();
#endif  // SANITIZER_MAC
  } else if (!internal_strcmp(binary_name, "addr2line")) {
    VReport(2, "Using addr2line at user-specified path: %s\n", path);
    return new(*allocator) Addr2LinePool(path, allocator);
  } else if (path) {
    Report("ERROR: External symbolizer path is set to '%s' which isn't "
           "a known symbolizer. Please set the path to the llvm-symbolizer "
           "binary or other known tool.\n", path);
    Die();
  }

  // Otherwise symbolizer program is unknown, let's search $PATH
  CHECK(path == nullptr);
#if SANITIZER_MAC
  if (const char *found_path = FindPathToBinary("atos")) {
    VReport(2, "Using atos found at: %s\n", found_path);
    return new(*allocator) AtosSymbolizer(found_path, allocator);
  }
#endif  // SANITIZER_MAC
  if (const char *found_path = FindPathToBinary("llvm-symbolizer")) {
    VReport(2, "Using llvm-symbolizer found at: %s\n", found_path);
    return new(*allocator) LLVMSymbolizer(found_path, allocator);
  }
  if (common_flags()->allow_addr2line) {
    if (const char *found_path = FindPathToBinary("addr2line")) {
      VReport(2, "Using addr2line found at: %s\n", found_path);
      return new(*allocator) Addr2LinePool(found_path, allocator);
    }
  }
  return nullptr;
}

In __sanitizer::ChooseExternalSymbolizer(), AddressSanitizer tries to determine the path to the symbolizer program from common_flags()->external_symbolizer_path. We can see that the actual value of common_flags()->external_symbolizer_path here is %d/../../third_party/llvm-build/Release+Asserts/bin/llvm-symbolizer, and the path_ of the SymbolizerProcess object seen above is computed from this value.
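How the flag value turns into the path_ observed earlier can be reproduced with a small sketch. It assumes, based on the observed result, that %d is replaced with the directory of the running executable including its trailing slash; DirWithSlash and ExpandPercentD are hypothetical helpers, not the runtime's SubstituteForFlagValue().

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: directory part of an executable path, keeping the
// trailing slash (which is what produces the "//" in the observed path_).
std::string DirWithSlash(const std::string &exe_path) {
  const std::string::size_type slash = exe_path.rfind('/');
  return slash == std::string::npos ? "./" : exe_path.substr(0, slash + 1);
}

// Hypothetical helper: replace every "%d" in flag_value with the directory
// of exe_path. Other '%' sequences are copied through unchanged.
std::string ExpandPercentD(const std::string &flag_value,
                           const std::string &exe_path) {
  const std::string dir = DirWithSlash(exe_path);
  std::string out;
  for (std::string::size_type i = 0; i < flag_value.size(); ++i) {
    if (flag_value[i] == '%' && i + 1 < flag_value.size() &&
        flag_value[i + 1] == 'd') {
      out += dir;
      ++i;  // Skip the 'd' that was just consumed.
    } else {
      out += flag_value[i];
    }
  }
  return out;
}
```

With the executable at /media/data/multimedia/OpenRTCClient/build/linux/x64/debug/loop_connect, the expansion yields exactly the path_ seen in GDB. After resolving the two ".." components it points under /media/data/multimedia/OpenRTCClient/build/linux/third_party/, whereas the link command shown earlier takes clang++ from build_system/llvm-build/linux/linux/Release+Asserts/bin, which would explain why the path is reported as invalid.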

In the file llvm-project/compiler-rt/lib/sanitizer_common/sanitizer_flags.h, the common_flags() function is defined as:

// Functions to get/set global CommonFlags shared by all sanitizer runtimes:
extern CommonFlags common_flags_dont_use;
inline const CommonFlags *common_flags() {
  return &common_flags_dont_use;
}

inline void SetCommonFlagsDefaults() {
  common_flags_dont_use.SetDefaults();
}

// This function can only be used to setup tool-specific overrides for
// CommonFlags defaults. Generally, it should only be used right after
// SetCommonFlagsDefaults(), but before ParseCommonFlagsFromString(), and
// only during the flags initialization (i.e. before they are used for
// the first time).
inline void OverrideCommonFlags(const CommonFlags &cf) {
  common_flags_dont_use.CopyFrom(cf);
}

That is, common_flags() returns a global object. The value of this global object is mainly updated by the InitializeFlags() function in the file llvm-project/compiler-rt/lib/asan/asan_flags.cpp. The function is defined as follows:

void InitializeFlags() {
  // Set the default values and prepare for parsing ASan and common flags.
  SetCommonFlagsDefaults();
  {
    CommonFlags cf;
    cf.CopyFrom(*common_flags());
    cf.detect_leaks = cf.detect_leaks && CAN_SANITIZE_LEAKS;
    cf.external_symbolizer_path = GetEnv("ASAN_SYMBOLIZER_PATH");
    cf.malloc_context_size = kDefaultMallocContextSize;
    cf.intercept_tls_get_addr = true;
    cf.exitcode = 1;
    OverrideCommonFlags(cf);
  }
  Flags *f = flags();
  f->SetDefaults();

  FlagParser asan_parser;
  RegisterAsanFlags(&asan_parser, f);
  RegisterCommonFlags(&asan_parser);

  // Set the default values and prepare for parsing LSan and UBSan flags
  // (which can also overwrite common flags).
#if CAN_SANITIZE_LEAKS
  __lsan::Flags *lf = __lsan::flags();
  lf->SetDefaults();

  FlagParser lsan_parser;
  __lsan::RegisterLsanFlags(&lsan_parser, lf);
  RegisterCommonFlags(&lsan_parser);
#endif

#if CAN_SANITIZE_UB
  __ubsan::Flags *uf = __ubsan::flags();
  uf->SetDefaults();

  FlagParser ubsan_parser;
  __ubsan::RegisterUbsanFlags(&ubsan_parser, uf);
  RegisterCommonFlags(&ubsan_parser);
#endif

  if (SANITIZER_MAC) {
    // Support macOS MallocScribble and MallocPreScribble:
    // <https://developer.apple.com/library/content/documentation/Performance/
    // Conceptual/ManagingMemory/Articles/MallocDebug.html>
    if (GetEnv("MallocScribble")) {
      f->max_free_fill_size = 0x1000;
    }
    if (GetEnv("MallocPreScribble")) {
      f->malloc_fill_byte = 0xaa;
    }
  }

  // Override from ASan compile definition.
  const char *asan_compile_def = MaybeUseAsanDefaultOptionsCompileDefinition();
  asan_parser.ParseString(asan_compile_def);

  // Override from user-specified string.
  const char *asan_default_options = __asan_default_options();
  asan_parser.ParseString(asan_default_options);
#if CAN_SANITIZE_UB
  const char *ubsan_default_options = __ubsan_default_options();
  ubsan_parser.ParseString(ubsan_default_options);
#endif
#if CAN_SANITIZE_LEAKS
  const char *lsan_default_options = __lsan_default_options();
  lsan_parser.ParseString(lsan_default_options);
#endif

  // Override from command line.
  asan_parser.ParseStringFromEnv("ASAN_OPTIONS");
#if CAN_SANITIZE_LEAKS
  lsan_parser.ParseStringFromEnv("LSAN_OPTIONS");
#endif
#if CAN_SANITIZE_UB
  ubsan_parser.ParseStringFromEnv("UBSAN_OPTIONS");
#endif

  InitializeCommonFlags();

  // TODO(eugenis): dump all flags at verbosity>=2?
  if (Verbosity()) ReportUnrecognizedFlags();

  if (common_flags()->help) {
    // TODO(samsonov): print all of the flags (ASan, LSan, common).
    asan_parser.PrintFlagDescriptions();
  }

  // Flag validation:
  if (!CAN_SANITIZE_LEAKS && common_flags()->detect_leaks) {
    Report("%s: detect_leaks is not supported on this platform.\n",
           SanitizerToolName);
    Die();
  }
  // Ensure that redzone is at least ASAN_SHADOW_GRANULARITY.
  if (f->redzone < (int)ASAN_SHADOW_GRANULARITY)
    f->redzone = ASAN_SHADOW_GRANULARITY;
  // Make "strict_init_order" imply "check_initialization_order".
  // TODO(samsonov): Use a single runtime flag for an init-order checker.
  if (f->strict_init_order) {
    f->check_initialization_order = true;
  }
  CHECK_LE((uptr)common_flags()->malloc_context_size, kStackTraceMax);
  CHECK_LE(f->min_uar_stack_size_log, f->max_uar_stack_size_log);
  CHECK_GE(f->redzone, 16);
  CHECK_GE(f->max_redzone, f->redzone);
  CHECK_LE(f->max_redzone, 2048);
  CHECK(IsPowerOfTwo(f->redzone));
  CHECK(IsPowerOfTwo(f->max_redzone));

  // quarantine_size is deprecated but we still honor it.
  // quarantine_size can not be used together with quarantine_size_mb.
  if (f->quarantine_size >= 0 && f->quarantine_size_mb >= 0) {
    Report("%s: please use either 'quarantine_size' (deprecated) or "
           "quarantine_size_mb, but not both\n", SanitizerToolName);
    Die();
  }
  if (f->quarantine_size >= 0)
    f->quarantine_size_mb = f->quarantine_size >> 20;
  if (f->quarantine_size_mb < 0) {
    const int kDefaultQuarantineSizeMb =
        (ASAN_LOW_MEMORY) ? 1UL << 4 : 1UL << 8;
    f->quarantine_size_mb = kDefaultQuarantineSizeMb;
  }
  if (f->thread_local_quarantine_size_kb < 0) {
    const u32 kDefaultThreadLocalQuarantineSizeKb =
        // It is not advised to go lower than 64Kb, otherwise quarantine batches
        // pushed from thread local quarantine to global one will create too
        // much overhead. One quarantine batch size is 8Kb and it  holds up to
        // 1021 chunk, which amounts to 1/8 memory overhead per batch when
        // thread local quarantine is set to 64Kb.
        (ASAN_LOW_MEMORY) ? 1 << 6 : FIRST_32_SECOND_64(1 << 8, 1 << 10);
    f->thread_local_quarantine_size_kb = kDefaultThreadLocalQuarantineSizeKb;
  }
  if (f->thread_local_quarantine_size_kb == 0 && f->quarantine_size_mb > 0) {
    Report("%s: thread_local_quarantine_size_kb can be set to 0 only when "
           "quarantine_size_mb is set to 0\n", SanitizerToolName);
    Die();
  }
  if (!f->replace_str && common_flags()->intercept_strlen) {
    Report("WARNING: strlen interceptor is enabled even though replace_str=0. "
           "Use intercept_strlen=0 to disable it.");
  }
  if (!f->replace_str && common_flags()->intercept_strchr) {
    Report("WARNING: strchr* interceptors are enabled even though "
           "replace_str=0. Use intercept_strchr=0 to disable them.");
  }
  if (!f->replace_str && common_flags()->intercept_strndup) {
    Report("WARNING: strndup* interceptors are enabled even though "
           "replace_str=0. Use intercept_strndup=0 to disable them.");
  }
}

In InitializeFlags(), CommonFlags common_flags_dont_use is first set to its default values, and then some values are updated from environment variables, including the ASAN_SYMBOLIZER_PATH environment variable we configured. After that, these earlier settings are overwritten by the options obtained from MaybeUseAsanDefaultOptionsCompileDefinition(), __asan_default_options(), and similar functions, and then by ASAN_OPTIONS and the other environment variables.
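The layering in InitializeFlags() amounts to "the last writer wins". The sketch below models only that ordering for the external_symbolizer_path option; Flags, ParseInto, and EffectiveSymbolizerPath are hypothetical names, and the toy parser splits on spaces only, which is a simplification of the real FlagParser.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>

using Flags = std::map<std::string, std::string>;

// Toy parser: apply "key=value key=value" on top of the existing flags.
void ParseInto(Flags &flags, const std::string &options) {
  std::istringstream in(options);
  std::string token;
  while (in >> token) {
    const std::string::size_type eq = token.find('=');
    if (eq == std::string::npos) continue;  // Ignore malformed tokens.
    flags[token.substr(0, eq)] = token.substr(eq + 1);
  }
}

// Apply the sources in the same order as InitializeFlags() and report the
// value that survives. A nullptr environment value means "not set".
std::string EffectiveSymbolizerPath(const char *asan_symbolizer_path_env,
                                    const std::string &compile_definition,
                                    const std::string &default_options,
                                    const std::string &asan_options_env) {
  Flags flags;
  if (asan_symbolizer_path_env)  // 1. ASAN_SYMBOLIZER_PATH
    flags["external_symbolizer_path"] = asan_symbolizer_path_env;
  ParseInto(flags, compile_definition);  // 2. ASAN_DEFAULT_OPTIONS define
  ParseInto(flags, default_options);     // 3. __asan_default_options()
  ParseInto(flags, asan_options_env);    // 4. ASAN_OPTIONS
  const Flags::const_iterator it = flags.find("external_symbolizer_path");
  return it == flags.end() ? "" : it->second;
}
```

Under this ordering, any external_symbolizer_path returned by __asan_default_options() replaces the value taken from ASAN_SYMBOLIZER_PATH. By the same reasoning, a value supplied through ASAN_OPTIONS would be applied last; that follows from the parse order in the listing and was not tested in this article.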

Here, we print the value obtained from the ASAN_SYMBOLIZER_PATH environment variable and find that it is the value we configured, /usr/bin/llvm-symbolizer-11. The MaybeUseAsanDefaultOptionsCompileDefinition() function in the file llvm-project/compiler-rt/lib/asan/asan_flags.cpp is defined as follows:

static const char *MaybeUseAsanDefaultOptionsCompileDefinition() {
#ifdef ASAN_DEFAULT_OPTIONS
  return SANITIZER_STRINGIFY(ASAN_DEFAULT_OPTIONS);
#else
  return "";
#endif
}

The __asan_default_options() function in the file llvm-project/compiler-rt/lib/asan/asan_flags.cpp is defined as follows:

SANITIZER_INTERFACE_WEAK_DEF(const char*, __asan_default_options, void) {
  return "";
}

Intuitively, the configuration options returned by these two functions should not update common_flags()->external_symbolizer_path. In fact, however, after the return value of __asan_default_options() is processed, the value of common_flags()->external_symbolizer_path is updated to %d/../../third_party/llvm-build/Release+Asserts/bin/llvm-symbolizer. The string actually returned by __asan_default_options() is not the empty string from the definition we saw above, but the following string:

check_printf=1 use_sigaltstack=1 strip_path_prefix=/../../ fast_unwind_on_fatal=1 detect_stack_use_after_return=1 symbolize=1 detect_leaks=0 allow_user_segv_handler=1 external_symbolizer_path=%d/../../third_party/llvm-build/Release+Asserts/bin/llvm-symbolizer

We run our executable under GDB and set a breakpoint on the __asan_default_options() function. To our surprise, the breakpoint was not set in the file llvm-project/compiler-rt/lib/asan/asan_flags.cpp, but in the WebRTC code, in webrtc/build/sanitizers/sanitizer_options.cc:

(gdb) break __asan_default_options 
warning: Could not find DWO CU obj/build/config/sanitizers/options_sources/sanitizer_options.dwo(0x4d51bcd290d078c0) referenced by CU at offset 0x43f087 [in module /media/data/multimedia/OpenRTCClient/build/linux/x64/debug/loop_connect]
Breakpoint 1 at 0x81c6e34: file ../../../../webrtc/build/sanitizers/sanitizer_options.cc, line 75.

Looking at the declaration and definition of __asan_default_options() again, we find that it is defined as a weak symbol in llvm. In the WebRTC code, webrtc/build/sanitizers/sanitizer_options.cc contains the following definition of __asan_default_options():

#if defined(ADDRESS_SANITIZER)
// Default options for AddressSanitizer in various configurations:
//   check_printf=1 - check the memory accesses to printf (and other formatted
//     output routines) arguments.
//   use_sigaltstack=1 - handle signals on an alternate signal stack. Useful
//     for stack overflow detection.
//   strip_path_prefix=/../../ - prefixes up to and including this
//     substring will be stripped from source file paths in symbolized reports
//   fast_unwind_on_fatal=1 - use the fast (frame-pointer-based) stack unwinder
//     to print error reports. V8 doesn't generate debug info for the JIT code,
//     so the slow unwinder may not work properly.
//   detect_stack_use_after_return=1 - use fake stack to delay the reuse of
//     stack allocations and detect stack-use-after-return errors.
//   symbolize=1 - enable in-process symbolization.
//   external_symbolizer_path=... - provides the path to llvm-symbolizer
//     relative to the main executable
#if defined(OS_LINUX) || defined(OS_CHROMEOS)
const char kAsanDefaultOptions[] =
    "check_printf=1 use_sigaltstack=1 strip_path_prefix=/../../ "
    "fast_unwind_on_fatal=1 detect_stack_use_after_return=1 "
    "symbolize=1 detect_leaks=0 allow_user_segv_handler=1 "
    "external_symbolizer_path=%d/../../third_party/llvm-build/Release+Asserts/"
    "bin/llvm-symbolizer";

#elif defined(OS_APPLE)
const char* kAsanDefaultOptions =
    "check_printf=1 use_sigaltstack=1 strip_path_prefix=/../../ "
    "fast_unwind_on_fatal=1 detect_stack_use_after_return=1 ";

#elif defined(OS_WIN)
const char* kAsanDefaultOptions =
    "check_printf=1 use_sigaltstack=1 strip_path_prefix=\\..\\..\\ "
    "fast_unwind_on_fatal=1 detect_stack_use_after_return=1 "
    "symbolize=1 external_symbolizer_path=%d/../../third_party/"
    "llvm-build/Release+Asserts/bin/llvm-symbolizer.exe";
#endif  // defined(OS_LINUX) || defined(OS_CHROMEOS)

#if defined(OS_LINUX) || defined(OS_CHROMEOS) || defined(OS_APPLE) || \
    defined(OS_WIN)
// Allow NaCl to override the default asan options.
extern const char* kAsanDefaultOptionsNaCl;
__attribute__((weak)) const char* kAsanDefaultOptionsNaCl = nullptr;

SANITIZER_HOOK_ATTRIBUTE const char *__asan_default_options() {
  if (kAsanDefaultOptionsNaCl)
    return kAsanDefaultOptionsNaCl;
  return kAsanDefaultOptions;
}

extern char kASanDefaultSuppressions[];

SANITIZER_HOOK_ATTRIBUTE const char *__asan_default_suppressions() {
  return kASanDefaultSuppressions;
}
#endif  // defined(OS_LINUX) || defined(OS_CHROMEOS) || defined(OS_APPLE) ||
        // defined(OS_WIN)
#endif  // ADDRESS_SANITIZER

At this point it is easy to confirm that the symbolizer we configured through the ASAN_SYMBOLIZER_PATH environment variable is overridden by the configuration options in the WebRTC code.
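The weak-symbol mechanism behind this override can be illustrated with a minimal sketch. It assumes GCC or Clang on an ELF platform, and demo_default_options is a hypothetical hook name used only here; the real hook is __asan_default_options().

```cpp
#include <cassert>
#include <cstring>

// Weak fallback definition, playing the role of the runtime's default hook.
// The linker keeps it only if no strong definition of the same symbol exists.
extern "C" __attribute__((weak)) const char *demo_default_options() {
  return "";  // No extra options by default.
}

// Stand-in for the runtime code that consumes whatever the hook returns.
bool HookSetsSymbolizerPath() {
  return std::strstr(demo_default_options(),
                     "external_symbolizer_path=") != nullptr;
}

// A second translation unit linked into the same program could provide a
// strong definition, which would then be the one that gets called:
//
//   extern "C" const char *demo_default_options() {
//     return "external_symbolizer_path=/some/path/llvm-symbolizer";
//   }
```

Built on its own, this file uses the weak fallback and no symbolizer path is set. Once another object file supplies the strong definition, the same call returns the application's string instead, which is the situation GDB revealed for loop_connect.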

The relevant change in WebRTC was submitted in this commit: https://chromium.googlesource.com/chromium/src/build/+/919d061c2f455cc07b687a48322785b3b61f1455%5E%21/sanitizers/sanitizer_options.cc

The fix for this problem is straightforward: remove the part of webrtc/build/sanitizers/sanitizer_options.cc in the WebRTC code that configures the symbolizer for AddressSanitizer.
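One possible shape of that edit, shown for the Linux/ChromeOS branch only, is sketched below. It is an assumption about how the change could look, not a copy of an actual patch: the external_symbolizer_path entry is dropped and every other option is left as it was.

```cpp
#include <cassert>
#include <cstring>

// Sketch of kAsanDefaultOptions for OS_LINUX / OS_CHROMEOS in
// webrtc/build/sanitizers/sanitizer_options.cc with the
// external_symbolizer_path entry removed. All other options are unchanged.
const char kAsanDefaultOptions[] =
    "check_printf=1 use_sigaltstack=1 strip_path_prefix=/../../ "
    "fast_unwind_on_fatal=1 detect_stack_use_after_return=1 "
    "symbolize=1 detect_leaks=0 allow_user_segv_handler=1";
```

With no external_symbolizer_path in the default options, the value taken from ASAN_SYMBOLIZER_PATH is no longer overwritten at that stage of flag initialization.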

Reference documentation

“compiler-rt” runtime libraries

Origin blog.csdn.net/tq08g2z/article/details/124525987