A pre-holiday technical deep dive: probably the most practical guide to managing dependencies with Conan

The core of Milvus is written in C++. Dependency management has long been a major pain point for C++ developers, and it is one of the bottlenecks holding back the C++ ecosystem.

In the early days, Milvus downloaded its dependencies automatically through CMake's built-in mechanisms such as FetchContent and ExternalProject, which are sufficient in most cases. However, as the Milvus core gained more capabilities, the number of dependencies kept growing. For example, we wanted to add Folly to optimize thread pools and data structures, and then needed to introduce opentelemetry-cpp to improve observability.

This caused several problems: compilation took longer and longer, dependencies pulled in their own recursive dependencies that could not be shared with one another, and adding a new dependency was extremely painful every time. These problems called for a dedicated dependency management tool. After evaluating Conan, vcpkg, Bazel and other tools, we chose Conan, which has a complete ecosystem and the best compatibility with CMake.

Today, the C++ projects in the Milvus community all use Conan to manage dependencies. We ran into some unavoidable pitfalls during the migration. This article walks through the common concepts, usage and frequently encountered problems of Conan to help you understand it.

01. Common usage of Conan

Installation

Conan 2.0 was released in March 2023, but some third-party packages have not yet been fully migrated to it, so Milvus still uses Conan 1.58.0 and will try to upgrade to 2.0 in the future.

Conan is written in Python 3 and can be installed with pip:

pip install conan==1.58.0
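
Because Milvus pins a Conan 1.x release, a build script may want to confirm which client is installed before going further. The following is a minimal sketch, assuming that `conan --version` prints a line of the form `Conan version 1.58.0`. The helper name `is_conan_1x` is hypothetical and is not part of Milvus or Conan.

```python
import re


def is_conan_1x(version_output: str) -> bool:
    """Return True if the text printed by `conan --version` reports a 1.x client."""
    # Assumption: the output contains a dotted version such as "1.58.0".
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", version_output)
    if match is None:
        raise ValueError(f"no version number found in {version_output!r}")
    return int(match.group(1)) == 1
```

In a real script the input string would come from running `conan --version` in a subprocess; the function itself only inspects the text.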

How Conan is used in Milvus

When you run make, Milvus automatically calls Conan to download and install dependencies. The steps are as follows:

  • Run conan install in scripts/core_build.sh to download and compile dependencies:
case "${unameOut}" in
  Darwin*)
    conan install ${CPP_SRC_DIR} --install-folder conan --build=missing -s compiler=clang -s compiler.version=${llvm_version} -s compiler.libcxx=libc++ -s compiler.cppstd=17 || { echo 'conan install failed'; exit 1; }
    ;;
  Linux*)
    GCC_VERSION=`${CC} -dumpversion`
    # If gcc defaults to the old "gcc4" libstdc++ ABI, keep the profile's libcxx; otherwise request libstdc++11
    if [[ `${CC} -v 2>&1 | sed -n 's/.*\(--with-default-libstdcxx-abi\)=\(\w*\).*/\2/p'` == "gcc4" ]]; then
      conan install ${CPP_SRC_DIR} --install-folder conan --build=missing -s compiler.version=${GCC_VERSION} || { echo 'conan install failed'; exit 1; }
    else
      conan install ${CPP_SRC_DIR} --install-folder conan --build=missing -s compiler.version=${GCC_VERSION} -s compiler.libcxx=libstdc++11 || { echo 'conan install failed'; exit 1; }
    fi
    ;;
  *)
    echo "Cannot build on windows"
    ;;
esac
  • Conan generates the build configuration for the dependencies in the cmake_build/conan directory.

  • Include the generated configuration in core/CMakeLists.txt to use the third-party dependencies defined in Conan:

list( APPEND CMAKE_MODULE_PATH ${CMAKE_BINARY_DIR}/conan )
include( ${CMAKE_BINARY_DIR}/conan/conanbuildinfo.cmake )
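
The platform branching that scripts/core_build.sh performs around conan install can be restated in Python to make the decision logic easier to see. This is only a sketch: the function name `conan_install_args`, the `src_dir` placeholder (standing in for `${CPP_SRC_DIR}`) and the `old_gcc_abi` flag are assumptions introduced for illustration, and the function only builds the argument list without running anything.

```python
def conan_install_args(system, compiler_version, src_dir=".", old_gcc_abi=False):
    """Build the `conan install` argument list for one platform.

    system: platform name as reported by uname, e.g. "Darwin" or "Linux".
    compiler_version: clang version on macOS, gcc version on Linux.
    old_gcc_abi: True when gcc defaults to the pre-C++11 ("gcc4") libstdc++ ABI.
    """
    args = ["conan", "install", src_dir, "--install-folder", "conan", "--build=missing"]
    if system == "Darwin":
        args += [
            "-s", "compiler=clang",
            "-s", f"compiler.version={compiler_version}",
            "-s", "compiler.libcxx=libc++",
            "-s", "compiler.cppstd=17",
        ]
    elif system == "Linux":
        args += ["-s", f"compiler.version={compiler_version}"]
        if not old_gcc_abi:
            # Only request the C++11 ABI when the compiler defaults to it.
            args += ["-s", "compiler.libcxx=libstdc++11"]
    else:
        raise RuntimeError(f"unsupported platform: {system}")
    return args
```

For example, `conan_install_args("Linux", "11")` includes `compiler.libcxx=libstdc++11`, while the same call with `old_gcc_abi=True` leaves the libcxx setting to the profile.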

Conan profiles

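The profile is an important piece of Conan configuration. It determines the parameters Conan uses when compiling third-party dependencies, including the compiler version, the C++ standard, and so on.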

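Conan decides whether to compile a dependency based on profile + options. If Conan Center has a prebuilt binary for that profile + option combination, Conan downloads and uses it directly; otherwise it builds the dependency from source.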

The default profile is stored in ~/.conan/profiles/default, for example:

[settings]
os=Macos
os_build=Macos
arch=armv8
arch_build=armv8
compiler=clang
compiler.version=15
# libcxx selects the C++ standard library variant; variants differ in whether they support the C++11 ABI
compiler.libcxx=libc++
compiler.cppstd=17
build_type=Release
[options]
[build_requires]
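
To make the structure of a profile concrete, here is a small sketch that reads profile-style text into a dictionary of sections. It is a simplified reader written for this article, not Conan's own loader: the function name `parse_profile` is hypothetical, and features of real profiles beyond plain `key=value` lines are ignored.

```python
def parse_profile(text):
    """Parse profile-style text into {section: {key: value}}."""
    sections = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and full-line comments
        if line.startswith("[") and line.endswith("]"):
            current = sections.setdefault(line[1:-1], {})
        elif current is not None and "=" in line:
            key, value = line.split("=", 1)
            current[key.strip()] = value.strip()
    return sections


PROFILE = """\
[settings]
os=Macos
arch=armv8
compiler=clang
compiler.version=15
compiler.libcxx=libc++
compiler.cppstd=17
build_type=Release
[options]
[build_requires]
"""

profile = parse_profile(PROFILE)
print(profile["settings"]["compiler.cppstd"])
```

The `[settings]` section is what drives compiler-related choices, while the empty `[options]` and `[build_requires]` sections simply parse to empty dictionaries.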

Milvus's conanfile.py overrides arrow's default build options, so arrow is always rebuilt from source:

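from conans import ConanFile


class MilvusConan(ConanFile):

    settings = "os", "compiler", "build_type", "arch"
    requires = (
        "arrow/8.0.1",
    )
    generators = ("cmake", "cmake_find_package")
    # These arrow options differ from the recipe defaults, so Conan builds arrow from source
    default_options = {
        "arrow:with_zstd": True,
        "arrow:shared": False,
        "arrow:with_jemalloc": True,
    }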


Where are third-party packages installed?

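Taking arrow as an example, it is installed in the local Conan cache under ~/.conan/data/arrow. The hash in the package path is computed from profile + options, so changing the profile or an option generates a new package.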
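
The idea that a package hash is derived from settings and options can be illustrated with a short sketch. The code below is an illustration of the principle only: `sketch_package_id` is a hypothetical helper, and Conan's real package ID computation differs in its details.

```python
import hashlib


def sketch_package_id(settings, options):
    """Derive a stable hex digest from settings and options.

    Illustration only: this is not Conan's actual package ID algorithm.
    """
    lines = ["[settings]"]
    lines += [f"{key}={settings[key]}" for key in sorted(settings)]
    lines.append("[options]")
    lines += [f"{key}={options[key]}" for key in sorted(options)]
    return hashlib.sha1("\n".join(lines).encode("utf-8")).hexdigest()


settings = {"os": "Linux", "arch": "x86_64", "compiler": "gcc", "build_type": "Release"}
static_id = sketch_package_id(settings, {"shared": False, "with_zstd": True})
shared_id = sketch_package_id(settings, {"shared": True, "with_zstd": True})
print(static_id != shared_id)
```

Flipping a single option yields a different digest, which mirrors why a changed profile or option results in a separate package in the cache rather than overwriting the existing one.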

02. How to write conanfile.py

See internal/core/conanfile.py for reference:

from conans import ConanFile


class MilvusConan(ConanFile):

    settings = "os", "compiler", "build_type", "arch"
    # Search https://conan.io/center/ for the packages and versions you need
    requires = (
        "rocksdb/6.29.5",
        "boost/1.81.0",
        "onetbb/2021.7.0",
        "nlohmann_json/3.11.2",
        "zstd/1.5.5",
        # ...
    )
    generators = ("cmake", "cmake_find_package")
    default_options = {
        "rocksdb:shared": True,
        # ...
    }

    # Adjust the dependencies' build options dynamically based on settings
    def configure(self):
        if self.settings.os == "Macos":
            # SSE4.2 is x86-only, so drop the folly option on other architectures such as Apple M1
            if self.settings.arch not in ("x86_64", "x86"):
                del self.options["folly"].use_sse4_2

    # imports() copies the matching files into cmake_build/
    def imports(self):
        self.copy("*.dylib", "../lib", "lib")
        self.copy("*.dll", "../lib", "lib")
        self.copy("*.so*", "../lib", "lib")
        self.copy("*", "../bin", "bin")
        self.copy("*.proto", "../include", "include")
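
To see which files the imports() patterns above would pick up, the rules can be approximated with Python's fnmatch. This is a rough sketch under the assumption that each rule is a (pattern, destination, source folder) triple; `IMPORT_RULES` and `destinations` are hypothetical names, and Conan's own matching (for example its handling of nested folders) may behave differently.

```python
from fnmatch import fnmatch

# (pattern, destination, source folder) triples mirroring the imports() calls.
IMPORT_RULES = [
    ("*.dylib", "../lib", "lib"),
    ("*.dll", "../lib", "lib"),
    ("*.so*", "../lib", "lib"),
    ("*", "../bin", "bin"),
    ("*.proto", "../include", "include"),
]


def destinations(src_folder, filename):
    """Return the destination folders a file in src_folder would be copied to."""
    return [
        dst
        for pattern, dst, src in IMPORT_RULES
        if src == src_folder and fnmatch(filename, pattern)
    ]
```

Under these rules a versioned shared library such as `libarrow.so.800` in `lib` is copied to `../lib`, while a static archive such as `libarrow.a` matches nothing and stays in the Conan cache.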

03. How to write and publish a library's conanfile.py

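Compared with merely using Conan to manage dependencies, writing a conanfile.py for a library is much more complex. It must not only define the dependencies and offer users a range of build options, but also declare the various definitions of the exported package.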

See Knowhere's conanfile.py for reference:

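# The excerpt omits its imports; the ones below are assumed for Conan 1.x
import os

from conan.tools import files
from conan.tools.build import check_min_cppstd
from conan.tools.cmake import CMake, CMakeDeps, CMakeToolchain, cmake_layout
from conan.tools.microsoft import is_msvc, msvc_runtime_flag
from conan.tools.scm import Version
from conans import ConanFile, tools
from conans.errors import ConanInvalidConfiguration


class KnowhereConan(ConanFile):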
    name = "knowhere"
    description = "Knowhere is written in C++. It is an independent project that act as Milvus's internal core"
    topics = ("vector", "simd", "ann")
    url = "https://github.com/milvus-io/knowhere"
    homepage = "https://github.com/milvus-io/knowhere"
    license = "Apache-2.0"

    generators = "pkg_config"

    settings = "os", "arch", "compiler", "build_type"
    # Each option must be declared together with its default value
    options = {
        "shared": [True, False],
        "fPIC": [True, False],
        "with_raft": [True, False],
        "with_asan": [True, False],
        "with_diskann": [True, False],
        "with_profiler": [True, False],
        "with_ut": [True, False],
        "with_benchmark": [True, False],
    }
    default_options = {
        "shared": True,
        "fPIC": False,
        "with_raft": False,
        "with_asan": False,
        "with_diskann": False,
        "with_profiler": False,
        "with_ut": False,
        "glog:with_gflags": False,
        "prometheus-cpp:with_pull": False,
        "with_benchmark": False,
    }

    # Files included in the published source package
    exports_sources = (
        "src/*",
        "thirdparty/*",
        "tests/ut/*",
        "include/*",
        "CMakeLists.txt",
        "*.cmake",
        "conanfile.py",
    )

    @property
    def _minimum_cpp_standard(self):
        return 17

    @property
    def _minimum_compilers_version(self):
        return {
            "gcc": "8",
            "Visual Studio": "16",
            "clang": "6",
            "apple-clang": "10",
        }

    def config_options(self):
        if self.settings.os == "Windows":
            self.options.rm_safe("fPIC")

    def configure(self):
        if self.options.shared:
            self.options.rm_safe("fPIC")

    def requirements(self):
        self.requires("boost/1.81.0")
        self.requires("glog/0.6.0")
        self.requires("nlohmann_json/3.11.2")
        self.requires("openssl/1.1.1t")
        self.requires("prometheus-cpp/1.1.0")
        if self.options.with_ut:
            self.requires("catch2/3.3.1")
        if self.options.with_benchmark:
            self.requires("gtest/1.13.0")
            self.requires("hdf5/1.14.0")

    @property
    def _required_boost_components(self):
        return ["program_options"]

    def validate(self):
        if self.settings.compiler.get_safe("cppstd"):
            check_min_cppstd(self, self._minimum_cpp_standard)
        min_version = self._minimum_compilers_version.get(str(self.settings.compiler))
        if not min_version:
            self.output.warn(
                "{} recipe lacks information about the {} compiler support.".format(
                    self.name, self.settings.compiler
                )
            )
        else:
            if Version(self.settings.compiler.version) < min_version:
                raise ConanInvalidConfiguration(
                    "{} requires C++{} support. The current compiler {} {} does not support it.".format(
                        self.name,
                        self._minimum_cpp_standard,
                        self.settings.compiler,
                        self.settings.compiler.version,
                    )
                )

    def layout(self):
        cmake_layout(self)

    # Generates the key CMake toolchain file, the CMake dependency config files, and the CMake build variables
    def generate(self):
        tc = CMakeToolchain(self)
        tc.variables["CMAKE_POSITION_INDEPENDENT_CODE"] = self.options.get_safe(
            "fPIC", True
        )
        # Relocatable shared lib on Macos
        tc.cache_variables["CMAKE_POLICY_DEFAULT_CMP0042"] = "NEW"
        # Honor BUILD_SHARED_LIBS from conan_toolchain (see https://github.com/conan-io/conan/issues/11840)
        tc.cache_variables["CMAKE_POLICY_DEFAULT_CMP0077"] = "NEW"

        cxx_std_flag = tools.cppstd_flag(self.settings)
        cxx_std_value = (
            cxx_std_flag.split("=")[1]
            if cxx_std_flag
            else "c++{}".format(self._minimum_cpp_standard)
        )
        tc.variables["CXX_STD"] = cxx_std_value
        if is_msvc(self):
            tc.variables["MSVC_LANGUAGE_VERSION"] = cxx_std_value
            tc.variables["MSVC_ENABLE_ALL_WARNINGS"] = False
            tc.variables["MSVC_USE_STATIC_RUNTIME"] = "MT" in msvc_runtime_flag(self)
        tc.variables["WITH_ASAN"] = self.options.with_asan
        tc.variables["WITH_DISKANN"] = self.options.with_diskann
        tc.variables["WITH_RAFT"] = self.options.with_raft
        tc.variables["WITH_PROFILER"] = self.options.with_profiler
        tc.variables["WITH_UT"] = self.options.with_ut
        tc.variables["WITH_BENCHMARK"] = self.options.with_benchmark
        tc.generate()
        deps = CMakeDeps(self)
        deps.generate()

    def build(self):
        cmake = CMake(self)
        cmake.configure()
        cmake.build()

    def package(self):
        cmake = CMake(self)
        cmake.install()
        files.rmdir(self, os.path.join(self.package_folder, "lib", "cmake"))
        files.rmdir(self, os.path.join(self.package_folder, "lib", "pkgconfig"))

    def package_info(self):
        self.cpp_info.set_property("cmake_file_name", "knowhere")
        self.cpp_info.set_property("cmake_target_name", "Knowhere::knowhere")
        self.cpp_info.set_property("pkg_config_name", "libknowhere")

        self.cpp_info.components["libknowhere"].libs = ["knowhere"]

        self.cpp_info.components["libknowhere"].requires = [
            "boost::program_options",
            "glog::glog",
            "prometheus-cpp::core",
            "prometheus-cpp::push",
        ]

        self.cpp_info.filenames["cmake_find_package"] = "knowhere"
        self.cpp_info.filenames["cmake_find_package_multi"] = "knowhere"
        self.cpp_info.names["cmake_find_package"] = "Knowhere"
        self.cpp_info.names["cmake_find_package_multi"] = "Knowhere"
        self.cpp_info.names["pkg_config"] = "libknowhere"
        self.cpp_info.components["libknowhere"].names["cmake_find_package"] = "knowhere"
        self.cpp_info.components["libknowhere"].names[
            "cmake_find_package_multi"
        ] = "knowhere"

        self.cpp_info.components["libknowhere"].set_property(
            "cmake_target_name", "Knowhere::knowhere"
        )
        self.cpp_info.components["libknowhere"].set_property(
            "pkg_config_name", "libknowhere"
        )
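
The compiler check in validate() relies on comparing versions numerically rather than as strings, since the string "10" sorts before "8". The following standalone sketch shows that idea without any Conan imports. The names `version_tuple` and `meets_minimum` are hypothetical, and the table simply repeats the minimum versions listed in the recipe.

```python
MINIMUM_COMPILER_VERSIONS = {
    "gcc": "8",
    "Visual Studio": "16",
    "clang": "6",
    "apple-clang": "10",
}


def version_tuple(version):
    """Turn a purely numeric version such as "11.2" into (11, 2)."""
    return tuple(int(part) for part in version.split("."))


def meets_minimum(compiler, version):
    """Return True or False for known compilers, None when the table has no entry."""
    minimum = MINIMUM_COMPILER_VERSIONS.get(compiler)
    if minimum is None:
        return None  # the recipe only warns in this case
    return version_tuple(version) >= version_tuple(minimum)
```

With this comparison, gcc 10 passes the gcc 8 minimum, gcc 7.5 fails it, and an unlisted compiler yields `None`, which corresponds to the warning branch in the recipe.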

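In theory the original CMakeLists.txt does not need to change, but some third-party package names are inconsistent and require corresponding adjustments. Adding find_package(XXX REQUIRED) to CMakeLists.txt is enough to locate the package.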

How it works

Take building Knowhere as an example.

Run the following in the build directory. You can pass custom options, but they must be defined in conanfile.py.

conan install .. --build=missing -o with_ut=True -o with_asan=True -s build_type=Debug
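
The requirement that every `-o` option be declared in conanfile.py can be pictured as a lookup against the recipe's options table. The sketch below is a simplified model of that check, not Conan's own parser. `DECLARED_OPTIONS` copies a few entries from the Knowhere recipe and `parse_cli_options` is a hypothetical helper.

```python
DECLARED_OPTIONS = {
    "shared": [True, False],
    "with_asan": [True, False],
    "with_ut": [True, False],
    "with_benchmark": [True, False],
}


def parse_cli_options(args, declared):
    """Parse values passed to -o (e.g. "with_ut=True") and check them against the declared options."""
    parsed = {}
    for arg in args:
        name, sep, raw = arg.partition("=")
        if not sep:
            raise ValueError(f"expected name=value, got {arg!r}")
        if name not in declared:
            raise KeyError(f"option {name!r} is not declared in conanfile.py")
        # Map the textual booleans onto Python booleans; keep other values as text.
        value = {"True": True, "False": False}.get(raw, raw)
        if value not in declared[name]:
            raise ValueError(f"{raw!r} is not an allowed value for {name!r}")
        parsed[name] = value
    return parsed


print(parse_cli_options(["with_ut=True", "with_asan=True"], DECLARED_OPTIONS))
```

An option name that is missing from the table raises an error in this sketch, which is the situation the text warns about when a custom option has not been defined in the recipe.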

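The conan install command above downloads and builds the dependencies, and generates the important configuration files under build/Debug/generators. Then run the following to build the Knowhere project: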

conan build ..

The conan build command essentially runs cmake with some extra arguments, roughly equivalent to:

cmake -G "Unix Makefiles" -DCMAKE_TOOLCHAIN_FILE=./Debug/generators/conan_toolchain.cmake -DCMAKE_BUILD_TYPE="Debug" ..  
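
The way the cmake invocation depends on the build type can be sketched as a small command builder. This assumes, as in the example above, that the toolchain file sits under `<build_type>/generators` relative to the build directory. The function name `cmake_configure_command` is hypothetical, and the exact arguments Conan passes may differ.

```python
import shlex


def cmake_configure_command(build_type, source_dir="..", generator="Unix Makefiles"):
    """Assemble a cmake configure command similar to the one shown above."""
    toolchain = f"./{build_type}/generators/conan_toolchain.cmake"
    return [
        "cmake",
        "-G", generator,
        f"-DCMAKE_TOOLCHAIN_FILE={toolchain}",
        f"-DCMAKE_BUILD_TYPE={build_type}",
        source_dir,
    ]


# Print a shell-quoted version of the Debug configuration command.
print(shlex.join(cmake_configure_command("Debug")))
```

Switching `build_type` to `Release` changes both the toolchain path and `CMAKE_BUILD_TYPE`, which is why the two always have to be kept in step when configuring by hand.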

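Many editors and IDEs configure their environment automatically from CMakeLists.txt. After switching to Conan, many developers find that project configuration fails and the project cannot be used. In that case, modify the IDE's CMake settings and add the -DCMAKE_TOOLCHAIN_FILE=build/Debug/generators/conan_toolchain.cmake argument to complete the setup.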

How to write and test a new package

https://github.com/milvus-io/conanfiles contains several examples. Taking arrow as an example, run the following in the arrow/all directory:

conan create . arrow/12.0.0-dev1@milvus/dev --build=missing

If the build succeeds, the corresponding package is generated under ~/.conan/data/arrow.
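
A reference such as arrow/12.0.0-dev1@milvus/dev has four parts: name, version, user and channel. The sketch below splits a reference and maps it to a cache directory, assuming the default Conan 1.x layout of `~/.conan/data/<name>/<version>/<user>/<channel>` with `_` standing in when user and channel are absent. `PackageRef`, `parse_reference` and `cache_dir` are hypothetical names written for this article.

```python
from typing import NamedTuple, Optional


class PackageRef(NamedTuple):
    name: str
    version: str
    user: Optional[str]
    channel: Optional[str]


def parse_reference(ref):
    """Split "name/version[@user/channel]" into its components."""
    base, _, owner = ref.partition("@")
    name, sep, version = base.partition("/")
    if not sep or not name or not version:
        raise ValueError(f"expected name/version in {ref!r}")
    if not owner:
        return PackageRef(name, version, None, None)
    user, sep, channel = owner.partition("/")
    if not sep or not user or not channel:
        raise ValueError(f"expected user/channel in {ref!r}")
    return PackageRef(name, version, user, channel)


def cache_dir(ref, root="~/.conan/data"):
    """Return the assumed cache directory for a parsed reference."""
    user = ref.user or "_"
    channel = ref.channel or "_"
    return f"{root}/{ref.name}/{ref.version}/{user}/{channel}"


print(cache_dir(parse_reference("arrow/12.0.0-dev1@milvus/dev")))
```

Keeping user and channel in the path is what lets a privately built `arrow/12.0.0-dev1@milvus/dev` coexist with an arrow package of the same version pulled from a public remote.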

How to upload to a center

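Some libraries that Milvus depends on, such as Knowhere and velox, are either missing from https://conan.io/center/ or not available in a suitable version. In that case they need to be uploaded to a private center. Obtain the username and password, then run the following commands: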

conan user -p $password -r default-conan-local $user
conan upload arrow/12.0.0-dev1@milvus/dev -r default-conan-local

For how to set up a private center, see https://docs.conan.io/1/uploading_packages/remotes.html


Origin blog.csdn.net/weixin_44839084/article/details/130424773