PDM Series Catalog
1. The epoch-making Python package manager - PDM tutorial: introduction
2. The epoch-making Python package manager - PDM tutorial: principles
3. The epoch-making Python package manager - PDM local & global configuration
4. The epoch-making Python package manager - PDM local & global project
5. The epoch-making Python package manager - PDM caching mechanism
6. A reader asks: how do you make PyCharm support PDM?
PDM adopts the local package directory from PEP 582, and many people ask: if each project keeps its packages under its own project directory, how is that different from a venv virtual environment?
Many people do not have a deep understanding of virtual environments or PEP 582, so the question is a natural one.
The first difference is that a virtual environment carries its own Python interpreter (a copy or a link), while PEP 582 does not add a new interpreter, so PEP 582 is more lightweight.
The second difference is today's main topic: PDM's caching mechanism.
If multiple PDM projects depend on the same version of the same Python package, then normally each project saves its own copy in its __pypackages__ directory.
This causes two problems:
- Wasted disk space
- Slow installation
You may think that disk is the cheapest hardware these days and a little waste does not matter, but some Python projects have more dependencies than you would imagine. OpenStack, for example, is one of the largest Python projects and has thousands of dependencies. Even if you do not mind the disk usage, your time is surely precious.
Every time you create a new PDM project you have to reinstall all of those dependencies, which can take a very long time. That is when you appreciate the importance of caching.
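1. Enable the cache
PDM disables the cache by default. If you need it, enable it with the following command:
$ pdm config install.cache on
There are three cache-related configuration items:
- install.cache: whether to enable the cache
- install.cache_method: how installed packages are linked to the cache
- cache_dir: the directory where the cache is stored
Unless you have special requirements, you can leave cache_dir alone and use the default directory, which on my macOS machine is:
/Users/iswbm/Library/Caches/pdm
The one that is harder to understand, and worth explaining, is install.cache_method. It has two possible values:
- symlink: link to the cache through symbolic links
- pth: link to the cache through .pth files
The difference between them is explained in detail later, so read on.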
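2. A simple example
Here is a simple example to show how the cache works.
First I create two PDM projects:
# Initialize the first PDM project
mkdir pdm-demo1 && cd pdm-demo1
pdm init
# Go back up, then initialize the second PDM project
cd ..
mkdir pdm-demo2 && cd pdm-demo2
pdm init
In pdm-demo1, install the typer package:
pdm add typer
Then enter the Python interactive interpreter (for example with pdm run python), import typer, and check the path it was imported from.
You can see that it is stored in the directory configured by cache_dir.
Then go to pdm-demo2 and install typer there as well:
pdm add typer
Again enter the Python interactive interpreter, import typer, and check the path it was imported from.
You can see that the imported typer has the same path as in pdm-demo1. The two projects share a single typer package, which avoids installing the same version of the same package twice.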
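The interactive check described above can also be scripted. The following is only a sketch: module_location is a hypothetical helper name, and it is demonstrated on the standard library's json module so that it runs anywhere. Inside a PDM project you would pass "typer" instead.

```python
import importlib.util
import os


def module_location(name: str) -> str:
    """Return the real on-disk path a module would be imported from."""
    spec = importlib.util.find_spec(name)
    if spec is None or spec.origin is None:
        raise ModuleNotFoundError(f"cannot locate module {name!r}")
    # realpath resolves symbolic links, so a package that is linked from
    # a cache is reported at its location inside the cache directory.
    return os.path.realpath(spec.origin)


if __name__ == "__main__":
    print(module_location("json"))
```

Running the same helper in both projects and comparing the two results is a quick way to confirm that they resolve to one shared copy.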
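3. How the cache works
The principle behind the cache is not hard to understand, and it differs depending on the value of install.cache_method.
cache_method=symlink
symlink is the default linking method and also the easiest one to understand.
After you install typer, you can see in the local package directory that typer is a symbolic link pointing to the typer package in the cache directory.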
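To make the idea concrete, here is a minimal sketch of symlink-based sharing, assuming a POSIX system where creating symbolic links needs no special privileges. It is not PDM's implementation: the directory layout, the fakepkg package, and the link_from_cache helper are all invented for illustration.

```python
import os
import tempfile
from pathlib import Path


def link_from_cache(cache_pkg: Path, project_lib: Path) -> Path:
    """Expose a cached package inside a project through a symbolic link."""
    project_lib.mkdir(parents=True, exist_ok=True)
    link = project_lib / cache_pkg.name
    os.symlink(cache_pkg, link, target_is_directory=True)
    return link


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    # One real copy of the package lives in the simulated cache.
    cached = root / "cache" / "fakepkg"
    cached.mkdir(parents=True)
    (cached / "__init__.py").write_text("VERSION = '1.0'\n")

    # Two projects each get a link instead of their own copy.
    link1 = link_from_cache(cached, root / "demo1" / "lib")
    link2 = link_from_cache(cached, root / "demo2" / "lib")

    both_links = link1.is_symlink() and link2.is_symlink()
    same_target = link1.resolve() == link2.resolve() == cached.resolve()
    print(both_links, same_target)
```

Both links resolve to the single directory in the simulated cache, which is why a shared package costs disk space only once.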
cache_method=pth
Many people are probably unfamiliar with how .pth files are used and how they work, so here is a brief explanation.
When Python scans its site directories at startup (this is done by the site module) and finds a .pth file, it adds each path recorded in that file to sys.path, so libraries in the directories listed in the .pth file can also be found by the Python runtime.
Back to PDM: in cache_method=pth mode, every time you install a package, PDM generates a .pth file that records the lib directory of the cached package.
In this way, when Python looks for packages in the __pypackages__ directory and finds a .pth file, it adds the path recorded in that file to sys.path.
In the example above, if you look at the __pypackages__ directory you will find many aaa_xxx.pth files, each containing the lib directory of the corresponding package in the cache directory.
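The .pth mechanism itself can be tried out with the standard library alone. The sketch below is an illustration rather than PDM's code: the directory layout and the fakepth_pkg module are made up, and site.addsitedir is used to make Python treat a temporary directory as a site directory and process the .pth file inside it.

```python
import importlib
import site
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    # The simulated cache holds the real module file.
    cache_lib = root / "cache" / "fakepth_pkg-1.0" / "lib"
    cache_lib.mkdir(parents=True)
    (cache_lib / "fakepth_pkg.py").write_text("ANSWER = 42\n")

    # The project directory holds only a .pth file naming that lib directory.
    project_lib = root / "project" / "lib"
    project_lib.mkdir(parents=True)
    (project_lib / "aaa_fakepth_pkg.pth").write_text(f"{cache_lib}\n")

    # addsitedir adds project_lib to sys.path and processes its .pth files,
    # which appends the cache lib directory to sys.path as well.
    site.addsitedir(str(project_lib))
    cache_on_path = str(cache_lib) in sys.path

    # The module is importable although it is not stored in the project.
    module = importlib.import_module("fakepth_pkg")
    answer = module.ANSWER
    print(cache_on_path, answer)
```

Nothing is copied or linked here: the project directory contains one small text file, and the import is satisfied from the cache directory that the file points to.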
4. Cache management
PDM provides the following commands for managing the cache:
- pdm cache clear: clear all caches
- pdm cache info: show cache information
- pdm cache remove <pattern>: remove the cached files that match the pattern
- pdm cache list: list all wheel files in the cache
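Before clearing anything, it can be useful to see how much a directory such as cache_dir actually holds. The helper below is a rough sketch and not a reimplementation of pdm cache info: summarize_dir is a hypothetical name, and it is shown on a temporary directory with made-up file names rather than on a real cache.

```python
import tempfile
from pathlib import Path


def summarize_dir(directory: Path) -> tuple[int, int]:
    """Return (file count, total size in bytes) for everything under directory."""
    files = [p for p in directory.rglob("*") if p.is_file()]
    return len(files), sum(p.stat().st_size for p in files)


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    # Two placeholder files standing in for cached artifacts.
    (root / "wheels").mkdir()
    (root / "wheels" / "a.whl").write_bytes(b"x" * 10)
    (root / "wheels" / "b.whl").write_bytes(b"y" * 5)

    count, size = summarize_dir(root)
    print(count, size)
```

Pointing the helper at the path stored in cache_dir gives a quick estimate of how much space the shared cache is using.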