Fetching the latest image version from a Docker Hub repository with Python

In some Docker repositories, the latest tag only points to the most recently uploaded image, not to the newest actual release of the program (gitlab/gitlab-ce is one example). To automatically pull the images that meet our requirements from a Docker repository, we wrote a script.

Design analysis

Getting an image that meets the requirements naturally breaks down into three steps:

Get all image tags
Filter the tags that meet the requirements
Pull the images

Of these, fetching all image tags only takes an API call, and pulling images can be done by running a local command. Filtering the tags requires configurable filters.
The implementation therefore uses two classes, a filter and a registry getter, plus two functions: one to load the configuration and one to run the pull.

Code

Start by designing the configuration file. To implement the functions above, the script needs the registry address and the filter conditions. The filter conditions cover several filtering methods. The design is as follows:

libraries:
  - gitlab:
      repo: "gitlab/gitlab-ce"
      base_host: "" # base registry URL; defaults to Docker Hub
      tag_filter: # filters
        exclude: # words that must not appear in the tag
          - "14."
          - "latest"
        include: # words the tag must contain (any one of them is enough)
          - "15.11"
          - "15.10"
        py_code: # lambdas that receive the tag as their (named) argument and must return true
          - "lambda remote_tag: list(map(int, '15.11.0-ce.0'.replace('-ce', '').split('.'))) < list(map(int, remote_tag.replace('-ce', '').split('.')))"
        regex: # the tag must match each regex, i.e. re.match(regex, tag) succeeds
          - ".*ce.*"

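  - openvas:
      repo: "immauss/openvas"
      base_host: "" # base registry URL; defaults to Docker Hub
      tag_filter: # filters
        exclude:
          - "" # words that must not appear in the tag
        include:
          - "latest" # words the tag must contain
        py_code:
          - "" # lambdas that must return true when given the tag as argument
        regex:
          - "" # regular expressions the tag must match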

A brief explanation of each filtering method:

exclude: strings that must not appear in the tag
include: strings of which at least one must appear in the tag
py_code: lambda functions that take the fetched image tag as their parameter. Returning true means the condition is met; if several lambdas are given, all of them must return true for the tag to pass.
regex: regular expressions; only tags that match every regular expression pass the filter
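
The four rules can be tried out offline on a handful of tags. The sketch below is only an illustration of the semantics described above, not the script's own code: the function name tag_passes and the sample tags are made up for the example, and py_code takes ready-made callables instead of the lambda source strings stored in the configuration.

```python
import re


def tag_passes(tag, exclude=(), include=(), py_code=(), regex=()):
    """Return True if `tag` satisfies all four filter rules."""
    # exclude: none of these substrings may appear in the tag
    if any(word in tag for word in exclude):
        return False
    # include: at least one of these substrings must appear (if any are given)
    if include and not any(word in tag for word in include):
        return False
    # regex: every pattern must match from the start of the tag (re.match)
    if not all(re.match(pattern, tag) for pattern in regex):
        return False
    # py_code: every predicate must return a truthy value
    return all(predicate(tag) for predicate in py_code)


# Hypothetical sample tags, chosen so that each rule rejects at least one
sample_tags = ["latest", "14.9.5-ce.0", "15.10.8-ce.0", "15.11.3-ce.0", "15.11.3-ee.0"]
kept = [
    tag
    for tag in sample_tags
    if tag_passes(
        tag,
        exclude=["14.", "latest"],
        include=["15.11", "15.10"],
        regex=[r".*ce.*"],
        py_code=[lambda t: t.startswith("15.11")],
    )
]
print(kept)  # ['15.11.3-ce.0']
```

An empty list imposes no condition, so a tag with no filters configured always passes.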

Filter

class Tag_Filter:
    # A single constructor with an optional argument: called without
    # arguments it builds an empty filter, otherwise it reads the four
    # lists from the `tag_filter` section of the configuration.
    def __init__(self, tag_filter: dict = None) -> None:
        tag_filter = tag_filter or {}

        # Words that must not appear in the tag (none of them may appear)
        self.exclude = self._non_empty(tag_filter.get("exclude"))

        # Words the tag must contain (any one of them is enough)
        self.include = self._non_empty(tag_filter.get("include"))

        # Python predicates (lambda source strings); the tag must satisfy all of them
        self.py_code = self._non_empty(tag_filter.get("py_code"))

        # Regular expressions; the tag must match all of them
        self.regex_filter = self._non_empty(tag_filter.get("regex"))

    @staticmethod
    def _non_empty(values) -> list:
        # Drop the empty strings that the configuration uses as placeholders;
        # a missing or empty key yields an empty list
        return [value for value in (values or []) if value != ""]

Registry getter

import re

import requests


class Image_Getter:
    # Build the getter from one library entry of the configuration (a dict)
    def __init__(self, config: dict) -> None:
        self.repo = config["repo"]

        # Make sure a custom host ends with "/" so the repo can be appended,
        # and fall back to the Docker Hub API when base_host is empty
        base_host = config.get("base_host") or ""
        if base_host and not base_host.endswith("/"):
            base_host += "/"
        self.base_host = base_host or "https://registry.hub.docker.com/v2/repositories/"

        self.tag_filter = Tag_Filter(config["tag_filter"])

The above is the declaration and the constructor.
Next, define a member function called get_remote_tags, which fetches the remote tags and filters them. The function first builds a URL and then uses the requests module to fetch the remote tags. The fetched tags are filtered by the regular expressions, and only matching tags are kept. The function then reads the include, exclude, and py_code lists. If no further conditions are configured, the regex-filtered tags are returned directly. Otherwise, the function filters the tags further and keeps only those that satisfy the include, exclude, and py_code conditions.

    def get_remote_tags(self) -> list:
        # Construct the URL to get the remote tags (first page only, up to 50 tags)
        url = f"{self.base_host}{self.repo}/tags?page_size=50"

        # Fetch the remote tags and keep those that match every regular expression
        response = requests.get(url, timeout=30)
        response.raise_for_status()
        tags = [
            result['name']
            for result in response.json()['results']
            if all(
                re.match(regex, str(result['name']))
                for regex in self.tag_filter.regex_filter
            )
        ]

        # Get the include, exclude and py_code lists
        include = self.tag_filter.include
        exclude = self.tag_filter.exclude
        py_codes = self.tag_filter.py_code

        # If no further conditions are configured, return the tags directly
        if not (include or exclude or py_codes):
            return tags

        # Filter the tags
        filtered_tags = []
        for tag in tags:
            if include and not any(i in tag for i in include):
                continue
            if exclude and any(e in tag for e in exclude):
                continue
            # eval() runs arbitrary code from the configuration file, so only
            # load configuration you trust. The tag is passed as an argument
            # rather than spliced into the evaluated string.
            if py_codes and not all(eval(py_code)(tag) for py_code in py_codes):
                continue
            filtered_tags.append(tag)
        return filtered_tags
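
The request in get_remote_tags asks for page_size=50, so a repository with more tags than that is only partly covered. The sketch below shows one possible way to walk through all pages. It assumes that each response carries a next field with the URL of the following page (empty on the last page), which should be checked against the registry in use. The names iter_tag_names and fetch_json are hypothetical, and the network is replaced by a dictionary of fake pages so the example runs offline.

```python
from typing import Callable, Iterator, Optional


def iter_tag_names(first_url: str, fetch_json: Callable[[str], dict]) -> Iterator[str]:
    """Yield tag names from every page, following the `next` links."""
    url: Optional[str] = first_url
    while url:
        page = fetch_json(url)
        for result in page.get("results", []):
            yield result["name"]
        # Assumption: `next` is None (or missing) on the last page
        url = page.get("next")


# Stand-in for the network: two fake pages keyed by their URL
fake_pages = {
    "page1": {"next": "page2", "results": [{"name": "15.11.3-ce.0"}]},
    "page2": {"next": None, "results": [{"name": "15.10.8-ce.0"}]},
}
all_names = list(iter_tag_names("page1", fake_pages.__getitem__))
print(all_names)  # ['15.11.3-ce.0', '15.10.8-ce.0']
```

With a real registry, fetch_json could be something like lambda url: requests.get(url, timeout=30).json().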

Loading the configuration

Loading the configuration is simple: just call the yaml library.

import yaml


def load_config(config_path: str = 'dockerhub.yaml') -> dict:
    # safe_load only builds plain Python objects, which is all this config needs
    with open(config_path, 'r', encoding='utf-8') as f:
        config = yaml.safe_load(f)
    return dict(config)

Pull

Define a function called pull_latest_image, which pulls the latest images from Docker Hub.
The function first loads the configuration file and then iterates over each repository listed in it. For each repository, it uses the Image_Getter class to get the list of images to pull.
It then iterates over that list and uses subprocess.run to execute the docker pull command for each image.
While pulling, the function prints debug information: the name of the image being pulled, the result of the command, and any error message.

import subprocess


def pull_latest_image(config_path: str = 'dockerhub.yaml') -> int:
    # Load the configuration file
    config = load_config(config_path)

    # Iterate over each library in the configuration file
    for library in config['libraries']:
        name = list(library.keys())[0]
        getter = Image_Getter(library[name])

        # Get the list of images to pull
        image_to_pull_list = [
            f"{getter.repo}:{tag}"
            for tag in getter.get_remote_tags()
        ]

        # Pull the images
        for image in image_to_pull_list:
            print(f"Pulling image: {image}")
            command = ["docker", "pull", image]
            result = subprocess.run(
                command,
                stdout=subprocess.PIPE,
                stderr=subprocess.PIPE
            )

            # Check the command result and print debug information;
            # stop at the first failure and return its exit code
            if result.returncode == 0:
                print(f"Successfully pulled image: {image}")
            else:
                print(
                    f"Failed to pull image: {image}. "
                    f"Error: {result.stderr.decode('utf-8')}")
                return result.returncode
    return 0
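
Before pulling anything, it can help to see which commands would be run. The following is a minimal dry-run sketch under stated assumptions: build_pull_commands and run_commands are hypothetical helpers that are not part of the script above, and the tags are sample values. Passing the command as a list to subprocess.run avoids going through a shell.

```python
import subprocess


def build_pull_commands(repo: str, tags: list) -> list:
    """Build one `docker pull` argument list per tag."""
    return [["docker", "pull", f"{repo}:{tag}"] for tag in tags]


def run_commands(commands: list, dry_run: bool = True) -> int:
    """Run the commands in order; return the first non-zero exit code, else 0."""
    for command in commands:
        if dry_run:
            # Only show what would be executed
            print("Would run:", " ".join(command))
            continue
        result = subprocess.run(command, capture_output=True, text=True)
        if result.returncode != 0:
            print(f"Failed: {result.stderr}")
            return result.returncode
    return 0


commands = build_pull_commands("gitlab/gitlab-ce", ["15.11.3-ce.0", "15.11.4-ce.0"])
exit_code = run_commands(commands)  # dry run: prints the commands, runs nothing
```

Calling run_commands(commands, dry_run=False) would execute the pulls and requires a working docker CLI.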

Summary

The code and configuration file above pull the gitlab-ce images newer than 15.11.0. The code requires the pyyaml and requests modules to be installed.
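
The "newer than 15.11.0" condition relies on Python comparing lists element by element, which is why the lambda in the configuration converts each tag into a list of integers first. The sketch below illustrates that idea with a hypothetical helper, version_key, and made-up sample tags; it assumes tags of the form major.minor.patch-ce.build.

```python
def version_key(tag: str) -> list:
    """Turn a tag such as '15.11.3-ce.0' into [15, 11, 3, 0]."""
    return [int(part) for part in tag.replace("-ce", "").split(".")]


baseline = version_key("15.11.0-ce.0")
results = {
    tag: version_key(tag) > baseline
    for tag in ["15.9.8-ce.0", "15.11.0-ce.0", "15.11.3-ce.0"]
}
print(results)
# {'15.9.8-ce.0': False, '15.11.0-ce.0': False, '15.11.3-ce.0': True}

# Comparing the raw strings would get 15.9 versus 15.11 wrong,
# because '9' sorts after '1' character by character
print("15.9.8-ce.0" > "15.11.0-ce.0")  # True
```

A tag that is not purely numeric, such as latest, makes int() raise ValueError, so such tags need to be excluded before this comparison runs.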

Note

The comments were improved with the help of New Bing.