Deploying Kubeflow and Other Images to Multiple Cluster Nodes

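Placing Kubeflow (https://github.com/kubeflow/kubeflow), Kubernetes, and similar images on a local cluster for deployment or updates takes a series of steps. If many nodes in the cluster pull from an external registry at the same time, the concurrent network traffic is heavy, which is both slow and costly. I therefore split the work into two stages: the first stage downloads the images to local storage, and in the second stage each node gets its copy of the images from the local file system or a local registry.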

1. From gcr to local storage

This stage itself consists of two steps.

First, pull the images on a site that can reach gcr (https://www.katacoda.com). For example:


echo ""
echo "================================================================="
echo "pull kubeflow images for system from gcr.io and hub.docker.com..."
echo "This tool was created by openthings, NO WARRANTY. 2018.07.10."
echo "================================================================="

echo ""
echo "1. centraldashboard"
docker pull gcr.io/kubeflow-images-public/centraldashboard:v0.2.1

echo ""
echo "2. jupyterhub-k8s"
docker pull gcr.io/kubeflow/jupyterhub-k8s:v20180531-3bb991b1

echo ""
echo "3. tf_operator"
docker pull gcr.io/kubeflow-images-public/tf_operator:v0.2.0 

echo ""
echo "4. ambassador"
docker pull quay.io/datawire/ambassador:0.30.1

echo ""
echo "5. redis"
docker pull redis:4.0.1 

echo ""
echo "6. seldonio/cluster-manager"
docker pull seldonio/cluster-manager:0.1.6

echo ""
echo "Finished."
echo ""
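
The pull script above repeats the same command for each image. As a minimal sketch, the same step can be driven from a single list. The `DRY_RUN` switch and the `run` helper are hypothetical additions, not part of the original script; by default the commands are only printed.

```shell
#!/bin/sh
# Sketch: pull every image from one list instead of repeating the command.
# DRY_RUN is a hypothetical switch (not in the original script). It defaults
# to 1, which only prints the commands; set DRY_RUN=0 to really pull.
set -eu

# One image reference per line, copied from the script above.
IMAGES="
gcr.io/kubeflow-images-public/centraldashboard:v0.2.1
gcr.io/kubeflow/jupyterhub-k8s:v20180531-3bb991b1
gcr.io/kubeflow-images-public/tf_operator:v0.2.0
quay.io/datawire/ambassador:0.30.1
redis:4.0.1
seldonio/cluster-manager:0.1.6
"

run() {
  # Print the command in dry-run mode, execute it otherwise.
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

pull_all() {
  # Unquoted on purpose: word splitting yields one image reference per word.
  for image in $IMAGES; do
    run docker pull "$image"
  done
}

pull_all
```

Keeping the list in one place means that adding or upgrading an image is a one-line change.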

Then push the images to a registry inside China (for example Alibaba Cloud, http://registry.cn-hangzhou.aliyuncs.com). For example:


echo ""
echo "================================================================="
echo "Push kubeflow images for system to aliyun.com ..."
echo "This tool was created by openthings, NO WARRANTY. 2018.07.10."
echo "================================================================="

MY_REGISTRY=registry.cn-hangzhou.aliyuncs.com/openthings

echo ""
echo "1. centraldashboard"
docker tag gcr.io/kubeflow-images-public/centraldashboard:v0.2.1 ${MY_REGISTRY}/kubeflow-images-public-centraldashboard:v0.2.1
docker push ${MY_REGISTRY}/kubeflow-images-public-centraldashboard:v0.2.1

echo ""
echo "2. jupyterhub-k8s"
docker tag gcr.io/kubeflow/jupyterhub-k8s:v20180531-3bb991b1 ${MY_REGISTRY}/kubeflow-jupyterhub-k8s:v20180531-3bb991b1
docker push ${MY_REGISTRY}/kubeflow-jupyterhub-k8s:v20180531-3bb991b1

echo ""
echo "3. tf_operator"
docker tag gcr.io/kubeflow-images-public/tf_operator:v0.2.0 ${MY_REGISTRY}/kubeflow-images-public-tf_operator:v0.2.0
docker push ${MY_REGISTRY}/kubeflow-images-public-tf_operator:v0.2.0

echo ""
echo "4. ambassador"
docker tag quay.io/datawire/ambassador:0.30.1 ${MY_REGISTRY}/quay-io-datawire-ambassador:0.30.1
docker push ${MY_REGISTRY}/quay-io-datawire-ambassador:0.30.1

echo ""
echo "5. redis"
docker tag redis:4.0.1 ${MY_REGISTRY}/redis:4.0.1
docker push ${MY_REGISTRY}/redis:4.0.1

echo ""
echo "6. seldonio/cluster-manager"
docker tag seldonio/cluster-manager:0.1.6 ${MY_REGISTRY}/seldonio-cluster-manager:0.1.6
docker push ${MY_REGISTRY}/seldonio-cluster-manager:0.1.6

echo ""
echo "Finished."
echo ""
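
The push script above flattens each upstream image reference into a single repository name under `MY_REGISTRY`. Reading the tag commands, the rule appears to be: drop a leading `gcr.io/`, turn a leading `quay.io/` into `quay-io/`, then replace every remaining `/` with `-`. That rule is my inference from the script, not a convention the author states. A sketch with hypothetical helpers `mirror_name` and `push_to_mirror` (dry-run by default):

```shell
#!/bin/sh
# Sketch: derive the flattened mirror name from an upstream image reference.
# The naming rule is inferred from the tag commands above (an assumption).
set -eu

MY_REGISTRY=registry.cn-hangzhou.aliyuncs.com/openthings

run() {
  # DRY_RUN is a hypothetical switch; 1 (default) prints, 0 executes.
  if [ "${DRY_RUN:-1}" = "1" ]; then echo "+ $*"; else "$@"; fi
}

mirror_name() {
  case "$1" in
    gcr.io/*)  name="${1#gcr.io/}" ;;           # gcr.io/ prefix is dropped
    quay.io/*) name="quay-io/${1#quay.io/}" ;;  # quay.io becomes quay-io
    *)         name="$1" ;;                     # Docker Hub names stay as-is
  esac
  # Flatten the remaining path separators into dashes.
  printf '%s\n' "$name" | tr '/' '-'
}

push_to_mirror() {
  target="${MY_REGISTRY}/$(mirror_name "$1")"
  run docker tag "$1" "$target"
  run docker push "$target"
}

push_to_mirror "gcr.io/kubeflow-images-public/centraldashboard:v0.2.1"
push_to_mirror "quay.io/datawire/ambassador:0.30.1"
```

With a helper like this, the push script and the later pull script can share one image list instead of spelling out both names for every image.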

After that, the images can be pulled from Alibaba Cloud to the local machine and retagged with their original names:


echo ""
echo "================================================================="
echo "Pull kubeflow images for system from aliyun.com ..."
echo "This tool was created by openthings, NO WARRANTY. 2018.11.28."
echo "================================================================="

MY_REGISTRY=registry.cn-hangzhou.aliyuncs.com/openthings

echo ""
echo "1. centraldashboard"
docker pull ${MY_REGISTRY}/kubeflow-images-public-centraldashboard:v0.2.1
docker tag ${MY_REGISTRY}/kubeflow-images-public-centraldashboard:v0.2.1 gcr.io/kubeflow-images-public/centraldashboard:v0.2.1 

echo ""
echo "2. jupyterhub-k8s"
docker pull ${MY_REGISTRY}/kubeflow-jupyterhub-k8s:v20180531-3bb991b1
docker tag ${MY_REGISTRY}/kubeflow-jupyterhub-k8s:v20180531-3bb991b1 gcr.io/kubeflow/jupyterhub-k8s:v20180531-3bb991b1

echo ""
echo "3. tf_operator"
docker pull ${MY_REGISTRY}/kubeflow-images-public-tf_operator:v0.2.0
docker tag ${MY_REGISTRY}/kubeflow-images-public-tf_operator:v0.2.0 gcr.io/kubeflow-images-public/tf_operator:v0.2.0

echo ""
echo "4. ambassador"
docker pull ${MY_REGISTRY}/quay-io-datawire-ambassador:0.30.1
docker tag ${MY_REGISTRY}/quay-io-datawire-ambassador:0.30.1 quay.io/datawire/ambassador:0.30.1

echo ""
echo "5. redis"
docker pull ${MY_REGISTRY}/redis:4.0.1
docker tag ${MY_REGISTRY}/redis:4.0.1 redis:4.0.1

echo ""
echo "6. seldonio/cluster-manager"
docker pull ${MY_REGISTRY}/seldonio-cluster-manager:0.1.6
docker tag ${MY_REGISTRY}/seldonio-cluster-manager:0.1.6 seldonio/cluster-manager:0.1.6

echo ""
echo "Finished."
echo ""

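Once the images have been pulled from Alibaba Cloud, they can be pushed to a local registry (such as Harbor) or packaged as *.tar files.

2. From local storage to cluster deployment

To install from a local Harbor, pull the images and rename them with docker tag; after that they are ready to use. The procedure mirrors the Alibaba Cloud download script above.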
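
As a rough sketch of the Harbor route, the loop below pulls each image from a local Harbor project and retags it with its original name. The Harbor address `harbor.example.local/kubeflow` is a placeholder, and the assumption that Harbor holds the images under the same flattened names as the Alibaba Cloud mirror is mine; adjust both to your setup.

```shell
#!/bin/sh
# Sketch: pull from a local Harbor and restore the original image names.
# HARBOR is a placeholder address (assumption), not a real registry.
set -eu

HARBOR="${HARBOR:-harbor.example.local/kubeflow}"

run() {
  # DRY_RUN is a hypothetical switch; 1 (default) prints, 0 executes.
  if [ "${DRY_RUN:-1}" = "1" ]; then echo "+ $*"; else "$@"; fi
}

restore_all() {
  # Each line: <name on the mirror> <original name>
  while read -r mirror original; do
    [ -n "$mirror" ] || continue
    run docker pull "${HARBOR}/${mirror}"
    run docker tag "${HARBOR}/${mirror}" "$original"
  done <<'EOF'
kubeflow-images-public-centraldashboard:v0.2.1 gcr.io/kubeflow-images-public/centraldashboard:v0.2.1
kubeflow-jupyterhub-k8s:v20180531-3bb991b1 gcr.io/kubeflow/jupyterhub-k8s:v20180531-3bb991b1
kubeflow-images-public-tf_operator:v0.2.0 gcr.io/kubeflow-images-public/tf_operator:v0.2.0
quay-io-datawire-ambassador:0.30.1 quay.io/datawire/ambassador:0.30.1
redis:4.0.1 redis:4.0.1
seldonio-cluster-manager:0.1.6 seldonio/cluster-manager:0.1.6
EOF
}

restore_all
```

Retagging matters because the Kubeflow manifests reference the original names, so a node only skips the external pull if an image with exactly that name is already present locally.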

To package the images as *.tar files, see:

echo "==================================================================="
echo "Save Kubeflow images to tar."
echo "This tool created by https://my.oschina.net/u/2306127"
echo "Please visit https://github.com/openthings/kubernetes-tools"

echo "###################################################################"
echo "Kubeflow 0.3.3 ML system images."
echo "-------------------------------------------------------------------"

echo "A1.>> centraldashboard"
docker save gcr.io/kubeflow-images-public/centraldashboard:v0.2.1 -o A1-kubeflow-centraldashboard-v0.2.1.tar
echo ""

echo "A2.>> jupyterhub-k8s"
docker save gcr.io/kubeflow/jupyterhub-k8s:v20180531-3bb991b1 -o A2-kubeflow-jupyterhub-k8s-v20180531-3bb991b1.tar
echo ""

echo "A3.>> tf_operator"
docker save gcr.io/kubeflow-images-public/tf_operator:v0.2.0 -o A3-kubeflow-tf_operator-v0.2.0.tar
echo ""

echo "A4.>> ambassador"
docker save quay.io/datawire/ambassador:0.30.1 -o A4-kubeflow-ambassador-0.30.1.tar
echo ""

echo "A5.>> redis"
docker save redis:4.0.1  -o A5-kubeflow-redis-4.0.1.tar
echo ""

echo "A6.>> seldonio/cluster-manager"
docker save seldonio/cluster-manager:0.1.6 -o A6-kubeflow-seldonio-cluster-manager-0.1.6.tar
echo ""

echo "=================================================================="
echo "Kubeflow worker engine images......"
echo "B1.>> Tensorflow notebook CPU"
docker save gcr.io/kubeflow-images-public/tensorflow-1.12.0-notebook-cpu:v-base-76107ff-897 -o A6-kubeflow-tensorflow-1.12.0-notebook-cpu-v-base-76107ff-897.tar
echo ""

echo "B2.>> Tensorflow notebook GPU"
docker save gcr.io/kubeflow-images-public/tensorflow-1.12.0-notebook-gpu:v-base-76107ff-897 -o A6-kubeflow-tensorflow-1.12.0-notebook-gpu-v-base-76107ff-897.tar
echo ""

echo "==================================================================="
echo "Save Kubeflow images Finished."
echo "This tool created by https://my.oschina.net/u/2306127"
echo "Please visit https://github.com/openthings/kubernetes-tools"
echo "==================================================================="
echo ""
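
The article does not show the command that bundles the saved tar files. Judging from the later ansible command, the archive seems to unpack into a `kf-images-0.3.3` directory that also contains `kf-images-load.sh`; the sketch below assumes that layout. `pack_bundle` and `DRY_RUN` are hypothetical names, and the `zip` command is only printed by default.

```shell
#!/bin/sh
# Sketch: bundle the saved *.tar files and the load script into one zip.
# Assumed layout (not stated in the article):
#   kf-images-0.3.3/*.tar
#   kf-images-0.3.3/kf-images-load.sh
set -eu

BUNDLE="${BUNDLE:-kf-images-0.3.3}"

run() {
  # DRY_RUN is a hypothetical switch; 1 (default) prints, 0 executes.
  if [ "${DRY_RUN:-1}" = "1" ]; then echo "+ $*"; else "$@"; fi
}

pack_bundle() {
  # Refuse to build an archive that the nodes could not use.
  set -- "${BUNDLE}"/*.tar
  [ -e "$1" ] || { echo "no *.tar files in ${BUNDLE}/" >&2; return 1; }
  [ -f "${BUNDLE}/kf-images-load.sh" ] || {
    echo "missing ${BUNDLE}/kf-images-load.sh" >&2
    return 1
  }
  run chmod +x "${BUNDLE}/kf-images-load.sh"
  run zip -r "${BUNDLE}.zip" "${BUNDLE}"
}

# Run from the parent directory of the bundle; skipped if it does not exist.
if [ -d "$BUNDLE" ]; then pack_bundle; fi
```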

Compress all the images into a single zip archive, then upload it to the worker nodes:

echo "Uploading 10.1.1.202"
sshpass -p xxxx scp kf-images-0.3.3.zip supermap@10.1.1.202:/home/supermap/

echo "Uploading 10.1.1.203"
sshpass -p xxxx scp kf-images-0.3.3.zip supermap@10.1.1.203:/home/supermap/

echo "Uploading 10.1.1.142"
sshpass -p xxxx scp kf-images-0.3.3.zip supermap@10.1.1.142:/home/supermap/

echo "Uploading 10.1.1.193"
sshpass -p xxxx scp kf-images-0.3.3.zip supermap@10.1.1.193:/home/supermap/

echo "Uploading 10.1.1.234"
sshpass -p xxxx scp kf-images-0.3.3.zip supermap@10.1.1.234:/home/supermap/

echo "Uploading 10.1.1.205"
sshpass -p xxxx scp kf-images-0.3.3.zip supermap@10.1.1.205:/home/supermap/

echo "Uploading 10.1.1.112"
sshpass -p xxxx scp kf-images-0.3.3.zip supermap@10.1.1.112:/home/supermap/

echo "Upload kf-images-0.3.3.zip Finished."
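
The upload can also be written as a loop over the node list. This sketch makes two assumptions: the remote user is `supermap` (inferred from the `/home/supermap/` target directory), and the password is supplied through the `SSHPASS` environment variable, which `sshpass -e` reads, rather than on the command line. `DRY_RUN` and `upload_all` are hypothetical names.

```shell
#!/bin/sh
# Sketch: upload the archive to every worker node in a loop.
# REMOTE_USER=supermap is an assumption based on the target directory.
set -eu

NODES="10.1.1.202 10.1.1.203 10.1.1.142 10.1.1.193 10.1.1.234 10.1.1.205 10.1.1.112"
REMOTE_USER="${REMOTE_USER:-supermap}"
ARCHIVE="${ARCHIVE:-kf-images-0.3.3.zip}"

run() {
  # DRY_RUN is a hypothetical switch; 1 (default) prints, 0 executes.
  if [ "${DRY_RUN:-1}" = "1" ]; then echo "+ $*"; else "$@"; fi
}

upload_all() {
  for node in $NODES; do
    echo "Uploading ${node}"
    # sshpass -e takes the password from $SSHPASS, keeping it off the
    # command line.
    run sshpass -e scp "$ARCHIVE" "${REMOTE_USER}@${node}:/home/${REMOTE_USER}/"
  done
  echo "Upload ${ARCHIVE} Finished."
}

upload_all
```

For a real run, something like `SSHPASS=... DRY_RUN=0 sh upload.sh` would be used, where `upload.sh` is whatever file name the sketch is saved under.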

Then, on each node, load the archives back into Docker under their original image names, as follows:

echo "==================================================================="
echo "Load Kubeflow images from tar."
echo "This tool created by https://my.oschina.net/u/2306127"
echo "Please visit https://github.com/openthings/kubernetes-tools"

echo "###################################################################"
echo "Kubeflow 0.3.3 ML system images."
echo "-------------------------------------------------------------------"

echo "A1<< centraldashboard"
sudo docker load -i A1-kubeflow-centraldashboard-v0.2.1.tar
echo ""

echo "A2<< jupyterhub-k8s"
sudo docker load -i A2-kubeflow-jupyterhub-k8s-v20180531-3bb991b1.tar
echo ""

echo "A3<< tf_operator"
sudo docker load -i A3-kubeflow-tf_operator-v0.2.0.tar
echo ""

echo "A4<< ambassador"
sudo docker load -i A4-kubeflow-ambassador-0.30.1.tar
echo ""

echo "A5<< redis"
sudo docker load -i A5-kubeflow-redis-4.0.1.tar
echo ""

echo "A6<< seldonio/cluster-manager"
sudo docker load -i A6-kubeflow-seldonio-cluster-manager-0.1.6.tar
echo ""

echo "=================================================================="
echo "Kubeflow worker engine images......"
echo "B1<< Tensorflow notebook CPU"
sudo docker load -i A6-kubeflow-tensorflow-1.12.0-notebook-cpu-v-base-76107ff-897.tar
echo ""

echo "B2<< Tensorflow notebook GPU"
sudo docker load -i A6-kubeflow-tensorflow-1.12.0-notebook-gpu-v-base-76107ff-897.tar
echo ""

echo "==================================================================="
echo "Load Kubeflow images Finished."
echo "This tool created by https://my.oschina.net/u/2306127"
echo "Please visit https://github.com/openthings/kubernetes-tools"
echo "==================================================================="
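
Because `docker save` stores the image name and tag inside each archive, `docker load` restores the original names without any extra tagging. That makes it possible to load every tar file in a directory with one loop, as in this sketch (`load_all` and `DRY_RUN` are hypothetical names; commands are only printed by default):

```shell
#!/bin/sh
# Sketch: load every *.tar archive found in a directory.
set -eu

run() {
  # DRY_RUN is a hypothetical switch; 1 (default) prints, 0 executes.
  if [ "${DRY_RUN:-1}" = "1" ]; then echo "+ $*"; else "$@"; fi
}

load_all() {
  # $1: directory holding the *.tar files (defaults to the current one).
  dir="${1:-.}"
  for tarfile in "$dir"/*.tar; do
    [ -e "$tarfile" ] || continue   # the glob matched nothing
    run sudo docker load -i "$tarfile"
  done
}

load_all .
```

A loop like this does not need to change when images are added to or removed from the bundle.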

Run the script above on every node. Alternatively, use ansible to run it remotely on all nodes in one batch.

ansible all -i hosts_ansible -m shell -a "unzip -u /home/supermap/kf-images-0.3.3.zip && cd /home/supermap/kf-images-0.3.3 && ./kf-images-load.sh" --ask-become-pass --become --become-method=sudo

Here hosts_ansible is the ansible hosts inventory file (see the article "Ansible Quick Start: Commanding the Cluster", Ansible快速开始-指挥集群).
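
The article does not show the contents of hosts_ansible. As a guess at what a minimal INI-style inventory might look like, the sketch below writes one that lists the seven node addresses from the upload script. The group name `kubeflow_nodes`, the `ansible_user=supermap` setting, and the `write_inventory` helper are all assumptions.

```shell
#!/bin/sh
# Sketch: generate a minimal ansible inventory for the worker nodes.
# Group name and ansible_user are assumptions, not taken from the article.
set -eu

write_inventory() {
  # $1: path of the inventory file to create
  cat > "$1" <<'EOF'
[kubeflow_nodes]
10.1.1.202
10.1.1.203
10.1.1.142
10.1.1.193
10.1.1.234
10.1.1.205
10.1.1.112

[kubeflow_nodes:vars]
ansible_user=supermap
EOF
}
```

Calling `write_inventory hosts_ansible` creates the file in the current directory, after which the `ansible all -i hosts_ansible ...` command above would target all seven nodes.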

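The same process also applies to downloading and updating the images of Kubernetes itself.

Further reading: checking whether an image has a newer version.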


Reposted from my.oschina.net/u/2306127/blog/2962449