Preface
This tutorial is only for readers whose domestic servers cannot reach gcr.io, Google's container registry. If your server environment can access gcr.io directly, you can skip it.
The aim is not only to solve this one problem but also to show an approach, so that when you run into a similar problem you know where to start troubleshooting. Of course, some experts can simply read through the source code; if that is you, feel free to ignore me.
I installed Kubeflow from the kubeflow manifests, version 1.6.1, the latest version at the time of writing.
Problem Description
I assume you have managed to install the official release of Kubeflow even though gcr.io is unreachable, so you already know that a large share of the container images on this platform come from gcr.io. As for how you ultimately solved the image problem, I am sure everyone has their own tricks (in any case, I trust you have a way to get images from the gcr.io registry).
After all sorts of hardships, I finally saw every pod in the Running state. Excited, I logged in to the web UI right away.
Oh? A demo pipeline is provided as well? Let's try it out:
After creating a run and starting it, I found: huh? Why is it stuck?
Click into the step to take a look:
This step is in Pending state with this message: ImagePullBackOff: Back-off pulling image "gcr.io/google-containers/busybox"
Okay, gcr.io is to blame again.
I have experience with this. I got all of Kubeflow installed; am I going to be stopped by one busybox image? The routine: get the image, load it onto the server, delete the pod, wait for it to initialize... A while later... What? Why is the error still there? Something strange is going on!
Click on the pod and take a look: anyone who works with K8S will understand. The pull policy is Always, so the pod never uses your local image and goes to the registry to pull the latest one every time. Fine, I will go to the server, run kubectl edit pod, and change it to IfNotPresent. Sorry! This pod is stubborn and will not accept the change. So... I cannot reach the image registry, I am not allowed to change the pull policy, and I cannot change the registry address on the pod either. What should I do?
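If you prefer the command line to clicking through the UI, something like the following should show the same information. It is only a sketch: show_pull_policy is a helper name I made up, and the pod name and namespace are placeholders for your own stuck pod. It lists regular containers only; init containers live under .spec.initContainers.

```shell
# Hypothetical helper (not part of Kubeflow): print name, image and
# imagePullPolicy for every regular container in one pod, tab-separated.
show_pull_policy() {
  local pod="$1" namespace="$2"
  kubectl get pod "$pod" -n "$namespace" \
    -o jsonpath='{range .spec.containers[*]}{.name}{"\t"}{.image}{"\t"}{.imagePullPolicy}{"\n"}{end}'
}

# Usage (both arguments are placeholders):
# show_pull_policy <pod-name> <namespace>
```

A container whose third column reads Always will contact the registry on every start, which matches the behavior described above.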
Solution
Since it is a pipeline, it must be created from something: a configuration file? A pipeline definition? As long as the image is configured somewhere, it will leave a trace. First, I looked through the source code of the demo pipeline; its address is in the demo description. After going over it again and again, I found no mention of gcr.io/google-containers/busybox in the source code or its related files. That means the image comes from Kubeflow's own configuration, not from the pipeline. So I went through the Kubeflow source next, and sure enough it is in the Kubeflow files: a ConfigMap resource named pipeline-install-config.
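The search itself needs nothing fancy. Assuming you have a local checkout of the source you want to inspect (the directory name below is a placeholder), a recursive grep for the image string is enough; find_image_refs is just an illustrative wrapper of my own.

```shell
# Illustrative helper: list every file under a directory that mentions
# the given image reference. -F treats the image name as a fixed string,
# so the dots in "gcr.io" are not interpreted as regex wildcards.
find_image_refs() {
  local image="$1" dir="$2"
  grep -rlF -- "$image" "$dir"
}

# Usage (the checkout path is a placeholder):
# find_image_refs "gcr.io/google-containers/busybox" ./manifests
```

Running it once against the pipeline source and once against the Kubeflow files is a quick way to tell which side owns the setting.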
Then we go to the server and look at this ConfigMap:
kubectl edit configmaps pipeline-install-config -n kubeflow
# Please edit the object below. Lines beginning with a '#' will be ignored,
# and an empty file will abort the edit. If an error occurs while saving this file will be
# reopened with the relevant failures.
#
apiVersion: v1
data:
  ConMaxLifeTime: 120s
  appName: pipeline
  appVersion: 2.0.0-alpha.5
  autoUpdatePipelineDefaultVersion: "true"
  bucketName: mlpipeline
  cacheDb: cachedb
  cacheImage: gcr.io/google-containers/busybox
  cacheNodeRestrictions: "false"
  cronScheduleTimezone: UTC
  dbHost: mysql
  dbPort: "3306"
  defaultPipelineRoot: ""
  mlmdDb: metadb
  pipelineDb: mlpipeline
  warning: |
    1. Do not use kubectl to edit this configmap, because some values are used
    during kustomize build. Instead, change the configmap and apply the entire
    kustomize manifests again.
    2. After updating the configmap, some deployments may need to be restarted
    until the changes take effect. A quick way to restart all deployments in a
    namespace: `kubectl rollout restart deployment -n <your-namespace>`.
kind: ConfigMap
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"v1","data":{"ConMaxLifeTime":"120s","appName":"pipeline","appVersion":"2.0.0-alpha.5","autoUpdatePipelineDefaultVersion":"true","bucketName":"mlpipeline","cacheDb":"cachedb","cacheImage":"gcr.io/google-containers/busybox","cacheNodeRestrictions":"false","cronScheduleTimezone":"UTC","dbHost":"mysql","dbPort":"3306","defaultPipelineRoot":"","mlmdDb":"metadb","pipelineDb":"mlpipeline","warning":"1. Do not use kubectl to edit this configmap, because some values are used\nduring kustomize build. Instead, change the configmap and apply the entire\nkustomize manifests again.\n2. After updating the configmap, some deployments may need to be restarted\nuntil the changes take effect. A quick way to restart all deployments in a\nnamespace: `kubectl rollout restart deployment -n \u003cyour-namespace\u003e`.\n"},"kind":"ConfigMap","metadata":{"annotations":{},"labels":{"app.kubernetes.io/component":"ml-pipeline","app.kubernetes.io/name":"kubeflow-pipelines","application-crd-id":"kubeflow-pipelines"},"name":"pipeline-install-config","namespace":"kubeflow"}}
  creationTimestamp: "2022-10-25T08:51:08Z"
  labels:
    app.kubernetes.io/component: ml-pipeline
    app.kubernetes.io/name: kubeflow-pipelines
    application-crd-id: kubeflow-pipelines
  name: pipeline-install-config
  namespace: kubeflow
  resourceVersion: "16139565"
  uid: 9ce56da0-48f1-497b-897d-876ffc974892
Now that we have found this, the rest is relatively simple. We already have the original image, gcr.io/google-containers/busybox. All that is needed is to tag this image, push it to our own image registry, and then change the cacheImage value in the ConfigMap to point at our own copy.
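As a rough sketch of these steps, assuming the image is already loaded into a local Docker daemon and that you are allowed to push to your registry. The registry address is a placeholder, and mirror_image and set_cache_image are names I made up. Note that the ConfigMap's own warning (shown above) recommends changing the value in the manifests and re-applying them rather than editing the live object; the kubectl patch here is simply the in-place equivalent of the kubectl edit used in this post.

```shell
# Hypothetical helper: re-tag a locally available image and push it to
# your own registry. Assumes the Docker CLI; other tools differ.
mirror_image() {
  local source_image="$1" target_image="$2"
  docker tag "$source_image" "$target_image" &&
    docker push "$target_image"
}

# Hypothetical helper: point cacheImage in pipeline-install-config at the
# mirrored image using a JSON merge patch.
set_cache_image() {
  local target_image="$1"
  kubectl patch configmap pipeline-install-config -n kubeflow \
    --type merge \
    -p "{\"data\":{\"cacheImage\":\"${target_image}\"}}"
}

# Usage (registry.example.com is a placeholder):
# mirror_image gcr.io/google-containers/busybox registry.example.com/google-containers/busybox
# set_cache_image registry.example.com/google-containers/busybox
```

If you go the manifests route instead, the same cacheImage key is the one to change before re-applying.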
Do you think the pipeline will run once that is done? No. You also need to check which deployments load this ConfigMap. After going through them, the list is:
ml-pipeline-scheduledworkflow
ml-pipeline
metadata-grpc-deployment
kubeflow-pipelines-profile-controller
cache-server
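One way to build such a list without opening every deployment by hand is to dump each deployment and look for the ConfigMap's name. A minimal sketch, with deployments_using_configmap being a helper name of my own; it is a plain text match, so it may also catch unrelated mentions of the name.

```shell
# Hypothetical helper: print every deployment in a namespace whose
# manifest mentions the given ConfigMap name anywhere (plain text match).
deployments_using_configmap() {
  local configmap="$1" namespace="$2" name
  # "-o name" yields entries such as deployment.apps/ml-pipeline
  for name in $(kubectl get deployments -n "$namespace" -o name); do
    if kubectl get "$name" -n "$namespace" -o yaml | grep -qF -- "$configmap"; then
      echo "$name"
    fi
  done
}

# Usage:
# deployments_using_configmap pipeline-install-config kubeflow
```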
Then open one of them and take a look: sure enough, the value is loaded as an env variable.
Readers familiar with ConfigMaps will know that when the contents of a ConfigMap are loaded as environment variables, the new values do not take effect until the pod is restarted.
Restart these pods one by one.
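The restart can be done with kubectl rollout restart, the same command the ConfigMap's warning mentions. A small sketch: restart_deployments is a made-up helper, the 180s timeout is an arbitrary choice, and the namespace is kubeflow because that is where the ConfigMap lives.

```shell
# Hypothetical helper: restart the named deployments one by one and wait
# for each rollout to finish before moving on to the next.
restart_deployments() {
  local namespace="$1" name
  shift
  for name in "$@"; do
    kubectl rollout restart "deployment/$name" -n "$namespace" || return 1
    kubectl rollout status "deployment/$name" -n "$namespace" --timeout=180s || return 1
  done
}

# Usage, with the five deployments listed above:
# restart_deployments kubeflow ml-pipeline-scheduledworkflow ml-pipeline \
#   metadata-grpc-deployment kubeflow-pipelines-profile-controller cache-server
```

Waiting on rollout status is optional, but it makes it obvious when a restarted pod fails to come back up.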
Okay, let's go back to the web interface, run the pipeline again, and take a look.