My AI Journey (35): Using the TensorFlow and PyTorch Docker Images

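Before pulling an image from a remote Docker registry, it is best to check its version tags (TAG) to confirm that the version is the one you want. To look up the tags, first open this page: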

https://hub.docker.com/r/library/

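Then type a name into the search box at the top left to find the image you want, for example tensorflow:

From the results, open the image's home page, click the Tags tab, find the version you want, and click the copy button on its right to copy the complete docker pull command:

Running that command downloads the image into the local image store. For example, I chose the TensorFlow 2.0 GPU build for Python 3 with Jupyter: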

#####################################################################
# Using the TensorFlow 2.0.0-gpu-py3-jupyter image
#####################################################################

docker pull tensorflow/tensorflow:2.0.0-gpu-py3-jupyter

...
Digest: sha256:613cdca993785f7c41c744942871fc5358bc0110f6f5cb5b00a4b459356d55e4
Status: Downloaded newer image for tensorflow/tensorflow:2.0.0-gpu-py3-jupyter
docker.io/tensorflow/tensorflow:2.0.0-gpu-py3-jupyter

# Create and run the container. An option such as --env NVIDIA_VISIBLE_DEVICES=0,1 can also be used to specify which GPUs are visible

docker run --runtime=nvidia -d -it -p 8888:8888 tensorflow/tensorflow:2.0.0-gpu-py3-jupyter bash
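
The GPU option mentioned above can be combined with the run command as in the following minimal sketch. The helper name tf_run_cmd and its two parameters are hypothetical (they are not from the original notes); the function only prints the command so that it can be reviewed before being executed.

```shell
# Hypothetical helper (not part of the original notes): print, rather than
# execute, the docker run command for a chosen GPU list and host port.
tf_run_cmd() {
  gpus="${1:-all}"        # e.g. "0,1"; "all" exposes every GPU to the container
  host_port="${2:-8888}"  # host port mapped to Jupyter's port 8888 in the container
  echo "docker run --runtime=nvidia -d -it --env NVIDIA_VISIBLE_DEVICES=${gpus} -p ${host_port}:8888 tensorflow/tensorflow:2.0.0-gpu-py3-jupyter bash"
}

# Show the command that would expose only GPUs 0 and 1.
tf_run_cmd 0,1
```

Once the printed command looks right, it can be run with eval "$(tf_run_cmd 0,1)" or simply pasted into the terminal.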

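# Next, configure Jupyter (run these commands inside the container, for example after entering it with docker exec -it <container-id> bash):

jupyter notebook --generate-config
# generates the file /home/USERNAME/.jupyter/jupyter_notebook_config.py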

As of notebook 5.3, the first time you log-in using a token, the notebook server should give you the opportunity to setup a password from the user interface.
You will be presented with a form asking for the current _token_, as well as your _new_ _password_ ; enter both and click on Login and setup new password.

# Change the Jupyter password
jupyter notebook password
Enter password: ****
Verify password: ****
[NotebookPasswordApp] Wrote hashed password to /Users/you/.jupyter/jupyter_notebook_config.json

# Look up the container ID:

docker ps |grep tensorflow
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
5bb3b5deb320 tensorflow/tensorflow:2.0.0-gpu-py3-jupyter "bash -c 'source /et…" About an hour ago Up 6 seconds 0.0.0.0:8888->8888/tcp

# Restart the container:
docker restart 5bb3b5deb320
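
Copying the container ID by hand can be avoided with a small filter over the docker ps listing. The sketch below is only an illustration: container_ids_for_image is a hypothetical helper name, and the sample rows are made-up input in the same column layout as the listing above.

```shell
# Hypothetical helper: read `docker ps` output on stdin and print the ID
# (first column) of every row whose IMAGE column contains the given text.
# NR > 1 skips the header row.
container_ids_for_image() {
  awk -v pat="$1" 'NR > 1 && index($2, pat) > 0 { print $1 }'
}

# Demo on sample text (a header plus two rows) instead of a live Docker daemon.
container_ids_for_image tensorflow <<'EOF'
CONTAINER ID   IMAGE                                         COMMAND
5bb3b5deb320   tensorflow/tensorflow:2.0.0-gpu-py3-jupyter   "bash"
0123456789ab   pytorch/pytorch:1.3-cuda10.1-cudnn7-runtime   "bash"
EOF
```

Against a real daemon this would be used as docker restart "$(docker ps | container_ids_for_image tensorflow)". Docker's built-in docker ps -q --filter ancestor=tensorflow/tensorflow:2.0.0-gpu-py3-jupyter gives a similar result without any helper.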

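# Jupyter is now reachable; enter the following address in a browser
# (Jupyter serves plain HTTP unless an SSL certificate has been configured; use https:// only in that case):
http://192.168.1.205:8888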

#####################################################################
# Using the PyTorch 1.3.1 image
#####################################################################
# At the time of writing, latest is the same as 1.3-cuda10.1-cudnn7-runtime. Pull the runtime or the devel variant as needed; I use the runtime one:
docker pull pytorch/pytorch:latest
# runtime version
docker pull pytorch/pytorch:1.3-cuda10.1-cudnn7-runtime
# devel version
docker pull pytorch/pytorch:1.3-cuda10.1-cudnn7-devel

nvidia-docker run -dit --name pytorch1.3 --env NVIDIA_VISIBLE_DEVICES=2 -p 8888:8888 pytorch/pytorch:1.3-cuda10.1-cudnn7-runtime

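Incidentally, there is a Docker image called Deepo that bundles several deep learning frameworks. The frameworks installed in it are not necessarily the latest versions; download and use it as needed: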

### Deepo: a deep learning environment bundling multiple frameworks ################################
docker pull ufoym/deepo

# Download from mirror sites in China
docker pull registry.docker-cn.com/ufoym/deepo
docker pull hub-mirror.c.163.com/ufoym/deepo
docker pull docker.mirrors.ustc.edu.cn/ufoym/deepo
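
Because any one of these sources may be slow or unreachable, the pulls can be tried in turn until one succeeds. The following is a minimal sketch under stated assumptions: pull_with_fallback and the PULL override variable are hypothetical names introduced here for illustration, and an empty prefix stands for the default registry (Docker Hub).

```shell
# Hypothetical helper: try each registry prefix in order and stop at the
# first successful pull. PULL defaults to "docker pull"; it can be overridden
# so the loop can be exercised without Docker (an assumption of this sketch).
pull_with_fallback() {
  image="$1"
  shift
  for prefix in "$@"; do
    # PULL is intentionally unquoted so that "docker pull" splits into two words.
    if ${PULL:-docker pull} "${prefix}${image}"; then
      echo "pulled ${prefix}${image}"
      return 0
    fi
  done
  echo "all sources failed for ${image}" >&2
  return 1
}

# Example call (commented out so that nothing is downloaded when this block is sourced):
# pull_with_fallback ufoym/deepo "" registry.docker-cn.com/ hub-mirror.c.163.com/ docker.mirrors.ustc.edu.cn/
```

Note that an image pulled through a mirror is stored under the mirror-prefixed name, so later docker run commands must use that name or the image must be retagged with docker tag.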

# Test

docker run --gpus all --rm ufoym/deepo nvidia-smi

# Run a container

docker run --gpus all -it ufoym/deepo bash
docker run --gpus all -it -v /<home>/data:/data -v /<home>/config:/config ufoym/deepo bash

# Pull the image for a single framework only:
docker pull ufoym/deepo:tensorflow
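# Use Jupyter Notebook
# (the empty --NotebookApp.token= disables token authentication, so expose the port only on a trusted network)
docker run --gpus all -it -p 8888:8888 --ipc=host ufoym/deepo jupyter notebook --no-browser --ip=0.0.0.0 --allow-root --NotebookApp.token=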

# Fix for the shared-memory shortage that can occur while training PyTorch models:
####Please note that some frameworks (e.g. PyTorch) use shared memory to share data between processes,
####so if multiprocessing is used the default shared memory segment size that container runs with is not enough,
####and you should increase shared memory size either with --ipc=host or --shm-size command line options to docker run:
docker run --gpus all -it --ipc=host ufoym/deepo bash
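
The quoted note also names --shm-size as an alternative to --ipc=host. A minimal sketch of that variant follows; the helper name deepo_shm_cmd and the 8g default are assumptions chosen for illustration, not values from the original notes, and the function only prints the command.

```shell
# Hypothetical helper: print a docker run command that enlarges the
# container's shared memory segment instead of sharing the host IPC namespace.
deepo_shm_cmd() {
  shm_size="${1:-8g}"   # assumed example size; adjust to the workload
  echo "docker run --gpus all -it --shm-size=${shm_size} ufoym/deepo bash"
}

# Show the command with the example default size.
deepo_shm_cmd
```

Inside the resulting container, df -h /dev/shm shows the size that actually took effect. Compared with --ipc=host, this keeps the container's IPC namespace isolated from the host.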

Arnold-FY-Chen
