Several Linux tools for building scientific-computing environments

There are many tools related to scientific computing. Outside of the module-based environment management used on most supercomputers, there is plenty of interesting software, and not everyone has constant access to a supercomputer anyway; supercomputers are not the only hardware option for scientific computing. Before writing this article I spent about a year managing our group's server environment, cycling through four or five different approaches along the way, so I have picked up a little experience with building environments and want to summarize and share it here.

When I first set up the environment, I took the straightforward route: install pre-built packages directly where they existed, and compile everything else from source. I soon found that as the server's library environment grew more complex, this approach turned the environment variables into an unbearable mess, and every compile meant hunting down the install location of each dependent library, which was very time-consuming.
Later I found four relatively easy-to-use and relatively mature environment-management tools (I will not cover module here, since I mainly use our group's Intel Xeon Gold 6154 workstation, not a cluster).

conda

The first is Conda, introduced first because it was my enlightenment in package managers. It is simple and very easy to use, covers essentially all of the common Python libraries needed day to day, and comes with the bonus of Intel Python builds. I recommend creating a new env for each project: not only does it keep development environments independent of each other, it also makes it easy to share dependencies when you later hand your code to someone else (conda list -e > requirements.txt). For larger projects developed in Python, highly recommended.
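As a minimal sketch of that per-project workflow (the env name and packages here are just placeholders):

    # create an isolated environment for one project
    conda create -n myproject python=3.7 numpy scipy

    # work inside it
    conda activate myproject

    # export the dependency list so others can reproduce the environment
    conda list -e > requirements.txt

    # a collaborator rebuilds the same environment from that file
    conda create -n myproject-copy --file requirements.txt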

spack

Next is Spack, a cross-platform package manager developed mainly to make installing the libraries needed for scientific computing easier. It supports switching between different compilers and compiler versions, which is very convenient. Spack's usage logic resembles conda's, which is the second reason I cover it second. Installing spack itself is easy: the host only needs a Python environment, and then you just follow the instructions in the spack docs. spack install installs a library, and a spec syntax pins down attributes of the library: for example, spack install hdf5 %gcc@7 tells spack to build the hdf5 library with the gcc 7 compiler. Without pre-built mirrors, the whole installation process can be understood as downloading the required dependencies and then compiling and installing everything from source; this is just a simple example, and the spec syntax is far more flexible in actual use. Installed libraries live under the opt directory inside the spack root, similar in spirit to /opt on an everyday Linux system, and each install is tagged with a hash, making packages easy to find and tell apart. When using them, you can tab-complete the library name to see the available choices, or load a library by its hash with a module-load-like command (here, spack load). So although spack has environments similar to conda env create, spack load covers most everyday needs, and I rarely use them.

I do have to pour some cold water, though: useful as spack is, installation times in practice are impressive. Like any from-source install, build time dominates, so I strongly recommend using pre-built mirrors. Sometimes there is no mirror for the library you need, and then you can only wait out a slow build. Also, spack demands a fairly stable network connection, and the experience from inside the Great Firewall is poor, so it is best paired with the appropriate tools (a proxy).
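A minimal sketch of the spack workflow described above (the compiler version is illustrative):

    # fetch spack; it only needs python on the host
    git clone https://github.com/spack/spack.git
    . spack/share/spack/setup-env.sh

    # build hdf5 with a specific compiler (spec syntax: package %compiler@version)
    spack install hdf5 %gcc@7.3.0

    # list installed packages together with their hashes
    spack find -l hdf5

    # load a package into the current shell, module-style
    spack load hdf5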

docker/singularity

The next one is Docker / Singularity. Most people will have heard of the former: docker provides process-level isolated environments and has been implemented with native Linux containers from early on, so on a Linux machine the GPU can be used from inside a container. On Windows and Mac, however, a virtual-machine layer sits between the container and the driver, so GPUs and other accelerator devices cannot be used directly. Don't ask why the Mac needs a VM too; I can only say that not everything that meets the Unix standard is a good OS (joke).

docker is generally the most convenient way to configure an environment: as long as the image you need can be found on Docker Hub, you pull it down, start a container, and the environment is ready to test. For GPU-based scientific computing in particular, an application only requires that the host carry a new-enough driver, which is backward compatible with older CUDA versions, so managing different CUDA environments with different docker images is the most economical approach. Before I understood this, I installed multiple CUDA versions directly on the host and promptly killed the host driver. A lesson written in blood! But docker has its weak points too. First, it is somewhat troublesome to use: beginners easily confuse image and container at the start and run into errors. Second, bind-mounting folders into a container brings file-permission problems; tolerable for casual use, but quite cumbersome over the long term.

That is why I recommend singularity, which can be seen as docker for supercomputing environments. It has better support for MPI/OpenMP and other parallel libraries, and inside the container you run as $USER, which avoids a lot of file-permission trouble, so it feels like the better choice for scientific-computing scenarios. Also, many singularity operations are file-oriented: often we just download a .sif file directly from an image library (you can also singularity pull, but you know how the speed is) and then singularity shell *.sif to enter the environment. With the --writable option the container can be modified, and after modifying it you can rebuild a new sif file to distribute the environment, though that last step is admittedly a bit inconvenient. Finally, like spack, singularity does not require administrator privileges: you can compile, install, and run it yourself, which gives supercomputer users much more flexible control over their environments. (That is impossible with docker, because it needs root to start a daemon!)
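A rough sketch of both workflows. The image names and CUDA version are illustrative; docker run --gpus assumes Docker 19.03+ with the NVIDIA container toolkit installed, and --nv assumes an NVIDIA driver on the host:

    # docker: pull a CUDA base image from Docker Hub and sanity-check the GPU
    docker pull nvidia/cuda:10.0-base
    docker run --rm --gpus all nvidia/cuda:10.0-base nvidia-smi

    # bind-mount a project folder (this is where ownership issues can appear)
    docker run --rm -it -v $HOME/project:/workspace nvidia/cuda:10.0-base bash

    # singularity: build a .sif from the same Docker Hub image
    singularity pull docker://nvidia/cuda:10.0-base

    # enter the environment as $USER; --nv exposes the host GPU driver
    singularity shell --nv cuda_10.0-base.sif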
For scientific-computing applications going forward, my inclination is to use container technology for distributing environments. After all, many scientific models and programs have complex library dependencies, and plenty of researchers sacrifice their youth and their hairlines to the dull work of building environments. With containers as a weapon, a release only needs to ship the image matching each version of the computing code, and even non-heterogeneous projects can be distributed across multiple platforms, greatly improving the efficiency of the research workflow. Consider this a flag I am planting for myself!


Origin: www.cnblogs.com/gabriel-sun/p/12128373.html