LSF 10.1 Community Edition Installation Guide

(This article draws on several sources; thanks to the contributors.)

0 Synopsis

LSF Community Edition has the following limits:

Up to 10 computing nodes per cluster

Up to two CPU sockets per node

Up to 60 cores per node

Up to 2500 running or pending jobs per cluster

1 Environment Details

1.1 Master and Computing Node Details

Node Name        Role              Remarks
lsf-master-01    master
lsf-master-02    master replica
host-001         computing
host-002         computing

1.2 Directories on NFS

Directory                Usage
/home/lsf/media          LSF media
/home/lsf/dist           LSF installation
/home/lsf/install_dir    Temporary directory for the LSF installer


 

2 Preparation

2.1 FreeIPA

FreeIPA provides centralized user authentication and DNS resolution for the hosts. Similar products, such as NIS, can be used instead.

2.1.1 Create the lsfadmin account.

2.1.2 Join all hosts to FreeIPA so that every host name can be resolved across the cluster.
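
Before installing, it can help to confirm that both steps took effect on every host. The following is a minimal sketch, not part of the original procedure: the function names `unresolved_hosts` and `account_exists` are hypothetical, and it assumes `getent` and `id` are available, as they are on most Linux distributions.

```shell
#!/bin/sh
# unresolved_hosts HOST...
# Print every host name from the arguments that cannot be resolved.
unresolved_hosts() {
    for host in "$@"; do
        getent hosts "$host" >/dev/null 2>&1 || echo "$host"
    done
}

# account_exists USER
# Succeed only if the account is known to the system
# (local or provided by a directory service such as FreeIPA).
account_exists() {
    id "$1" >/dev/null 2>&1
}
```

For example, `unresolved_hosts lsf-master-01 lsf-master-02 host-001 host-002` prints nothing when every name resolves, and `account_exists lsfadmin || echo "lsfadmin is missing"` reports a missing account.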

2.2 Download tarball

Log in to the IBM site at https://www-01.ibm.com/marketing/iwm/iwm/web/preLogin.do?source=swerpzsw-lsf-3 and download the tarball named lsfsce10.2.0.6-x86_64.tar.gz.

2.3 SSH key authentication

Setting up key-based, password-less SSH authentication between the hosts is recommended, because section 4.2 configures LSF to reach the nodes over ssh.
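
One way to set this up is sketched below. It is an illustration rather than the author's exact procedure: `distribute_ssh_key`, `run`, and the `DRY_RUN` variable are hypothetical names, and it assumes OpenSSH's `ssh-keygen` and `ssh-copy-id` are installed. With `DRY_RUN=1` the commands are only printed, not executed.

```shell
#!/bin/sh
# Run a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# distribute_ssh_key KEYFILE HOST...
# Create KEYFILE if it does not exist yet, then install its public half
# on every HOST so that later logins from this machine need no password.
distribute_ssh_key() {
    key=$1
    shift
    if [ ! -f "$key" ]; then
        run ssh-keygen -t rsa -b 4096 -N "" -f "$key"
    fi
    for host in "$@"; do
        run ssh-copy-id -i "$key.pub" "$host"
    done
}
```

On lsf-master-01, a call such as `distribute_ssh_key "$HOME/.ssh/id_rsa" lsf-master-02 host-001 host-002` would cover the example cluster; `ssh-copy-id` asks for each host's password once.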
 

3 Installing LSF Community Edition

3.1 Extract the tarball

# cd /home/lsf/
# mkdir media && cd media
# cp /path/to/lsfsce10.2.0.6-x86_64.tar.gz ./
# tar -zxf lsfsce10.2.0.6-x86_64.tar.gz

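3.1.1 Create the temporary installation directory

The distribution tarball is linked into this directory, which install.config later references as LSF_TARDIR.

# mkdir /home/lsf/install_dir
# cd /home/lsf/install_dir
# ln /home/lsf/media/lsfsce10.2.0.6-x86_64/lsf/lsf10.1_linux2.6-glibc2.3-x86_64.tar.Z .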

3.2 Prepare the install.config file

# tar Zxf /home/lsf/media/lsfsce10.2.0.6-x86_64/lsf/lsf10.1_lsfinstall_linux_x86_64.tar.Z -C /home/lsf/install_dir
# cd /home/lsf/install_dir/lsf10.1_lsfinstall
# cp install.config install.config_bak
# cat >> install.config << EOF
LSF_TOP="/home/lsf/dist"
LSF_ADMINS="lsfadmin"
LSF_CLUSTER_NAME="my-lsf-cluster"
LSF_MASTER_LIST="lsf-master-01 lsf-master-02"
LSF_TARDIR="/home/lsf/install_dir"
LSF_ADD_SERVERS="lsf-master-01 lsf-master-02 host-001 host-002"
EOF
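
Because the settings are appended to the end of install.config, a typo is easy to miss. As a rough sanity check, the sketch below reports which of the parameters used in this guide are not set in a given file. The function name `missing_install_keys` is hypothetical, and the list of keys is simply the six parameters written above, not an official list of mandatory LSF parameters.

```shell
#!/bin/sh
# missing_install_keys FILE
# Print each parameter from this guide that has no uncommented
# NAME=... line in FILE.
missing_install_keys() {
    for key in LSF_TOP LSF_ADMINS LSF_CLUSTER_NAME \
               LSF_MASTER_LIST LSF_TARDIR LSF_ADD_SERVERS; do
        grep -q "^[[:space:]]*${key}=" "$1" || echo "$key"
    done
}
```

Running `missing_install_keys install.config` in the lsf10.1_lsfinstall directory should print nothing when all six parameters are present.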

3.3 Install LSF

3.3.1 Create the installation directory

# mkdir -p /home/lsf/dist

3.3.2 Run the installer

# cd /home/lsf/install_dir/lsf10.1_lsfinstall
# ./lsfinstall -f install.config

The installer first displays the license agreement:

Press Enter to continue viewing the license agreement, or
enter “1” to accept the agreement, “2” to decline it, “3”
to print it, “4” to read non-IBM terms, or “99” to go back
to the previous screen.

Enter 1 to accept. The installer then searches for the distribution tar files and lists the host types it found:

Searching LSF 10.1 distribution tar files in /usr/share/lsf_distrib Please wait ...

1. linux2.6-glibc2.3-x86_64

Press 1 or Enter to install this host type:

Enter 1.

4 Starting the cluster

4.1 Initialize the LSF environment

For csh, run on each node:

# cat >> /etc/csh.cshrc << EOF
source /home/lsf/dist/conf/cshrc.lsf
EOF

For bash, run on each node:

# cat >> /etc/profile << EOF
. /home/lsf/dist/conf/profile.lsf
EOF

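4.2 Master-to-node connection settings

Setting LSF_RSH tells LSF commands such as lsfstartup to use ssh instead of rsh when starting daemons on remote hosts.

# cat >> /home/lsf/dist/conf/lsf.conf << EOF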
LSF_RSH=ssh
EOF
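
Note that `cat >>` appends unconditionally, so running the commands in 4.1 or 4.2 a second time duplicates the lines. A possible guard is sketched below; `append_once` is a hypothetical helper, not an LSF tool.

```shell
#!/bin/sh
# append_once FILE LINE
# Append LINE to FILE unless an identical line is already present.
# A missing FILE is treated as empty and gets created.
append_once() {
    grep -qxF -- "$2" "$1" 2>/dev/null || printf '%s\n' "$2" >> "$1"
}
```

For instance, `append_once /home/lsf/dist/conf/lsf.conf 'LSF_RSH=ssh'` can be repeated safely, and the same helper works for the profile line in 4.1.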

4.3 Start the cluster

On the master node:

# lsfstartup

4.4 Enable automatic startup at boot

Run on each compute node:

# /home/lsf/dist/10.1/install/hostsetup --boot="y" --top="/home/lsf/dist"

4.5 Check the cluster status

[root@lsf-master-01 ~]# lsid
IBM Spectrum LSF Community Edition 10.1.0.6, May 25 2018
Copyright IBM Corp. 1992, 2016. All rights reserved.
US Government Users Restricted Rights - Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp.

My cluster name is my-lsf-cluster
My master name is lsf-master-01.icinfra.cn
[root@lsf-master-01 ~]# lshosts -w
HOST_NAME                       type       model  cpuf ncpus maxmem maxswp server RESOURCES
lsf-master-01.icinfra.cn      X86_64    Intel_E5  12.5     3   7.7G   7.8G    Yes (mg)
lsf-master-02.icinfra.cn      X86_64    Intel_E5  12.5     3   7.7G   7.8G    Yes (mg)
host-001.icinfra.cn           X86_64    Intel_E5  12.5     3   7.7G   7.8G    Yes (linux)
host-002.icinfra.cn           X86_64    Intel_E5  12.5     3   7.7G   7.8G    Yes (linux)
[root@lsf-master-01 ~]# lsload -w
HOST_NAME               status  r15s   r1m  r15m   ut    pg  ls    it   tmp   swp   mem
lsf-master-01.icinfra.cn     ok   0.0   3.1   2.2  18%   0.0   1     0   44G  7.8G  7.1G
lsf-master-02.icinfra.cn     ok   0.1   1.1   0.8   8%   0.0   1    27   45G  7.8G  7.2G
host-002.icinfra.cn         ok   0.3   1.3   1.0   4%   0.0   1     0   45G  7.8G  7.2G
host-001.icinfra.cn         ok   0.3   1.8   1.5   6%   0.0   1     2   45G  7.8G  7.2G
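
On a larger cluster it is easier to let a script scan this output. The sketch below prints every host whose status column in `lsload -w` output is not `ok`. The function name `hosts_not_ok` is hypothetical, and it assumes the column layout shown above: one header line, host name in the first column, status in the second.

```shell
#!/bin/sh
# hosts_not_ok
# Read `lsload -w` output on stdin and print "host status" for every
# host whose status is not "ok". The header line and blank lines are skipped.
hosts_not_ok() {
    awk 'NR > 1 && NF >= 2 && $2 != "ok" { print $1, $2 }'
}
```

It is meant to be used as `lsload -w | hosts_not_ok`; empty output means every listed host reports `ok`.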


5 Job limit of 2500

The cluster supports up to 2500 jobs. The following csh loop submits 2501 jobs to test the limit:

[wanlinwang@computing-host-001 ~]$ foreach i (`seq 2501`)
foreach? echo "Submitting #$i job"
foreach? bsub sleep 3600
foreach? end
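
The transcript above uses csh syntax. For readers using bash or another POSIX shell, an equivalent loop might look like the sketch below. `submit_jobs` is a hypothetical wrapper name; `bsub sleep 3600` is the same command as in the transcript.

```shell
#!/bin/sh
# submit_jobs COUNT
# Submit COUNT jobs that each sleep for an hour, as in the csh loop.
submit_jobs() {
    i=1
    while [ "$i" -le "$1" ]; do
        echo "Submitting #$i job"
        bsub sleep 3600
        i=$((i + 1))
    done
}
```

Calling `submit_jobs 2501` submits one job more than the stated limit of 2500.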

Reposted from blog.csdn.net/thesre/article/details/124670733