Kubernetes 1.15.1 High-Availability Deployment -- From Scratch

This is a book!!!

A book about everything I have learned in the container ecosystem!!!

Highlights:

1. CentOS 7.6 installation and optimization

2. Kubernetes 1.15.1 high-availability deployment

3. The calico network plugin

4. The dashboard add-on

5. The metrics-server add-on

6. The kube-state-metrics add-on

Original note: http://note.youdao.com/noteshare?id=c9f647765493d11099a939d7e5e102c9&sub=A837AA253CA54660AABADEF435A40714

Chapter 1: From Scratch

1.1 Foreword

For a long time I had wanted to write something to record my journey in IT. On July 21, 2019, I finally started.

I treat what I have learned, done, and heard along the IT road as the sights, sounds, and feelings of a journey, and I am recording them one by one.

IT is a road of no return: above every expert there is a greater expert, and I simply want to spar with ever-stronger predecessors.

I have set my IT direction to container development: "container" is short for the container ecosystem, and "development" is short for Go development.

In my view, the trend in operations is containerized operations and the trend in development is containerized development, which is why I chose the container-development path.

This is a relatively relaxed year for me, so I can settle down and work on two big things: 1. my journey through the container ecosystem; 2. my journey from Go beginner to Go expert.

My goal is to reach an entry level of container development within six months; since I already have some foundation, this should be achievable.

My hope is to have both largely finished by May 2020.

I can do it because I'm young!

The pen is down and the heart is set. Wait and see.

1.2 What This Covers

  • Container engine: docker
  • Container orchestration: kubernetes
  • Container storage: ceph
  • Container monitoring: prometheus
  • Log analysis: elk
  • Service mesh: istio

1.3 Resources

Download link for the required software: https://pan.baidu.com/s/1IvUG_hdqDvReDJS9O1k9OA     extraction code: 7wfh

Sources: official documentation, blog posts, and other material

1.3.1 Physical host

Hardware

The host has 24.0 GB of RAM, so it can run the many virtual machines needed to simulate a realistic production environment. (The hardware screenshot from the original note is omitted here.)

1.3.2 Virtualization tool

VMware Workstation Pro 14

VMware Workstation is a powerful desktop virtualization product that lets a user run different operating systems side by side on a single desktop, providing a convenient environment for developing, testing, and deploying new applications. It can simulate a complete network environment on one physical machine and produces portable virtual machines; its flexibility and maturity put it ahead of comparable desktop virtualization software. For enterprise IT developers and system administrators, features such as virtual networking, live snapshots, drag-and-drop shared folders, and PXE support make it an indispensable tool.

VMware Workstation runs an operating system and its applications inside a virtual machine: a discrete environment that runs on top of the host operating system. In VMware Workstation you load a virtual machine in a window, where it runs its own OS and applications; you can switch between several virtual machines on the desktop, share them over a network (for example a corporate LAN), and suspend, resume, or shut them down, all without affecting the host or any other running application.

1.3.3 Remote access tool

Xshell is a powerful and secure terminal emulator that supports SSH1, SSH2, and the TELNET protocol on Microsoft Windows. It provides secure connections to remote hosts over the internet, and its design and feature set help users work comfortably even in complex network environments.

Xshell is used from Windows to access servers running other operating systems, making it a convenient remote terminal; it also ships with a rich set of color schemes and appearance options.

1.4 Virtual machines

1.4.1 CentOS 7.6 installation

1.4.2 Template VM optimization

Check the OS release and kernel version

[root@mobanji ~]# cat /etc/redhat-release
CentOS Linux release 7.6.1810 (Core)
[root@mobanji ~]# uname -r
3.10.0-957.el7.x86_64

Set an alias

#shortcut for opening the NIC configuration file
[root@mobanji ~]# yum install -y vim
[root@mobanji ~]# alias vimn="vim /etc/sysconfig/network-scripts/ifcfg-eth0"
[root@mobanji ~]# vim ~/.bashrc
alias vimn="vim /etc/sysconfig/network-scripts/ifcfg-eth0"

Network settings

[root@mobanji ~]# vimn
TYPE=Ethernet
BOOTPROTO=none
NAME=eth0
DEVICE=eth0
ONBOOT=yes
IPADDR=20.0.0.5
PREFIX=24
GATEWAY=20.0.0.2
DNS1=223.5.5.5
DNS2=8.8.8.8
DNS3=119.29.29.29
DNS4=114.114.114.114

Update the yum repositories and install required packages

[root@mobanji ~]# yum install -y wget
[root@mobanji ~]# cp -r /etc/yum.repos.d /etc/yum.repos.d.bak
[root@mobanji ~]# rm -f /etc/yum.repos.d/*.repo
[root@mobanji ~]# wget -O /etc/yum.repos.d/CentOS-Base.repo http://mirrors.aliyun.com/repo/Centos-7.repo \
 && wget -O /etc/yum.repos.d/epel.repo http://mirrors.aliyun.com/repo/epel-7.repo
[root@mobanji ~]# yum clean all && yum makecache
[root@mobanji ~]# yum install -y bash-completion lrzsz nmap nc tree htop iftop net-tools ntpdate lsof screen tcpdump conntrack ntp ipvsadm ipset jq sysstat libseccomp nmon iptraf mlocate strace nethogs bridge-utils bind-utils nfs-utils rpcbind dnsmasq python python-devel telnet git sshpass

Configure time synchronization

#set up time sync
[root@mobanji ~]# ntpdate -u pool.ntp.org
[root@mobanji ~]# crontab -e
#periodic time sync
*/15 * * * * /usr/sbin/ntpdate -u pool.ntp.org >/dev/null 2>&1

#set the system time zone
[root@mobanji ~]# timedatectl set-timezone Asia/Shanghai

#write the current UTC time to the hardware clock
[root@mobanji ~]# timedatectl set-local-rtc 0

#restart the services that depend on the system time
[root@mobanji ~]# systemctl restart rsyslog
[root@mobanji ~]# systemctl restart crond

SSH tuning

[root@mobanji ~]# sed  -i  '79s@GSSAPIAuthentication yes@GSSAPIAuthentication no@;115s@#UseDNS yes@UseDNS no@' /etc/ssh/sshd_config
[root@mobanji ~]# systemctl restart sshd

Disable the firewall and SELinux

#stop the firewall, flush its rules, and set the default forward policy
[root@mobanji ~]# systemctl stop firewalld
[root@mobanji ~]# systemctl disable firewalld
Removed symlink /etc/systemd/system/multi-user.target.wants/firewalld.service.
Removed symlink /etc/systemd/system/dbus-org.fedoraproject.FirewallD1.service.
[root@mobanji ~]# iptables -F && iptables -X && iptables -F -t nat && iptables -X -t nat
[root@mobanji ~]# iptables -P FORWARD ACCEPT
[root@mobanji ~]#  firewall-cmd --state
not running

#disable SELinux; otherwise mounting directories for Kubernetes later may fail with "Permission denied"
[root@mobanji ~]# setenforce 0
[root@mobanji ~]# sed -i 's/^SELINUX=.*/SELINUX=disabled/' /etc/selinux/config

Disable unneeded services

[root@mobanji ~]# systemctl list-unit-files |grep "enabled"
[root@mobanji ~]#  systemctl status postfix &&  systemctl stop postfix && systemctl disable postfix

Configure limits.conf

[root@mobanji ~]# cat >> /etc/security/limits.conf <<EOF
# End of file
* soft nofile 65525
* hard nofile 65525
* soft nproc 65525
* hard nproc 65525
EOF
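
After opening a new login session the new limits can be confirmed with ulimit; with the values above both commands should report 65525 (a quick sanity check):

[root@mobanji ~]# ulimit -n    #max open files
[root@mobanji ~]# ulimit -u    #max user processes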

Upgrade the kernel

The stock 3.10.x kernel shipped with CentOS 7.x has bugs that make Docker and Kubernetes unstable, for example:

-> Recent Docker releases (1.13 and later) enable the kernel memory accounting feature that the 3.10 kernel only supports experimentally (and it cannot be turned off); when a node is under pressure, e.g. containers are started and stopped frequently, this causes cgroup memory leaks;

-> A network-device reference-count leak produces errors such as: "kernel:unregister_netdevice: waiting for eth0 to become free. Usage count = 1";

     

Possible solutions:

-> Upgrade the kernel to 4.4.x or newer;

-> Or build the kernel by hand with the CONFIG_MEMCG_KMEM feature disabled;

-> Or install Docker 18.09.1 or later, where the issue is fixed. But kubelet also sets kmem (it vendors runc), so kubelet would have to be rebuilt with GOFLAGS="-tags=nokmem";

[root@mobanji ~]# uname -r
3.10.0-957.el7.x86_64
[root@mobanji ~]# yum update -y
[root@mobanji ~]# rpm --import https://www.elrepo.org/RPM-GPG-KEY-elrepo.org
[root@mobanji ~]# rpm -Uvh http://www.elrepo.org/elrepo-release-7.0-3.el7.elrepo.noarch.rpm
[root@mobanji ~]# yum --disablerepo="*" --enablerepo="elrepo-kernel" list available
kernel-lt.x86_64                               4.4.185-1.el7.elrepo              elrepo-kernel     <--- long-term support release
......
kernel-ml.x86_64                               5.2.1-1.el7.elrepo                elrepo-kernel     <--- latest mainline release
......
#install the new kernel and its devel package
[root@mobanji ~]# yum --enablerepo=elrepo-kernel install kernel-lt-devel kernel-lt -y

To make the newly installed kernel the default boot entry, edit /etc/default/grub and set GRUB_DEFAULT=0; this makes the first kernel on the GRUB menu the default.

#check the current boot entry order
[root@mobanji ~]# awk -F\' '$1=="menuentry " {print $2}' /etc/grub2.cfg
CentOS Linux (4.4.185-1.el7.elrepo.x86_64) 7 (Core)
CentOS Linux (3.10.0-957.21.3.el7.x86_64) 7 (Core)
CentOS Linux (3.10.0-957.el7.x86_64) 7 (Core)
CentOS Linux (0-rescue-b4c601a613824f9f827cb9787b605efb) 7 (Core)

The output above shows that the new kernel (4.4.185) is at position 0 and the old kernel (3.10.0) at position 1, so to boot the new kernel we set the default entry to 0.

#edit /etc/default/grub
[root@mobanji ~]# vim /etc/default/grub
GRUB_DEFAULT=0   <--- change "saved" to 0
#regenerate the kernel boot configuration
[root@mobanji ~]# grub2-mkconfig -o /boot/grub2/grub.cfg
#reboot the system
[root@mobanji ~]# reboot
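
After the reboot, a quick check confirms the machine is running the elrepo 4.4 kernel installed above:

[root@mobanji ~]# uname -r
4.4.185-1.el7.elrepo.x86_64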

Disable NUMA

[root@mobanji ~]# cp /etc/default/grub{,.bak}
[root@mobanji ~]# vim /etc/default/grub   
.........
GRUB_CMDLINE_LINUX="...... numa=off"      # i.e. append "numa=off"
     
Regenerate the grub2 configuration:
# cp /boot/grub2/grub.cfg{,.bak}
# grub2-mkconfig -o /boot/grub2/grub.cfg

Configure rsyslogd and systemd journald

systemd's journald is the default logging tool on CentOS 7 and records all system, kernel, and service-unit logs. Compared with rsyslog, journald has the following advantages:

-> it can log to memory or to the file system (by default it logs to memory, under /run/log/journal);

-> it can limit the disk space it uses and guarantee a minimum of free disk space;

-> it can limit the size and retention time of log files;

-> journald forwards logs to rsyslog by default, so the same logs are written twice, /var/log/messages fills up with unrelated entries that make later inspection harder, and system performance suffers; the forwarding is therefore disabled below.

[root@mobanji ~]# mkdir /var/log/journal     <---#directory for persistent logs
[root@mobanji ~]# mkdir /etc/systemd/journald.conf.d
[root@mobanji ~]# cat > /etc/systemd/journald.conf.d/99-prophet.conf <<EOF
> [Journal]
> # persist logs to disk
> Storage=persistent
>      
> # compress historical logs
> Compress=yes
>      
> SyncIntervalSec=5m
> RateLimitInterval=30s
> RateLimitBurst=1000
>      
> # use at most 10G of disk space
> SystemMaxUse=10G
>      
> # limit a single log file to 200M
> SystemMaxFileSize=200M
>      
> # keep logs for 2 weeks
> MaxRetentionSec=2week
>      
> # do not forward logs to syslog
> ForwardToSyslog=no
> EOF
[root@mobanji ~]# systemctl restart systemd-journald
[root@mobanji ~]# systemctl status systemd-journald

Load kernel modules

[root@mobanji ~]# cat > /etc/sysconfig/modules/ipvs.modules <<EOF
> #!/bin/bash
> modprobe -- ip_vs
> modprobe -- ip_vs_rr
> modprobe -- ip_vs_wrr
> modprobe -- ip_vs_sh
> modprobe -- nf_conntrack_ipv4
> modprobe -- br_netfilter
> EOF
[root@mobanji ~]# chmod 755 /etc/sysconfig/modules/ipvs.modules && bash /etc/sysconfig/modules/ipvs.modules
[root@mobanji ~]# lsmod  | grep br_netfilter
br_netfilter           22256  0
bridge                151336  1 br_netfilter
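
To double-check that all of the modules listed in the script were loaded, not just br_netfilter, the lsmod output can be filtered a bit more broadly:

[root@mobanji ~]# lsmod | grep -e ip_vs -e nf_conntrack_ipv4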

Tune kernel parameters

[root@mobanji ~]# cat << EOF | tee /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-iptables=1
net.bridge.bridge-nf-call-ip6tables=1
net.ipv4.ip_forward=1
net.ipv4.tcp_tw_recycle=0  #tcp_tw_recycle conflicts with the NAT Kubernetes relies on and must stay off, otherwise services become unreachable
vm.swappiness=0            #do not use swap except when the system is out of memory
vm.overcommit_memory=1     #do not check whether enough physical memory is available
vm.panic_on_oom=0          #do not panic on OOM; let the OOM killer handle it
fs.inotify.max_user_instances=8192
fs.inotify.max_user_watches=1048576
fs.file-max=52706963
fs.nr_open=52706963
net.ipv6.conf.all.disable_ipv6=1  #disable the unused IPv6 stack to avoid triggering a docker bug
net.netfilter.nf_conntrack_max=2310720
EOF
[root@mobanji ~]# sysctl -p /etc/sysctl.d/k8s.conf

Note:

tcp_tw_recycle must be disabled, otherwise it conflicts with NAT and services become unreachable;

IPv6 is disabled to avoid triggering a docker bug;

Personal vim configuration

https://blog.csdn.net/zisefeizhu/article/details/89407487

[root@mobanji ~]# cat ~/.vimrc
set nocompatible
filetype on
set paste
set rtp+=~/.vim/bundle/Vundle.vim
call vundle#begin()
 
 
" 这里根据自己需要的插件来设置,以下是我的配置 "
"
" YouCompleteMe:语句补全插件
set runtimepath+=~/.vim/bundle/YouCompleteMe
autocmd InsertLeave * if pumvisible() == 0|pclose|endif "离开插入模式后自动关闭预览窗口"
let g:ycm_collect_identifiers_from_tags_files = 1           " 开启 YCM基于标签引擎
let g:ycm_collect_identifiers_from_comments_and_strings = 1 " 注释与字符串中的内容也用于补全
let g:syntastic_ignore_files=[".*\.py$"]
let g:ycm_seed_identifiers_with_syntax = 1                  " 语法关键字补全
let g:ycm_complete_in_comments = 1
let g:ycm_confirm_extra_conf = 0                            " 关闭加载.ycm_extra_conf.py提示
let g:ycm_key_list_select_completion = ['<c-n>', '<Down>']  " 映射按键,没有这个会拦截掉tab, 导致其他插件的tab不能用.
let g:ycm_key_list_previous_completion = ['<c-p>', '<Up>']
let g:ycm_complete_in_comments = 1                          " 在注释输入中也能补全
let g:ycm_complete_in_strings = 1                           " 在字符串输入中也能补全
let g:ycm_collect_identifiers_from_comments_and_strings = 1 " 注释和字符串中的文字也会被收入补全
let g:ycm_global_ycm_extra_conf='~/.vim/bundle/YouCompleteMe/third_party/ycmd/cpp/ycm/.ycm_extra_conf.py'
let g:ycm_show_diagnostics_ui = 0                           " 禁用语法检查
inoremap <expr> <CR> pumvisible() ? "\<C-y>" : "\<CR>"             " 回车即选中当前项
nnoremap <c-j> :YcmCompleter GoToDefinitionElseDeclaration<CR>     " 跳转到定义处
let g:ycm_min_num_of_chars_for_completion=2                 " 从第2个键入字符就开始罗列匹配项
"
 
 
 
" github 仓库中的插件 "
Plugin 'VundleVim/Vundle.vim'
 
 
Plugin 'vim-airline/vim-airline'
"vim-airline配置:优化vim界面"
"let g:airline#extensions#tabline#enabled = 1
" airline设置
" 显示颜色
set t_Co=256
set laststatus=2
" 使用powerline打过补丁的字体
let g:airline_powerline_fonts = 1
" 开启tabline
let g:airline#extensions#tabline#enabled = 1
" tabline中当前buffer两端的分隔字符
let g:airline#extensions#tabline#left_sep = ' '
" tabline中未激活buffer两端的分隔字符
let g:airline#extensions#tabline#left_alt_sep = ' '
" tabline中buffer显示编号
let g:airline#extensions#tabline#buffer_nr_show = 1
" 映射切换buffer的键位
nnoremap [b :bp<CR>
nnoremap ]b :bn<CR>
" 映射<leader>num到num buffer
map <leader>1 :b 1<CR>
map <leader>2 :b 2<CR>
map <leader>3 :b 3<CR>
map <leader>4 :b 4<CR>
map <leader>5 :b 5<CR>
map <leader>6 :b 6<CR>
map <leader>7 :b 7<CR>
map <leader>8 :b 8<CR>
map <leader>9 :b 9<CR>
 
 
 
" vim-scripts 中的插件 "
Plugin 'taglist.vim'
"ctags 配置:F3快捷键显示程序中的各种tags,包括变量和函数等。
map <F3> :TlistToggle<CR>
let Tlist_Use_Right_Window=1
let Tlist_Show_One_File=1
let Tlist_Exit_OnlyWindow=1
let Tlist_WinWidt=25
 
Plugin 'The-NERD-tree'
"NERDTree 配置:F2快捷键显示当前目录树
map <F2> :NERDTreeToggle<CR>
let NERDTreeWinSize=25
 
Plugin 'indentLine.vim'
Plugin 'delimitMate.vim'
 
" 非 github 仓库的插件"
" Plugin 'git://git.wincent.com/command-t.git'
" 本地仓库的插件 "
 
call vundle#end()
 
"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
"""""新文件标题
""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
"新建.c,.h,.sh,.java文件,自动插入文件头
autocmd BufNewFile *.sh,*.yaml exec ":call SetTitle()"
""定义函数SetTitle,自动插入文件头
func SetTitle()
"如果文件类型为.sh文件
if &filetype == 'sh'
call setline(1, "##########################################################################")
        call setline(2,"#Author:                     zisefeizhu")
        call setline(3,"#QQ:                         2********0")
        call setline(4,"#Date:                       ".strftime("%Y-%m-%d"))
        call setline(5,"#FileName:                   ".expand("%"))
        call setline(6,"#URL:                        https://www.cnblogs.com/zisefeizhu/")
        call setline(7,"#Description:                The test script")                         
        call setline(8,"#Copyright (C):              ".strftime("%Y")." All rights reserved")
call setline(9, "##########################################################################")
call setline(10, "#!/bin/bash")
call setline(11,"PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/root/bin")
call setline(12, "export $PATH")
call setline(13, "")
endif
if &filetype == 'yaml'
call setline(1, "##########################################################################")
        call setline(2,"#Author:                     zisefeizhu")
        call setline(3,"#QQ:                         2********0")
        call setline(4,"#Date:                       ".strftime("%Y-%m-%d"))
        call setline(5,"#FileName:                   ".expand("%"))
        call setline(6,"#URL:                        https://www.cnblogs.com/zisefeizhu/")
        call setline(7,"#Description:                The test script")                                                 
        call setline(8,"#Copyright (C):              ".strftime("%Y")." All rights reserved")
call setline(9, "###########################################################################")
call setline(10, "")
endif
"新建文件后,自动定位到文件末尾
autocmd BufNewFile * normal G
endfunc
""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
"键盘命令
""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
 
nmap <leader>w :w!<cr>
nmap <leader>f :find<cr>
 
" 映射全选+复制 ctrl+a
map <C-A> ggVGY
map! <C-A> <Esc>ggVGY
map <F12> gg=G
" 选中状态下 Ctrl+c 复制
vmap <C-c> "+y
 
""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
""实用设置
"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
" 设置当文件被改动时自动载入
set autoread
" quickfix模式
autocmd FileType c,cpp map <buffer> <leader><space> :w<cr>:make<cr>
"代码补全
set completeopt=preview,menu
"允许插件  
filetype plugin on
"共享剪贴板  
set clipboard=unnamed
"从不备份  
set nobackup
"make 运行
:set makeprg=g++\ -Wall\ \ %
"自动保存
set autowrite
set ruler                   " 打开状态栏标尺
set cursorline              " 突出显示当前行
set magic                   " 设置魔术
set guioptions-=T           " 隐藏工具栏
set guioptions-=m           " 隐藏菜单栏
"set statusline=\ %<%F[%1*%M%*%n%R%H]%=\ %y\ %0(%{&fileformat}\ %{&encoding}\ %c:%l/%L%)\
" 设置在状态行显示的信息
set foldcolumn=0
set foldmethod=indent
set foldlevel=3
set foldenable              " 开始折叠
" 不要使用vi的键盘模式,而是vim自己的
set nocompatible
" 语法高亮
set syntax=on
" 去掉输入错误的提示声音
set noeb
" 在处理未保存或只读文件的时候,弹出确认
set confirm
" 自动缩进
set autoindent
set cindent
" Tab键的宽度
set tabstop=2
" 统一缩进为2
set softtabstop=2
set shiftwidth=2
" 不要用空格代替制表符
set noexpandtab
" 在行和段开始处使用制表符
set smarttab
" 显示行号
" set number
" 历史记录数
set history=1000
"禁止生成临时文件
set nobackup
set noswapfile
"搜索忽略大小写
set ignorecase
"搜索逐字符高亮
set hlsearch
set incsearch
"行内替换
set gdefault
"编码设置
set enc=utf-8
set fencs=utf-8,ucs-bom,shift-jis,gb18030,gbk,gb2312,cp936
"语言设置
set langmenu=zh_CN.UTF-8
set helplang=cn
" 我的状态行显示的内容(包括文件类型和解码)
set statusline=%F%m%r%h%w\ [FORMAT=%{&ff}]\ [TYPE=%Y]\ [POS=%l,%v][%p%%]\ %{strftime(\"%d/%m/%y\ -\ %H:%M\")}
set statusline=[%F]%y%r%m%*%=[Line:%l/%L,Column:%c][%p%%]
" 总是显示状态行
set laststatus=2
" 命令行(在状态行下)的高度,默认为1,这里是2
set cmdheight=2
" 侦测文件类型
filetype on
" 载入文件类型插件
filetype plugin on
" 为特定文件类型载入相关缩进文件
filetype indent on
" 保存全局变量
set viminfo+=!
" 带有如下符号的单词不要被换行分割
set iskeyword+=_,$,@,%,#,-
" 字符间插入的像素行数目
set linespace=0
" 增强模式中的命令行自动完成操作
set wildmenu
" 使回格键(backspace)正常处理indent, eol, start等
set backspace=2
" 允许backspace和光标键跨越行边界
set whichwrap+=<,>,h,l
" 可以在buffer的任何地方使用鼠标(类似office中在工作区双击鼠标定位)
set mouse=a
set selection=exclusive
set selectmode=mouse,key
" 通过使用: commands命令,告诉我们文件的哪一行被改变过
set report=0
" 在被分割的窗口间显示空白,便于阅读
set fillchars=vert:\ ,stl:\ ,stlnc:\
" 高亮显示匹配的括号
set showmatch
" 匹配括号高亮的时间(单位是十分之一秒)
set matchtime=1
" 光标移动到buffer的顶部和底部时保持3行距离
set scrolloff=3
" 为C程序提供自动缩进
set smartindent
" 高亮显示普通txt文件(需要txt.vim脚本)
 au BufRead,BufNewFile *  setfiletype txt
"自动补全
:inoremap ( ()<ESC>i
:inoremap ) <c-r>=ClosePair(')')<CR>
":inoremap { {<CR>}<ESC>O
":inoremap } <c-r>=ClosePair('}')<CR>
:inoremap [ []<ESC>i
:inoremap ] <c-r>=ClosePair(']')<CR>
:inoremap " ""<ESC>i
:inoremap ' ''<ESC>i
function! ClosePair(char)
if getline('.')[col('.') - 1] == a:char
return "\<Right>"
else
return a:char
endif
endfunction
filetype plugin indent on
"打开文件类型检测, 加了这句才可以用智能补全
set completeopt=longest,menu
"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""

Configure sysctl.conf

[root@mobanji ~]# [ ! -e "/etc/sysctl.conf_bk" ] && /bin/mv /etc/sysctl.conf{,_bk} \
> && cat > /etc/sysctl.conf << EOF
> fs.file-max=1000000
> fs.nr_open=20480000
> net.ipv4.tcp_max_tw_buckets = 180000
> net.ipv4.tcp_sack = 1
> net.ipv4.tcp_window_scaling = 1
> net.ipv4.tcp_rmem = 4096 87380 4194304
> net.ipv4.tcp_wmem = 4096 16384 4194304
> net.ipv4.tcp_max_syn_backlog = 16384
> net.core.netdev_max_backlog = 32768
> net.core.somaxconn = 32768
> net.core.wmem_default = 8388608
> net.core.rmem_default = 8388608
> net.core.rmem_max = 16777216
> net.core.wmem_max = 16777216
> net.ipv4.tcp_timestamps = 0
> net.ipv4.tcp_fin_timeout = 20
> net.ipv4.tcp_synack_retries = 2
> net.ipv4.tcp_syn_retries = 2
> net.ipv4.tcp_syncookies = 1
> #net.ipv4.tcp_tw_len = 1
> net.ipv4.tcp_tw_reuse = 1
> net.ipv4.tcp_mem = 94500000 915000000 927000000
> net.ipv4.tcp_max_orphans = 3276800
> net.ipv4.ip_local_port_range = 1024 65000
> #net.nf_conntrack_max = 6553500
> #net.netfilter.nf_conntrack_max = 6553500
> #net.netfilter.nf_conntrack_tcp_timeout_close_wait = 60
> #net.netfilter.nf_conntrack_tcp_timeout_fin_wait = 120
> #net.netfilter.nf_conntrack_tcp_timeout_time_wait = 120
> #net.netfilter.nf_conntrack_tcp_timeout_established = 3600
> EOF
[root@mobanji ~]# sysctl -p

Script directory

#directory for scripts
[root@mobanji ~]# mkdir -p /service/scripts

At this point the template VM optimization is complete.

1.4.3 VM preparation

Node name       IP              Installed software                               Role
jumpserver      20.0.0.200      jumpserver                                       bastion host
k8s-master01    20.0.0.201      kubeadm, kubelet, kubectl, docker, etcd, ceph    master node
k8s-master02    20.0.0.202      kubeadm, kubelet, kubectl, docker, etcd, ceph    master node
k8s-master03    20.0.0.203      kubeadm, kubelet, kubectl, docker, etcd, ceph    master node
k8s-node01      20.0.0.204      kubeadm, kubelet, kubectl, docker                worker node
k8s-node02      20.0.0.205      kubeadm, kubelet, kubectl, docker                worker node
k8s-node03      20.0.0.206      kubeadm, kubelet, kubectl, docker                worker node
k8s-ha01        20.0.0.207      haproxy, keepalived, ceph                        VIP 20.0.0.250
k8s-ha02        20.0.0.208      haproxy, keepalived, ceph                        VIP 20.0.0.250
k8s-ceph        20.0.0.209      ceph                                             storage node


Using k8s-master01 as an example

#change the hostname
[root@mobanji ~]# hostnamectl set-hostname k8s-master01
[root@mobanji ~]# bash
[root@k8s-master01 ~]#
#change the IP
[root@k8s-master01 ~]# vimn
TYPE=Ethernet
BOOTPROTO=none
NAME=eth0
DEVICE=eth0
ONBOOT=yes
IPADDR=20.0.0.201
PREFIX=24
GATEWAY=20.0.0.2
DNS1=223.5.5.5
[root@k8s-master01 ~]# systemctl restart network
[root@k8s-master01 ~]# ping www.baidu.com
PING www.baidu.com (61.135.169.121) 56(84) bytes of data.
64 bytes from 61.135.169.121 (61.135.169.121): icmp_seq=1 ttl=128 time=43.3 ms
^C
--- www.baidu.com ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 43.348/43.348/43.348/0.000 ms
[root@k8s-master01 ~]# hostname -I
20.0.0.201
Note:
init 0   -->  shut the VM down and take a snapshot

At this point the VMs are ready.

1.5 Cluster

1.5.1 Deploy the load-balanced HA layer

Using k8s-ha01 as an example

1.5.1.1 Install the packages
#on k8s-ha01 and k8s-ha02
[root@k8s-ha01 ~]# yum install keepalived haproxy -y

1.5.1.2 Deploy keepalived
#on k8s-ha01 and k8s-ha02
[root@k8s-ha01 ~]# cp /etc/keepalived/keepalived.conf{,.bak}
[root@k8s-ha01 ~]# > /etc/keepalived/keepalived.conf
#k8s-ha01
[root@k8s-ha01 ~]# cat /etc/keepalived/keepalived.conf
! Configuration File for keepalived
global_defs {
   notification_email {
     [email protected]
     [email protected]
   }
   notification_email_from [email protected]
   smtp_server 127.0.0.1
   smtp_connect_timeout 30
   router_id master-node
}
vrrp_script chk_haproxy_port {
    script "/service/scripts/chk_hapro.sh"
    interval 2
    weight -5
    fall 2
    rise 1
}
vrrp_instance VI_1 {
    state MASTER
    interface eth0
    virtual_router_id 51
    priority 101
    advert_int 1
    unicast_src_ip 20.0.0.207
    unicast_peer {
        20.0.0.208
    }
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        20.0.0.250 dev eth0 label eth0:1
    }
    track_script {
        chk_haproxy_port
    }
}
[root@k8s-ha01 ~]# scp /etc/keepalived/keepalived.conf 20.0.0.208:/etc/keepalived/keepalived.conf
#k8s-ha02
[root@k8s-ha02 ~]# vim /etc/keepalived/keepalived.conf
! Configuration File for keepalived
global_defs {
   notification_email {
     [email protected]
     [email protected]
   }
   notification_email_from [email protected]
   smtp_server 127.0.0.1
   smtp_connect_timeout 30
   router_id master-node
}
vrrp_script chk_http_port {
    script "/service/scripts/chk_hapro.sh"
    interval 3
    weight -2
    fall 2
    rise 1
}
vrrp_instance VI_1 {
    state MASTER
    interface eth0
    virtual_router_id 51
    priority 90
    advert_int 1
    unicast_src_ip 20.0.0.208
    unicast_peer {
        20.0.0.207
    }
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        20.0.0.250 dev eth0 label eth0:1
    }
    track_script {
        chk_http_port
    }
}

1.5.1.3 Deploy haproxy
#on k8s-ha01 and k8s-ha02
[root@k8s-ha01 ~]# cp /etc/haproxy/haproxy.cfg{,.bak}
[root@k8s-ha01 ~]# > /etc/haproxy/haproxy.cfg
#k8s-ha01
[root@k8s-ha01 ~]# vim /etc/haproxy/haproxy.cfg
[root@k8s-ha01 ~]# cat /etc/haproxy/haproxy.cfg
#---------------------------------------------------------------------
# Global settings
#---------------------------------------------------------------------
global
    maxconn 100000
    #chroot /var/haproxy/lib/haproxy
    #stats socket /var/lib/haproxy/haproxy.sock mode 600 level admin
    uid 99
    gid 99
    daemon
    nbproc 2
    cpu-map 1 0
    cpu-map 2 1
    #pidfile /var/haproxy/run/haproxy.pid
    log 127.0.0.1 local3 info

defaults
    option http-keep-alive
    option forwardfor
    maxconn 100000
    mode http
    timeout connect 300000ms
    timeout client 300000ms
    timeout server 300000ms

listen stats
    mode http
    bind 0.0.0.0:9999
    stats enable
    log global
    stats uri /haproxy-status
    stats auth admin:zisefeizhu

#K8S-API-Server
frontend K8S_API
    bind *:8443
    mode tcp
    default_backend k8s_api_nodes_6443

backend k8s_api_nodes_6443
    mode tcp
    balance leastconn
    server 20.0.0.201 20.0.0.201:6443 check inter 2000 fall 3 rise 5
    server 20.0.0.202 20.0.0.202:6443 check inter 2000 fall 3 rise 5
    server 20.0.0.203 20.0.0.203:6443 check inter 2000 fall 3 rise 5

#k8s-ha02
[root@k8s-ha01 ~]# scp /etc/haproxy/haproxy.cfg 20.0.0.208:/etc/haproxy/haproxy.cfg

1.5.1.4 Set the service start order and dependency
#on k8s-ha01 and k8s-ha02
[root@k8s-ha01 ~]# vim /usr/lib/systemd/system/keepalived.service
[Unit]
Description=LVS and VRRP High Availability Monitor
After=syslog.target network-online.target haproxy.service
Requires=haproxy.service
......

1.5.1.5 Health-check script
[root@k8s-ha01 ~]# vim /service/scripts/chk_hapro.sh
[root@k8s-ha01 ~]# cat /service/scripts/chk_hapro.sh
##########################################################################
#Author:                     zisefeizhu
#QQ:                         2********0
#Date:                       2019-07-26
#FileName:                   /service/scripts/chk_hapro.sh
#URL:                        https://www.cnblogs.com/zisefeizhu/
#Description:                The test script
#Copyright (C):              2019 All rights reserved
##########################################################################
#!/bin/bash
counts=$(ps -ef|grep -w "haproxy"|grep -v grep|wc -l)
if [ "${counts}" = "0" ]; then
    systemctl restart keepalived.service
    sleep 2
    counts=$(ps -ef|grep -w "haproxy"|grep -v grep|wc -l)
    if [ "${counts}" = "0" ]; then
        systemctl stop keepalived.service
    fi
fi

Because keepalived.service was given Requires=haproxy.service in 1.5.1.4, restarting keepalived also pulls haproxy back up; if haproxy still is not running after that, keepalived is stopped so the VIP fails over to the other node.

1.5.1.6 Start the services
[root@k8s-ha01 ~]# systemctl enable keepalived && systemctl start keepalived && systemctl enable haproxy && systemctl start haproxy && systemctl status keepalived && systemctl status haproxy

1.5.1.7 Failover test

[root@k8s-ha01 ~]# systemctl stop keepalived
#refresh the page in the browser
[root@k8s-ha01 ~]# systemctl start keepalived
[root@k8s-ha01 ~]# systemctl stop haproxy
#refresh the page in the browser
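
The failover can also be watched from the command line: the VIP 20.0.0.250 defined above should disappear from the node whose services were stopped and show up on the other one (a supplementary check):

[root@k8s-ha01 ~]# ip addr show eth0 | grep 20.0.0.250
[root@k8s-ha02 ~]# ip addr show eth0 | grep 20.0.0.250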

1.5.2 Deploy the kubernetes cluster

1.5.2.1 VM initialization

Using k8s-master01 as an example

Add hosts entries for every VM

[root@k8s-master01 ~]# cat >> /etc/hosts << EOF
> 20.0.0.201  k8s-master01
> 20.0.0.202  k8s-master02
> 20.0.0.203  k8s-master03
> 20.0.0.204  k8s-node01
> 20.0.0.205  k8s-node02
> 20.0.0.206  k8s-node03
> EOF

Passwordless SSH login

[root@k8s-master01 ~]# vim /service/scripts/ssh-copy.sh
##########################################################################
#Author:                     zisefeizhu
#QQ:                         2********0
#Date:                       2019-07-27
#FileName:                   /service/scripts/ssh-copy.sh
#URL:                        https://www.cnblogs.com/zisefeizhu/
#Description:                The test script
#Copyright (C):              2019 All rights reserved
##########################################################################
#!/bin/bash
#target host list
IP="
20.0.0.201
k8s-master01
20.0.0.202
k8s-master02
20.0.0.203
k8s-master03
20.0.0.204
k8s-node01
20.0.0.205
k8s-node02
20.0.0.206
k8s-node03
"
for node in ${IP};do
  sshpass -p 1 ssh-copy-id  ${node}  -o StrictHostKeyChecking=no
  if [ $? -eq 0 ];then
    echo "${node} 秘钥copy完成"
  else
    echo "${node} 秘钥copy失败"
  fi
done
[root@k8s-master01 ~]# ssh-keygen -t rsa
[root@k8s-master01 ~]# sh /service/scripts/ssh-copy.sh
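
A quick way to confirm the keys were distributed is to run a command over SSH against every host; each should answer with its hostname and without a password prompt:

[root@k8s-master01 ~]# for node in k8s-master0{1..3} k8s-node0{1..3}; do ssh $node hostname; done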

Disable swap

[root@k8s-master01 ~]# swapoff -a
[root@k8s-master01 ~]# yes | cp /etc/fstab /etc/fstab_bak
[root@k8s-master01 ~]# cat /etc/fstab_bak |grep -v swap > /etc/fstab
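
swapoff -a only disables swap for the running system and the fstab edit keeps it off after a reboot; both can be verified with free, which should show 0B of swap:

[root@k8s-master01 ~]# free -h | grep -i swap
Swap:            0B          0B          0B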

Add the kubernetes yum repository

[root@k8s-master01 ~]# cat << EOF > /etc/yum.repos.d/kubernetes.repo
> [kubernetes]
> name=Kubernetes
> baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
> enabled=1
> gpgcheck=1
> repo_gpgcheck=1
> gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
> EOF

1.5.2.2 Install docker

Using k8s-master01 as an example

Install some required system utilities

[root@k8s-master01 ~]# yum install -y yum-utils device-mapper-persistent-data lvm2

Install docker

[root@k8s-master01 ~]# yum-config-manager --add-repo http://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
[root@k8s-master01 ~]# yum list docker-ce --showduplicates | sort -r
[root@k8s-master01 ~]# yum -y install docker-ce-18.06.3.ce-3.el7

Configure daemon.json

#get a registry mirror (accelerator) address
Alibaba Cloud
   open https://cr.console.aliyun.com/#/accelerator
        register, log in, and set a password
        the page then shows your accelerator address, something like: https://123abc.mirror.aliyuncs.com
Tencent Cloud (only usable from Tencent Cloud hosts)
   accelerator address: https://mirror.ccs.tencentyun.com
 
#write the configuration
[root@k8s-master01 ~]# mkdir -p /etc/docker/ \
&& cat > /etc/docker/daemon.json << EOF
{
     "registry-mirrors":[
         "https://c6ai9izk.mirror.aliyuncs.com"
     ],
     "max-concurrent-downloads":3,
     "data-root":"/data/docker",
     "log-driver":"json-file",
     "log-opts":{
         "max-size":"100m",
         "max-file":"1"
     },
     "max-concurrent-uploads":5,
     "storage-driver":"overlay2",
     "storage-opts": [
     "overlay2.override_kernel_check=true"
   ],
  "live-restore": true,       <--- 保证 docker daemon重启,但容器不重启
    "exec-opts": [
    "native.cgroupdriver=systemd"
    ]
 }
 EOF

Start docker and check its status

[root@k8s-master01 ~]# systemctl enable docker \
> && systemctl restart docker \
> && systemctl status docker
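
It is worth confirming that docker picked up the settings from daemon.json, in particular the systemd cgroup driver and the overlay2 storage driver that the later kubeadm steps assume (a quick sanity check; the exact docker info layout may vary by version):

[root@k8s-master01 ~]# docker info | grep -E "Storage Driver|Cgroup Driver|Docker Root Dir"
Storage Driver: overlay2
Cgroup Driver: systemd
Docker Root Dir: /data/docker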

Note: annotated daemon.json reference

{
    "authorization-plugins": [],   //authorization plugins
    "data-root": "",   //root directory for docker's persistent data
    "dns": [],   //DNS servers
    "dns-opts": [],   //DNS options, such as ports
    "dns-search": [],   //DNS search domains
    "exec-opts": [],   //runtime execution options
    "exec-root": "",   //root directory for execution state files
    "experimental": false,   //enable experimental features
    "storage-driver": "",   //storage driver
    "storage-opts": [],   //storage driver options
    "labels": [],   //key=value labels for the docker daemon metadata
    "live-restore": true,   //keep containers running while the daemon is down (prevents container exits caused by a docker service failure)
    "log-driver": "",   //container log driver
    "log-opts": {},   //container log options
    "mtu": 0,   //MTU for container networks
    "pidfile": "",   //location of the daemon PID file
    "cluster-store": "",   //URL of the cluster store
    "cluster-store-opts": {},   //cluster store options
    "cluster-advertise": "",   //address advertised to the cluster
    "max-concurrent-downloads": 3,   //max concurrency for each pull
    "max-concurrent-uploads": 5,   //max concurrency for each push
    "default-shm-size": "64M",   //default shared memory size
    "shutdown-timeout": 15,   //shutdown timeout in seconds
    "debug": true,   //enable debug mode
    "hosts": [],   //daemon listen addresses
    "log-level": "",   //logging level
    "tls": true,   //enable TLS
    "tlsverify": true,   //enable TLS and verify the remote
    "tlscacert": "",   //path to the CA certificate
    "tlscert": "",   //path to the TLS certificate
    "tlskey": "",   //path to the TLS key
    "swarm-default-advertise-addr": "",   //address advertised to the swarm
    "api-cors-header": "",   //CORS (Cross-Origin Resource Sharing) headers for the API
    "selinux-enabled": false,   //enable SELinux (mandatory access control over users, processes, applications and files)
    "userns-remap": "",   //user/group for user-namespace remapping
    "group": "",   //group that owns the docker socket
    "cgroup-parent": "",   //parent cgroup for all containers
    "default-ulimits": {},   //default ulimits for all containers
    "init": false,   //run an init inside containers to forward signals and reap processes
    "init-path": "/usr/libexec/docker-init",   //path to the docker-init binary
    "ipv6": false,   //enable IPv6 networking
    "iptables": false,   //let docker manage iptables rules
    "ip-forward": false,   //enable net.ipv4.ip_forward
    "ip-masq": false,   //enable IP masquerading (rewriting the source or destination IP as packets pass a router or firewall)
    "userland-proxy": false,   //use the userland proxy
    "userland-proxy-path": "/usr/libexec/docker-proxy",   //path to the userland proxy binary
    "ip": "0.0.0.0",   //default IP when binding container ports
    "bridge": "",   //bridge to attach containers to
    "bip": "",   //IP address of the docker0 bridge
    "fixed-cidr": "",   //IPv4 subnet that container IPs are allocated from, used to constrain which network segment containers live in
    "fixed-cidr-v6": "",   //IPv6 subnet for container IP allocation
    "default-gateway": "",   //default gateway
    "default-gateway-v6": "",   //default IPv6 gateway
    "icc": false,   //allow inter-container communication
    "raw-logs": false,   //raw logs (no colors, full timestamps)
    "allow-nondistributable-artifacts": [],   //registries to which nondistributable artifacts may be pushed
    "registry-mirrors": [],   //registry mirrors
    "seccomp-profile": "",   //seccomp profile path
    "insecure-registries": [],   //registries accessed without HTTPS
    "no-new-privileges": false,   //disallow container processes from gaining new privileges
    "default-runtime": "runc",   //default OCI (Open Container Initiative) runtime
    "oom-score-adjust": -500,   //OOM-kill priority of the daemon (-1000 to 1000)
    "node-generic-resources": ["NVIDIA-GPU=UUID1", "NVIDIA-GPU=UUID2"],   //generic node resources to advertise
    "runtimes": {   //additional runtimes
        "cc-runtime": {
            "path": "/usr/bin/cc-runtime"
        },
        "custom": {
            "path": "/usr/local/bin/my-runc-replacement",
            "runtimeArgs": [
                "--debug"
            ]
        }
    }
}

1.5.2.3 Deploy kubernetes with kubeadm

Using k8s-master01 as an example

Install the required packages

[root@k8s-master01 ~]# yum list  kubelet kubeadm kubectl --showduplicates | sort -r
[root@k8s-master01 ~]# yum install -y kubelet-1.15.1 kubeadm-1.15.1 kubectl-1.15.1 ipvsadm ipset
 
##enable kubelet at boot; note: do not run systemctl start kubelet at this point, it would fail; kubelet comes up automatically once initialization has completed
[root@k8s-master01 ~]# systemctl enable kubelet
 
#kubectl command completion
[root@k8s-master01 ~]# source /usr/share/bash-completion/bash_completion
[root@k8s-master01 ~]# source <(kubectl completion bash)
[root@k8s-master01 ~]# echo "source <(kubectl completion bash)" >> ~/.bashrc

Adjust the init configuration

Use kubeadm config print init-defaults > kubeadm-init.yaml to dump the default configuration, then adapt it to your environment.
Note
The fields that need changing are advertiseAddress, controlPlaneEndpoint, imageRepository, serviceSubnet and kubernetesVersion:
advertiseAddress is the IP of master01
controlPlaneEndpoint is the VIP plus port 8443
imageRepository is changed to the Aliyun mirror
serviceSubnet is an unused IP range (ask the network team for one)
kubernetesVersion must match the version installed in the previous step
[root@k8s-master01 ~]# cd /data/
[root@k8s-master01 data]# ll
[root@k8s-master01 data]# mkdir tmp
[root@k8s-master01 data]# cd tmp
[root@k8s-master01 tmp]# kubeadm config print init-defaults > kubeadm-init.yaml
[root@k8s-master01 tmp]# cp kubeadm-init.yaml{,.bak}
[root@k8s-master01 tmp]# vim kubeadm-init.yaml
[root@k8s-master01 tmp]# diff kubeadm-init.yaml{,.bak}
12c12
<   advertiseAddress: 20.0.0.201
---
>   advertiseAddress: 1.2.3.4
26d25
< controlPlaneEndpoint: "20.0.0.250:8443"
33c32
< imageRepository: registry.cn-hangzhou.aliyuncs.com/google_containers
---
> imageRepository: k8s.gcr.io
35c34
< kubernetesVersion: v1.15.1
---
> kubernetesVersion: v1.14.0
38c37
<   serviceSubnet: 10.0.0.0/16
---
>   serviceSubnet: 10.96.0.0/12
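
For reference, after these edits the ClusterConfiguration half of kubeadm-init.yaml should look roughly like the sketch below; everything except the five fields from the diff is left at the defaults printed by kubeadm config print init-defaults, so treat this as an illustration rather than a file to copy verbatim:

apiServer:
  timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta2
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controlPlaneEndpoint: "20.0.0.250:8443"
controllerManager: {}
dns:
  type: CoreDNS
etcd:
  local:
    dataDir: /var/lib/etcd
imageRepository: registry.cn-hangzhou.aliyuncs.com/google_containers
kind: ClusterConfiguration
kubernetesVersion: v1.15.1
networking:
  dnsDomain: cluster.local
  serviceSubnet: 10.0.0.0/16
scheduler: {}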

Download the images

#list the required image versions
[root@k8s-master01 tmp]# kubeadm config images list
k8s.gcr.io/kube-apiserver:v1.15.1
k8s.gcr.io/kube-controller-manager:v1.15.1
k8s.gcr.io/kube-scheduler:v1.15.1
k8s.gcr.io/kube-proxy:v1.15.1
k8s.gcr.io/pause:3.1
k8s.gcr.io/etcd:3.3.10
k8s.gcr.io/coredns:1.3.1
#pull the required images
[root@k8s-master01 tmp]# kubeadm config images pull --config kubeadm-init.yaml
[config/images] Pulled registry.cn-hangzhou.aliyuncs.com/google_containers/kube-apiserver:v1.15.1
[config/images] Pulled registry.cn-hangzhou.aliyuncs.com/google_containers/kube-controller-manager:v1.15.1
[config/images] Pulled registry.cn-hangzhou.aliyuncs.com/google_containers/kube-scheduler:v1.15.1
[config/images] Pulled registry.cn-hangzhou.aliyuncs.com/google_containers/kube-proxy:v1.15.1
[config/images] Pulled registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.1
[config/images] Pulled registry.cn-hangzhou.aliyuncs.com/google_containers/etcd:3.3.10
[config/images] Pulled registry.cn-hangzhou.aliyuncs.com/google_containers/coredns:1.3.1

Initialize the first master

[root@k8s-master01 tmp]# kubeadm init --config kubeadm-init.yaml
[init] Using Kubernetes version: v1.15.1
[preflight] Running pre-flight checks
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Activating the kubelet service
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [k8s-master01 kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.0.0.1 20.0.0.201 20.0.0.250]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [k8s-master01 localhost] and IPs [20.0.0.201 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [k8s-master01 localhost] and IPs [20.0.0.201 127.0.0.1 ::1]
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[endpoint] WARNING: port specified in controlPlaneEndpoint overrides bindPort in the controlplane address
[kubeconfig] Writing "admin.conf" kubeconfig file
[endpoint] WARNING: port specified in controlPlaneEndpoint overrides bindPort in the controlplane address
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[endpoint] WARNING: port specified in controlPlaneEndpoint overrides bindPort in the controlplane address
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[endpoint] WARNING: port specified in controlPlaneEndpoint overrides bindPort in the controlplane address
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[kubelet-check] Initial timeout of 40s passed.
[apiclient] All control plane components are healthy after 57.514816 seconds
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config-1.15" in namespace kube-system with the configuration for the kubelets in the cluster
[upload-certs] Skipping phase. Please see --upload-certs
[mark-control-plane] Marking the node k8s-master01 as control-plane by adding the label "node-role.kubernetes.io/master=''"
[mark-control-plane] Marking the node k8s-master01 as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]
[bootstrap-token] Using token: abcdef.0123456789abcdef
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstrap-token] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
[addons] Applied essential addon: CoreDNS
[endpoint] WARNING: port specified in controlPlaneEndpoint overrides bindPort in the controlplane address
[addons] Applied essential addon: kube-proxy
 
Your Kubernetes control-plane has initialized successfully!
 
To start using your cluster, you need to run the following as a regular user:
 
  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config
 
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/
 
You can now join any number of control-plane nodes by copying certificate authorities
and service account keys on each node and then running the following as root:
 
  kubeadm join 20.0.0.250:8443 --token abcdef.0123456789abcdef \
    --discovery-token-ca-cert-hash sha256:cdfa555306ee75391e03eef75b8fa16ba121f5a9effe85e81874f6207b610c9f \
    --control-plane   
 
Then you can join any number of worker nodes by running the following on each as root:
 
kubeadm join 20.0.0.250:8443 --token abcdef.0123456789abcdef \
    --discovery-token-ca-cert-hash sha256:cdfa555306ee75391e03eef75b8fa16ba121f5a9effe85e81874f6207b610c9f

Note: kubeadm init performs the following steps

[init]: initialization for the specified version
[preflight]: pre-flight checks and pulling of the required Docker images
[kubelet-start]: generates the kubelet configuration file "/var/lib/kubelet/config.yaml"; the kubelet cannot start without it, which is why the kubelet fails to start before initialization.
[certificates]: generates the certificates Kubernetes uses and stores them under /etc/kubernetes/pki.
[kubeconfig]: generates the kubeconfig files under /etc/kubernetes; the components use them to talk to each other.
[control-plane]: installs the master components from the YAML manifests under /etc/kubernetes/manifests.
[etcd]: installs the etcd service from /etc/kubernetes/manifests/etcd.yaml.
[wait-control-plane]: waits for the master components started as static Pods to come up.
[apiclient]: checks the health of the master components.
[uploadconfig]: uploads the configuration that was used.
[kubelet]: configures the kubelet via a ConfigMap.
[patchnode]: records CNI information on the Node object through annotations.
[mark-control-plane]: labels the current node with the master role and taints it as unschedulable, so ordinary Pods are not scheduled onto masters by default.
[bootstrap-token]: generates the token that kubeadm join uses later when adding nodes to the cluster.
[addons]: installs the CoreDNS and kube-proxy add-ons.

Prepare the kubeconfig file for kubectl

#kubectl looks for a config file under the .kube directory in the home directory of the user running it; copy the admin.conf generated during the [kubeconfig] phase of init to .kube/config
[root@k8s-master01 tmp]# mkdir -p $HOME/.kube
[root@k8s-master01 tmp]# sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
[root@k8s-master01 tmp]# sudo chown $(id -u):$(id -g) $HOME/.kube/config

Check component status

[root@k8s-master01 tmp]# kubectl get cs
NAME                 STATUS    MESSAGE             ERROR
scheduler            Healthy   ok                  
controller-manager   Healthy   ok                  
etcd-0               Healthy   {"health":"true"}   
[root@k8s-master01 tmp]# kubectl get nodes
NAME           STATUS     ROLES    AGE     VERSION
k8s-master01   NotReady   master   6m23s   v1.15.1
 
So far there is only one node; its role is master and its status is NotReady, because no network plugin has been installed yet.

Deploy the other masters

Copy the certificate files from k8s-master01 to k8s-master02 and k8s-master03.
Run the following on k8s-master01.
#copy the certificates to the k8s-master02 and k8s-master03 nodes
[root@k8s-master01 ~]# vim /service/scripts/k8s-master-zhengshu-master02.sh
##########################################################################
#Author:                     zisefeizhu
#QQ:                         2********0
#Date:                       2019-07-27
#FileName:                   /service/scripts/k8s-master-zhengshu-master02.sh
#URL:                        https://www.cnblogs.com/zisefeizhu/
#Description:                The test script
#Copyright (C):              2019 All rights reserved
##########################################################################
#!/bin/bash
USER=root
CONTROL_PLANE_IPS="k8s-master02 k8s-master03"
for host in ${CONTROL_PLANE_IPS}; do
    ssh "${USER}"@$host "mkdir -p /etc/kubernetes/pki/etcd"
    scp /etc/kubernetes/pki/ca.* "${USER}"@$host:/etc/kubernetes/pki/
    scp /etc/kubernetes/pki/sa.* "${USER}"@$host:/etc/kubernetes/pki/
    scp /etc/kubernetes/pki/front-proxy-ca.* "${USER}"@$host:/etc/kubernetes/pki/
    scp /etc/kubernetes/pki/etcd/ca.* "${USER}"@$host:/etc/kubernetes/pki/etcd/
    scp /etc/kubernetes/admin.conf "${USER}"@$host:/etc/kubernetes/
done
[root@k8s-master01 ~]# sh -x /service/scripts/k8s-master-zhengshu-master02.sh
 
#run on k8s-master02; note the --experimental-control-plane flag
[root@k8s-master02 ~]# kubeadm join 20.0.0.250:8443 --token abcdef.0123456789abcdef \
>     --discovery-token-ca-cert-hash sha256:cdfa555306ee75391e03eef75b8fa16ba121f5a9effe85e81874f6207b610c9f  \
>    --experimental-control-plane
Flag --experimental-control-plane has been deprecated, use --control-plane instead
[preflight] Running pre-flight checks
[WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
[preflight] Running pre-flight checks before initializing the new control plane instance
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [k8s-master02 localhost] and IPs [20.0.0.202 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [k8s-master02 localhost] and IPs [20.0.0.202 127.0.0.1 ::1]
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [k8s-master02 kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.0.0.1 20.0.0.202 20.0.0.250]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Valid certificates and keys now exist in "/etc/kubernetes/pki"
[certs] Using the existing "sa" key
[kubeconfig] Generating kubeconfig files
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[endpoint] WARNING: port specified in controlPlaneEndpoint overrides bindPort in the controlplane address
[kubeconfig] Using existing kubeconfig file: "/etc/kubernetes/admin.conf"
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[check-etcd] Checking that the etcd cluster is healthy
[kubelet-start] Downloading configuration for the kubelet from the "kubelet-config-1.15" ConfigMap in the kube-system namespace
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Activating the kubelet service
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...
[etcd] Announced new etcd member joining to the existing etcd cluster
[etcd] Wrote Static Pod manifest for a local etcd member to "/etc/kubernetes/manifests/etcd.yaml"
[etcd] Waiting for the new etcd member to join the cluster. This can take up to 40s
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[mark-control-plane] Marking the node k8s-master02 as control-plane by adding the label "node-role.kubernetes.io/master=''"
[mark-control-plane] Marking the node k8s-master02 as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]
 
This node has joined the cluster and a new control plane instance was created:
 
* Certificate signing request was sent to apiserver and approval was received.
* The Kubelet was informed of the new secure connection details.
* Control plane (master) label and taint were applied to the new node.
* The Kubernetes control plane instances scaled up.
* A new etcd member was added to the local/stacked etcd cluster.
 
To start administering your cluster from this node, you need to run the following as a regular user:
 
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
 
Run 'kubectl get nodes' to see this node join the cluster.
[root@k8s-master02 ~]# mkdir -p $HOME/.kube
[root@k8s-master02 ~]# sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
[root@k8s-master02 ~]# sudo chown $(id -u):$(id -g) $HOME/.kube/config
 
#run on k8s-master03; note the --experimental-control-plane flag
[root@k8s-master03 ~]# kubeadm join 20.0.0.250:8443 --token abcdef.0123456789abcdef \
>     --discovery-token-ca-cert-hash sha256:cdfa555306ee75391e03eef75b8fa16ba121f5a9effe85e81874f6207b610c9f  \
>    --experimental-control-plane
Flag --experimental-control-plane has been deprecated, use --control-plane instead
[preflight] Running pre-flight checks
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
[preflight] Running pre-flight checks before initializing the new control plane instance
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [k8s-master03 localhost] and IPs [20.0.0.203 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [k8s-master03 localhost] and IPs [20.0.0.203 127.0.0.1 ::1]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [k8s-master03 kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.0.0.1 20.0.0.203 20.0.0.250]
[certs] Valid certificates and keys now exist in "/etc/kubernetes/pki"
[certs] Using the existing "sa" key
[kubeconfig] Generating kubeconfig files
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[endpoint] WARNING: port specified in controlPlaneEndpoint overrides bindPort in the controlplane address
[kubeconfig] Using existing kubeconfig file: "/etc/kubernetes/admin.conf"
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[check-etcd] Checking that the etcd cluster is healthy
[kubelet-start] Downloading configuration for the kubelet from the "kubelet-config-1.15" ConfigMap in the kube-system namespace
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Activating the kubelet service
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...
[etcd] Announced new etcd member joining to the existing etcd cluster
[etcd] Wrote Static Pod manifest for a local etcd member to "/etc/kubernetes/manifests/etcd.yaml"
[etcd] Waiting for the new etcd member to join the cluster. This can take up to 40s
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[mark-control-plane] Marking the node k8s-master03 as control-plane by adding the label "node-role.kubernetes.io/master=''"
[mark-control-plane] Marking the node k8s-master03 as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]
 
This node has joined the cluster and a new control plane instance was created:
 
* Certificate signing request was sent to apiserver and approval was received.
* The Kubelet was informed of the new secure connection details.
* Control plane (master) label and taint were applied to the new node.
* The Kubernetes control plane instances scaled up.
* A new etcd member was added to the local/stacked etcd cluster.
 
To start administering your cluster from this node, you need to run the following as a regular user:
 
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
 
Run 'kubectl get nodes' to see this node join the cluster.
 
[root@k8s-master03 ~]# mkdir -p $HOME/.kube
[root@k8s-master03 ~]# sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
[root@k8s-master03 ~]# sudo chown $(id -u):$(id -g) $HOME/.kube/config
 
[root@k8s-master03 ~]# kubectl get nodes
NAME           STATUS     ROLES    AGE     VERSION
k8s-master01   NotReady   master   38m     v1.15.1
k8s-master02   NotReady   master   4m52s   v1.15.1
k8s-master03   NotReady   master   84s     v1.15.1

Deploy the worker nodes

Run the join on k8s-node01, k8s-node02 and k8s-node03; note that there is no --experimental-control-plane flag.

Note: tokens have a limited lifetime; if the old token has expired, create a new one on a master node with kubeadm token create --print-join-command.
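
A freshly generated join command has the same shape as the one printed by kubeadm init (shown here with placeholders rather than real values):

[root@k8s-master01 ~]# kubeadm token create --print-join-command
kubeadm join 20.0.0.250:8443 --token <new-token> --discovery-token-ca-cert-hash sha256:<ca-cert-hash>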

Run the following command on each worker node.

Using k8s-node01 as an example

[root@k8s-node01 ~]# kubeadm join 20.0.0.250:8443 --token abcdef.0123456789abcdef \
>     --discovery-token-ca-cert-hash sha256:cdfa555306ee75391e03eef75b8fa16ba121f5a9effe85e81874f6207b610c9f
[preflight] Running pre-flight checks
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
[kubelet-start] Downloading configuration for the kubelet from the "kubelet-config-1.15" ConfigMap in the kube-system namespace
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Activating the kubelet service
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...
 
This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.
 
Run 'kubectl get nodes' on the control-plane to see this node join the cluster.
 
[root@k8s-master01 ~]# kubectl get nodes
NAME           STATUS     ROLES    AGE     VERSION
k8s-master01   NotReady   master   45m     v1.15.1
k8s-master02   NotReady   master   12m     v1.15.1
k8s-master03   NotReady   master   8m49s   v1.15.1
k8s-node01     NotReady   <none>   3m46s   v1.15.1
k8s-node02     NotReady   <none>   3m42s   v1.15.1
k8s-node03     NotReady   <none>   24s     v1.15.1

The calico network plugin

#download the calico.yaml manifest
[root@k8s-master01 tmp]# wget -c https://docs.projectcalico.org/v3.6/getting-started/kubernetes/installation/hosted/kubernetes-datastore/calico-networking/1.7/calico.yaml
 
#edit calico.yaml and change the value under CALICO_IPV4POOL_CIDR to the serviceSubnet value set earlier
[root@k8s-master01 tmp]# cp calico.yaml{,.bak}
[root@k8s-master01 tmp]# vim calico.yaml
[root@k8s-master01 tmp]# diff calico.yaml{,.bak}
598c598
<               value: "10.0.0.0/16"
---
>               value: "192.168.0.0/16"
 
#install calico
[root@k8s-master01 tmp]# kubectl apply -f calico.yaml
configmap/calico-config created
customresourcedefinition.apiextensions.k8s.io/felixconfigurations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ipamblocks.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/blockaffinities.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ipamhandles.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ipamconfigs.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/bgppeers.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/bgpconfigurations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ippools.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/hostendpoints.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/clusterinformations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/globalnetworkpolicies.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/globalnetworksets.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/networkpolicies.crd.projectcalico.org created
clusterrole.rbac.authorization.k8s.io/calico-kube-controllers created
clusterrolebinding.rbac.authorization.k8s.io/calico-kube-controllers created
clusterrole.rbac.authorization.k8s.io/calico-node created
clusterrolebinding.rbac.authorization.k8s.io/calico-node created
daemonset.extensions/calico-node created
serviceaccount/calico-node created
deployment.extensions/calico-kube-controllers created
serviceaccount/calico-kube-controllers created

Check node status

[root@k8s-master01 tmp]# kubectl get nodes
NAME           STATUS     ROLES    AGE   VERSION
k8s-master01   Ready      master   59m   v1.15.1
k8s-master02   Ready      master   25m   v1.15.1
k8s-master03   Ready      master   22m   v1.15.1
k8s-node01     NotReady   <none>   17m   v1.15.1
k8s-node02     NotReady   <none>   17m   v1.15.1
k8s-node03     NotReady   <none>   14m   v1.15.1
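
The worker nodes turn Ready once the calico-node pod on each of them has pulled its image and started; if they stay NotReady for a while, progress can be watched with:

[root@k8s-master01 tmp]# kubectl get pods -n kube-system -o wide -w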

Enable ipvs mode for kube-proxy

#edit config.conf in the kube-system/kube-proxy ConfigMap and set mode: "ipvs":
[root@k8s-master01 ~]# kubectl edit cm kube-proxy -n kube-system
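#after the edit, the relevant excerpt of config.conf inside the ConfigMap should look like this (illustrative excerpt, other fields unchanged):
#    config.conf: |-
#      ...
#      mode: "ipvs"
#      ...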
 
#restart the kube-proxy pods
[root@k8s-master01 ~]#  kubectl get pod -n kube-system | grep kube-proxy | awk '{system("kubectl delete pod "$1" -n kube-system")}'
pod "kube-proxy-4skt5" deleted
pod "kube-proxy-fxjl5" deleted
pod "kube-proxy-k5q6x" deleted
pod "kube-proxy-q47jk" deleted
pod "kube-proxy-rc6pg" deleted
pod "kube-proxy-wwm49" deleted
 
#check the kube-proxy pod status
[root@k8s-master01 ~]# kubectl get pod -n kube-system | grep kube-proxy
kube-proxy-7vg6s                           1/1     Running   0          82s
kube-proxy-dtpvd                           1/1     Running   0          2m2s
kube-proxy-hd8sk                           1/1     Running   0          114s
kube-proxy-lscgw                           1/1     Running   0          97s
kube-proxy-ssv94                           1/1     Running   0          106s
kube-proxy-vdlx7                           1/1     Running   0          79s
 
#check that ipvs is being used
[root@k8s-master01 ~]# kubectl logs kube-proxy-ssv94 -n kube-system
I0727 02:23:52.411755       1 server_others.go:170] Using ipvs Proxier.
W0727 02:23:52.412270       1 proxier.go:395] clusterCIDR not specified, unable to distinguish between internal and external traffic
W0727 02:23:52.412302       1 proxier.go:401] IPVS scheduler not specified, use rr by default
I0727 02:23:52.412480       1 server.go:534] Version: v1.15.1
I0727 02:23:52.427788       1 conntrack.go:52] Setting nf_conntrack_max to 131072
I0727 02:23:52.428163       1 config.go:187] Starting service config controller
I0727 02:23:52.428199       1 config.go:96] Starting endpoints config controller
I0727 02:23:52.428221       1 controller_utils.go:1029] Waiting for caches to sync for endpoints config controller
I0727 02:23:52.428233       1 controller_utils.go:1029] Waiting for caches to sync for service config controller
I0727 02:23:52.628536       1 controller_utils.go:1036] Caches are synced for service config controller
I0727 02:23:52.628636       1 controller_utils.go:1036] Caches are synced for endpoints config controller
[root@k8s-master01 ~]# kubectl logs kube-proxy-ssv94 -n kube-system  | grep "ipvs"
I0727 02:23:52.411755       1 server_others.go:170] Using ipvs Proxier.

Check IPVS status

[root@k8s-master01 ~]# ipvsadm -Ln
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP  10.0.0.1:443 rr
  -> 20.0.0.201:6443              Masq    1      0          0         
  -> 20.0.0.202:6443              Masq    1      0          0         
  -> 20.0.0.203:6443              Masq    1      0          0         
TCP  10.0.0.10:53 rr
  -> 10.0.122.129:53              Masq    1      0          0         
  -> 10.0.195.0:53                Masq    1      0          0         
TCP  10.0.0.10:9153 rr
  -> 10.0.122.129:9153            Masq    1      0          0         
  -> 10.0.195.0:9153              Masq    1      0          0         
UDP  10.0.0.10:53 rr
  -> 10.0.122.129:53              Masq    1      0          0         
  -> 10.0.195.0:53                Masq    1      0          0       
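
If ipvsadm shows no virtual servers, it is worth confirming that the IPVS kernel modules are actually loaded (a quick check that can be run on any node):

lsmod | grep -e ip_vs -e nf_conntrack
# expect to see ip_vs, ip_vs_rr, ip_vs_wrr, ip_vs_sh and a conntrack module listed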

Check cluster status

[root@k8s-master01 ~]# kubectl get all -n kube-system
NAME                                           READY   STATUS    RESTARTS   AGE
pod/calico-kube-controllers-7c4d64d599-w24xk   1/1     Running   0          23m
pod/calico-node-9hzdk                          1/1     Running   0          23m
pod/calico-node-c7xbq                          1/1     Running   0          23m
pod/calico-node-gz967                          1/1     Running   0          23m
pod/calico-node-hkcjr                          1/1     Running   0          23m
pod/calico-node-pb9h4                          1/1     Running   0          23m
pod/calico-node-w75b8                          1/1     Running   0          23m
pod/coredns-6967fb4995-wv2j5                   1/1     Running   0          77m
pod/coredns-6967fb4995-ztrlt                   1/1     Running   1          77m
pod/etcd-k8s-master01                          1/1     Running   0          76m
pod/etcd-k8s-master02                          1/1     Running   0          44m
pod/etcd-k8s-master03                          1/1     Running   0          40m
pod/kube-apiserver-k8s-master01                1/1     Running   0          76m
pod/kube-apiserver-k8s-master02                1/1     Running   0          44m
pod/kube-apiserver-k8s-master03                1/1     Running   0          39m
pod/kube-controller-manager-k8s-master01       1/1     Running   4          76m
pod/kube-controller-manager-k8s-master02       1/1     Running   1          44m
pod/kube-controller-manager-k8s-master03       1/1     Running   2          39m
pod/kube-proxy-7vg6s                           1/1     Running   0          13m
pod/kube-proxy-dtpvd                           1/1     Running   0          14m
pod/kube-proxy-hd8sk                           1/1     Running   0          13m
pod/kube-proxy-lscgw                           1/1     Running   0          13m
pod/kube-proxy-ssv94                           1/1     Running   0          13m
pod/kube-proxy-vdlx7                           1/1     Running   0          13m
pod/kube-scheduler-k8s-master01                1/1     Running   4          76m
pod/kube-scheduler-k8s-master02                1/1     Running   1          44m
pod/kube-scheduler-k8s-master03                1/1     Running   1          39m
 
 
NAME               TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)                  AGE
service/kube-dns   ClusterIP   10.0.0.10    <none>        53/UDP,53/TCP,9153/TCP   77m
 
NAME                         DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR                 AGE
daemonset.apps/calico-node   6         6         6       6            6           beta.kubernetes.io/os=linux   23m
daemonset.apps/kube-proxy    6         6         6       6            6           beta.kubernetes.io/os=linux   77m
 
NAME                                      READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/calico-kube-controllers   1/1     1            1           23m
deployment.apps/coredns                   2/2     2            2           77m
 
NAME                                                 DESIRED   CURRENT   READY   AGE
replicaset.apps/calico-kube-controllers-7c4d64d599   1         1         1       23m
replicaset.apps/coredns-6967fb4995                   2         2         2       77m  

1.5.2.4 Test

#Run an nginx pod
[root@k8s-master01 ~]# mkdir /data/yaml
[root@k8s-master01 ~]# cd /data/yaml
[root@k8s-master01 yaml]# vim nginx.yaml
##########################################################################
#Author:                     zisefeizhu
#QQ:                         2********0
#Date:                       2019-07-27
#FileName:                   nginx.yaml
#URL:                        https://www.cnblogs.com/zisefeizhu/
#Description:                The test script
#Copyright (C):              2019 All rights reserved
###########################################################################
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: my-nginx
spec:
  replicas: 3
  template:
    metadata:
      labels:
        run: my-nginx
    spec:
      containers:
      - name: my-nginx
        image: nginx:1.14
        ports:
        - containerPort: 80
[root@k8s-master01 yaml]# kubectl apply -f nginx.yaml
deployment.extensions/my-nginx created
 
#Check the nginx pods
[root@k8s-master01 yaml]# kubectl get pods -o wide
NAME                        READY   STATUS    RESTARTS   AGE   IP             NODE         NOMINATED NODE   READINESS GATES
my-nginx-6b8796c8f4-2bgjg   1/1     Running   0          49s   10.0.135.130   k8s-node03   <none>           <none>
my-nginx-6b8796c8f4-t2hk6   1/1     Running   0          49s   10.0.58.194    k8s-node02   <none>           <none>
my-nginx-6b8796c8f4-t56rp   1/1     Running   0          49s   10.0.85.194    k8s-node01   <none>           <none>
 
#Test with curl
[root@k8s-master01 yaml]# curl 10.0.135.130
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
    body {
        width: 35em;
        margin: 0 auto;
        font-family: Tahoma, Verdana, Arial, sans-serif;
    }
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>
 
<p>For online documentation and support please refer to
<a href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a href="http://nginx.com/">nginx.com</a>.</p>
 
<p><em>Thank you for using nginx.</em></p>
</body>
</html>
 
#Expose the Deployment to create the my-nginx Service
[root@k8s-master01 yaml]# kubectl expose deployment my-nginx
service/my-nginx exposed
[root@k8s-master01 yaml]# kubectl get service --all-namespaces | grep "my-nginx"
default       my-nginx     ClusterIP   10.0.225.139   <none>        80/TCP                   22s
Note: seeing the "Welcome to nginx" page shows the pods are running normally, which indirectly confirms the cluster is usable.
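As an extra check (not captured above), the Service virtual IP assigned by the expose command can be curled the same way and should return the same nginx welcome page:

curl 10.0.225.139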
#Test DNS
[root@k8s-master01 yaml]# kubectl run curl --image=radial/busyboxplus:curl -it
kubectl run --generator=deployment/apps.v1 is DEPRECATED and will be removed in a future version. Use kubectl run --generator=run-pod/v1 or kubectl create instead.
If you don't see a command prompt, try pressing enter.
[ root@curl-6bf6db5c4f-9mpcs:/ ]$ nslookup kubernetes.default
Server:    10.0.0.10
Address 1: 10.0.0.10 kube-dns.kube-system.svc.cluster.local
 
Name:      kubernetes.default
Address 1: 10.0.0.1 kubernetes.default.svc.cluster.local
#init 6 --> kubectl get pods --all-namespaces --> init 0 --> snapshot
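
Note that extensions/v1beta1 for Deployment is deprecated and is removed in Kubernetes 1.16, so the nginx.yaml above only works on older clusters. An apps/v1 equivalent would look roughly like this (a sketch; apps/v1 additionally requires an explicit selector):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-nginx
spec:
  replicas: 3
  selector:
    matchLabels:
      run: my-nginx
  template:
    metadata:
      labels:
        run: my-nginx
    spec:
      containers:
      - name: my-nginx
        image: nginx:1.14
        ports:
        - containerPort: 80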

1.5.3 dashboard

Images blocked in mainland China can be pulled for free through the gcr.io proxy provided by Microsoft China (http://mirror.azure.cn/help/gcr-proxy-cache.html):

docker pull gcr.azk8s.cn/google_containers/<imagename>:<version>

Download the files

Download these three files from https://github.com/gjmzj/kubeasz/tree/master/manifests/dashboard

[root@k8s-master01 ~]# mkdir /data/tmp/dashboard
[root@k8s-master01 ~]# cd /data/tmp/dashboard
[root@k8s-master01 dashboard]# ll
总用量 16
-rw-r--r-- 1 root root  844 7月  27 16:10 admin-user-sa-rbac.yaml
-rw-r--r-- 1 root root 5198 7月  27 16:13 kubernetes-dashboard.yaml
-rw-r--r-- 1 root root 2710 7月  27 16:12 read-user-sa-rbac.yaml

Deploy the main dashboard yaml manifest

#Change the image pull address
image: gcr.azk8s.cn/google_containers/kubernetes-dashboard-amd64:v1.10.1
[root@k8s-master01 dashboard]# kubectl apply -f kubernetes-dashboard.yaml
secret/kubernetes-dashboard-certs created
serviceaccount/kubernetes-dashboard created
role.rbac.authorization.k8s.io/kubernetes-dashboard-minimal created
rolebinding.rbac.authorization.k8s.io/kubernetes-dashboard-minimal created
deployment.apps/kubernetes-dashboard created
service/kubernetes-dashboard created
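
The image line above was edited by hand before applying; a sed one-liner keyed on the image name would do the same (a sketch):

sed -i 's#image: .*kubernetes-dashboard-amd64.*#image: gcr.azk8s.cn/google_containers/kubernetes-dashboard-amd64:v1.10.1#' kubernetes-dashboard.yaml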

Create a read-write admin Service Account

[root@k8s-master01 dashboard]# kubectl apply -f admin-user-sa-rbac.yaml
serviceaccount/admin-user created
clusterrolebinding.rbac.authorization.k8s.io/admin-user created

Create a read-only Service Account

[root@k8s-master01 dashboard]# kubectl apply -f read-user-sa-rbac.yaml
serviceaccount/dashboard-read-user created
clusterrolebinding.rbac.authorization.k8s.io/dashboard-read-binding created
clusterrole.rbac.authorization.k8s.io/dashboard-read-clusterrole created

Check

#Check pod status
[root@k8s-master01 dashboard]# kubectl get pod -n kube-system | grep dashboard
kubernetes-dashboard-fcfb4cbc-xrbkx        1/1     Running   0          2m38s
 
#Check the dashboard Service
[root@k8s-master01 dashboard]# kubectl get svc -n kube-system|grep dashboard
kubernetes-dashboard   NodePort    10.0.71.179   <none>        443:31021/TCP            2m47s
 
#Check cluster services
[root@k8s-master01 dashboard]# kubectl cluster-info
Kubernetes master is running at https://20.0.0.250:8443
KubeDNS is running at https://20.0.0.250:8443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy
kubernetes-dashboard is running at https://20.0.0.250:8443/api/v1/namespaces/kube-system/services/https:kubernetes-dashboard:/proxy
 
To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
 
#Check the pod logs
[root@k8s-master01 dashboard]# kubectl logs kubernetes-dashboard-fcfb4cbc-xrbkx -n kube-system

Generate certificates

For use in a local Google Chrome browser
#Extract client-certificate-data
[root@k8s-master01 dashboard]# grep 'client-certificate-data' ~/.kube/config | head -n 1 | awk '{print $2}' | base64 -d >> kubecfg.crt
 
#Extract client-key-data
[root@k8s-master01 dashboard]# grep 'client-key-data' ~/.kube/config | head -n 1 | awk '{print $2}' | base64 -d >> kubecfg.key
 
#Generate the p12 bundle
[root@k8s-master01 dashboard]# openssl pkcs12 -export -clcerts -inkey kubecfg.key -in kubecfg.crt -out kubecfg.p12 -name "kubernetes-client"
Enter Export Password:   1
Verifying - Enter Export Password:     1
[root@k8s-master01 dashboard]# ll
总用量 28
-rw-r--r-- 1 root root  844 7月  27 16:10 admin-user-sa-rbac.yaml
-rw-r--r-- 1 root root 1082 7月  27 16:23 kubecfg.crt
-rw-r--r-- 1 root root 1679 7月  27 16:23 kubecfg.key
-rw-r--r-- 1 root root 2464 7月  27 16:23 kubecfg.p12
-rw-r--r-- 1 root root 5198 7月  27 16:13 kubernetes-dashboard.yaml
-rw-r--r-- 1 root root 2710 7月  27 16:12 read-user-sa-rbac.yaml
[root@k8s-master01 dashboard]# sz kubecfg.p12

Import the certificate into Google Chrome:
Note: after importing the kubecfg.p12 file from the previous step, restart the browser.

Export the token

[root@k8s-master01 dashboard]# kubectl -n kube-system describe secret $(kubectl -n kube-system get secret | grep admin-user | awk '{print $1}')
Name:         admin-user-token-ggxf6
Namespace:    kube-system
Labels:       <none>
Annotations:  kubernetes.io/service-account.name: admin-user
              kubernetes.io/service-account.uid: a4cc757e-e710-49ea-8321-d4642d38bbf5
 
Type:  kubernetes.io/service-account-token
 
Data
====
ca.crt:     1025 bytes
namespace:  11 bytes
token:      eyJhbGciOiJSUzI1NiIsImtpZCI6IiJ9.eyJpc3MiOiJrdWJlcm5ldGVzL3NlcnZpY2VhY2NvdW50Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9uYW1lc3BhY2UiOiJrdWJlLXN5c3RlbSIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VjcmV0Lm5hbWUiOiJhZG1pbi11c2VyLXRva2VuLWdneGY2Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9zZXJ2aWNlLWFjY291bnQubmFtZSI6ImFkbWluLXVzZXIiLCJrdWJlcm5ldGVzLmlvL3NlcnZpY2VhY2NvdW50L3NlcnZpY2UtYWNjb3VudC51aWQiOiJhNGNjNzU3ZS1lNzEwLTQ5ZWEtODMyMS1kNDY0MmQzOGJiZjUiLCJzdWIiOiJzeXN0ZW06c2VydmljZWFjY291bnQ6a3ViZS1zeXN0ZW06YWRtaW4tdXNlciJ9.of88fLrICJ6o2SsnvWdCGfkpTJhaaI8aY0-G5VUcafuBQabLSYrdPsGpVSw4HKuAV1OkX3gMP63lx5I7FbLNjuxXGJqNFk9A83IqMwD2HISMNeDMsJZdtxYp_veFAFAJErr_F30pJKX4ad4FryV-LLjaxLt_xTPbZRK-8FERIUnBCa7-1-ds4WI-9qnZq4nIw5i6ws06F-J73KTGq9rYNkL91uPeGRaZEj_9Sc2XGDb6qk8XODghVYvmIIyBBJeRpYgN4384QqHIlE2GmoE8p8gRaC4K0zRrh8_PywL-bJI9NexfdH_78bJWsJBX2TmUjmnicitQGjqzg43Im3AJwQ
 
#Add the token to the kubeconfig
[root@k8s-master01 dashboard]# vim /root/.kube/config   # append:
token:      eyJhbGciOiJSUzI1NiIsImtpZCI6IiJ9.eyJpc3MiOiJrdWJlcm5ldGVzL3NlcnZpY2VhY2NvdW50Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9uYW1lc3BhY2UiOiJrdWJlLXN5c3RlbSIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VjcmV0Lm5hbWUiOiJhZG1pbi11c2VyLXRva2VuLWdneGY2Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9zZXJ2aWNlLWFjY291bnQubmFtZSI6ImFkbWluLXVzZXIiLCJrdWJlcm5ldGVzLmlvL3NlcnZpY2VhY2NvdW50L3NlcnZpY2UtYWNjb3VudC51aWQiOiJhNGNjNzU3ZS1lNzEwLTQ5ZWEtODMyMS1kNDY0MmQzOGJiZjUiLCJzdWIiOiJzeXN0ZW06c2VydmljZWFjY291bnQ6a3ViZS1zeXN0ZW06YWRtaW4tdXNlciJ9.of88fLrICJ6o2SsnvWdCGfkpTJhaaI8aY0-G5VUcafuBQabLSYrdPsGpVSw4HKuAV1OkX3gMP63lx5I7FbLNjuxXGJqNFk9A83IqMwD2HISMNeDMsJZdtxYp_veFAFAJErr_F30pJKX4ad4FryV-LLjaxLt_xTPbZRK-8FERIUnBCa7-1-ds4WI-9qnZq4nIw5i6ws06F-J73KTGq9rYNkL91uPeGRaZEj_9Sc2XGDb6qk8XODghVYvmIIyBBJeRpYgN4384QqHIlE2GmoE8p8gRaC4K0zRrh8_PywL-bJI9NexfdH_78bJWsJBX2TmUjmnicitQGjqzg43Im3AJwQ
 
[root@k8s-master01 dashboard]# cp /root/.kube/config /data/tmp/admin.kubeconfig
[root@k8s-master01 dashboard]# sz /data/tmp/admin.kubeconfig
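
Instead of editing /root/.kube/config by hand, a dedicated kubeconfig for the dashboard admin-user can also be assembled with kubectl config (a sketch; the file name dashboard-admin.kubeconfig is arbitrary and the CA path assumes a default kubeadm layout):

TOKEN=$(kubectl -n kube-system get secret \
  $(kubectl -n kube-system get secret | grep admin-user | awk '{print $1}') \
  -o jsonpath='{.data.token}' | base64 -d)
kubectl config set-cluster kubernetes --server=https://20.0.0.250:8443 \
  --certificate-authority=/etc/kubernetes/pki/ca.crt --embed-certs=true \
  --kubeconfig=dashboard-admin.kubeconfig
kubectl config set-credentials admin-user --token=${TOKEN} \
  --kubeconfig=dashboard-admin.kubeconfig
kubectl config set-context admin-user@kubernetes --cluster=kubernetes \
  --user=admin-user --kubeconfig=dashboard-admin.kubeconfig
kubectl config use-context admin-user@kubernetes --kubeconfig=dashboard-admin.kubeconfig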

Access from a browser

1.5.4 metrics-server

metrics-server discovers all nodes through the kube-apiserver and then calls the kubelet APIs (over https) to obtain CPU, memory and other resource usage for each Node and Pod. Starting with Kubernetes 1.12 the installation scripts dropped Heapster, and from 1.13 support for Heapster was removed entirely; Heapster is no longer maintained. The replacements are:
-> CPU/memory HPA metrics for autoscaling: metrics-server;
-> general-purpose monitoring: a third-party monitoring system that can scrape Prometheus-format metrics, such as the Prometheus Operator;
-> event delivery: third-party tools that forward and archive kubernetes events.

Since Kubernetes 1.8, resource usage metrics (such as container CPU and memory usage) are exposed in Kubernetes through the Metrics API, and metrics-server has replaced Heapster. Metrics Server implements the Resource Metrics API and acts as a cluster-wide aggregator of resource usage data. It collects metrics from the Summary API exposed by the Kubelet on each node.

Before looking at Metrics Server it helps to understand the Metrics API. Compared with the earlier collection approach (Heapster), the Metrics API is a new way of thinking: upstream wants core-metrics monitoring to be stable, versioned, and directly accessible to users (for example via kubectl top) or to controllers in the cluster (such as the HPA), just like any other Kubernetes API. Deprecating the Heapster project was precisely about treating core resource monitoring as a first-class citizen, accessed directly through the api-server or a client the way pods and services are, instead of installing Heapster to aggregate the data and manage it separately.

Assume 10 metrics are collected per pod and per node. Since 1.6 Kubernetes supports 5000 nodes with 30 pods per node, so with a scrape interval of once a minute that works out to 10 × 5000 × 30 / 60 = 25000, i.e. more than 20,000 samples per second on average. The api-server persists everything to etcd, so Kubernetes itself clearly cannot handle collection at this rate; the data also changes quickly and is ephemeral, so a separate component is needed to handle it and keep only part of it in memory. This is how the idea of metrics-server was born. Heapster did already expose an API, but users and other Kubernetes components could only reach it through the master proxy, and its interface lacked the full authentication/authorization and client integration that the api-server provides.

With the Metrics Server component in place the data is collected and an API is exposed, but since the API surface must stay unified, how do requests for /apis/metrics on the api-server get forwarded to Metrics Server?
The answer is kube-aggregator, which landed in Kubernetes 1.7; the reason Metrics Server took so long to appear was precisely this kube-aggregator step. kube-aggregator (API aggregation) mainly provides:
-> an API for registering API servers;
-> summarization of discovery information from all the servers;
-> proxying of client requests to the individual servers.

Using the Metrics API (see the query sketch after this list):
-> the Metrics API only returns current measurements; it keeps no history
-> the Metrics API URI is /apis/metrics.k8s.io/, maintained in k8s.io/metrics
-> metrics-server must be deployed before the API can be used; metrics-server fetches the data by calling the Kubelet Summary API
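
Once metrics-server is running (installed below), the Metrics API can be queried directly through the api-server, for example (jq was installed on the template machine earlier):

kubectl get --raw "/apis/metrics.k8s.io/v1beta1/nodes" | jq .
kubectl get --raw "/apis/metrics.k8s.io/v1beta1/namespaces/kube-system/pods" | jq '.items[].metadata.name'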

Metrics Server periodically scrapes the Kubelet Summary API (something like /api/v1/nodes/<node-name>/proxy/stats/summary), aggregates the data, keeps it in memory, and exposes it through the Metrics API. It reuses the api-server libraries for features such as authentication/authorization and versioning, and to keep the data in memory it drops the default etcd backend and implements an in-memory store (i.e. its own Storage interface). Because the data lives only in memory it is not persisted; it can be extended with third-party storage, the same as with Heapster.

Kubernetes Dashboard does not yet support metrics-server; with metrics-server replacing Heapster, the dashboard can no longer graph Pod memory and CPU usage, and a monitoring stack such as Prometheus plus Grafana is needed to fill that gap. The manifest yaml files of the built-in kubernetes add-ons use the gcr.io docker registry, which is blocked in mainland China, so the registry address has to be replaced manually (not done in this document); blocked images can be pulled through the free gcr.io proxy provided by Microsoft China. All of the deployment commands below are run on the k8s-master01 node.

Monitoring architecture

Install metrics-server

#Clone the source from GitHub:
[root@k8s-master01 tmp]# mkdir metrics
[root@k8s-master01 tmp]# cd metrics/
[root@k8s-master01 metrics]# git clone https://github.com/kubernetes-incubator/metrics-server.git
[root@k8s-master01 metrics]# cd metrics-server/deploy/1.8+/
[root@k8s-master01 1.8+]# ls
aggregated-metrics-reader.yaml  metrics-apiservice.yaml         resource-reader.yaml
auth-delegator.yaml             metrics-server-deployment.yaml
auth-reader.yaml                metrics-server-service.yaml
[root@k8s-master01 1.8+]# cp metrics-server-deployment.yaml  metrics-server-deployment.yaml.bak
[root@k8s-master01 1.8+]# vim metrics-server-deployment.yaml
[root@k8s-master01 1.8+]# diff  metrics-server-deployment.yaml  metrics-server-deployment.yaml.bak
32,38c32,33
<         image: gcr.azk8s.cn/google_containers/metrics-server-amd64:v0.3.3
<         imagePullPolicy: IfNotPresent
<         command:
<         - /metrics-server
<         - --metric-resolution=30s
<         - --kubelet-preferred-address-types=InternalIP,Hostname,InternalDNS,ExternalDNS,ExternalIP
<         - --kubelet-insecure-tls
---
>         image: k8s.gcr.io/metrics-server-amd64:v0.3.3
>         imagePullPolicy: Always

Note:

Pay particular attention to the following:

--metric-resolution=30s: the interval at which data is scraped from the kubelets;
--kubelet-preferred-address-types: prefer the InternalIP when connecting to the kubelet, which avoids failures when a node name has no DNS record (the default behaviour when the flag is unset);
change the image pull policy in metrics-server-deployment.yaml to "IfNotPresent";
change the image source.

Deploy metrics-server

[root@k8s-master01 1.8+]# kubectl create -f .
clusterrole.rbac.authorization.k8s.io/system:aggregated-metrics-reader created
clusterrolebinding.rbac.authorization.k8s.io/metrics-server:system:auth-delegator created
rolebinding.rbac.authorization.k8s.io/metrics-server-auth-reader created
apiservice.apiregistration.k8s.io/v1beta1.metrics.k8s.io created
serviceaccount/metrics-server created
deployment.extensions/metrics-server created
service/metrics-server created
clusterrole.rbac.authorization.k8s.io/system:metrics-server created
clusterrolebinding.rbac.authorization.k8s.io/system:metrics-server created

Check the running status

[root@k8s-master01 1.8+]#  kubectl -n kube-system get pods -l k8s-app=metrics-server
NAME                              READY   STATUS    RESTARTS   AGE
metrics-server-6c49c8b6cc-6flx6   1/1     Running   0          59s
 
[root@k8s-master01 1.8+]# kubectl get svc -n kube-system  metrics-server
NAME             TYPE        CLUSTER-IP    EXTERNAL-IP   PORT(S)   AGE
metrics-server   ClusterIP   10.0.94.102   <none>        443/TCP   79s

metrics-server command-line flags (run the following on any node)

[root@k8s-node01 ~]# docker run -it --rm gcr.azk8s.cn/google_containers/metrics-server-amd64:v0.3.3 --help
Launch metrics-server
 
Usage:
   [flags]
 
Flags:
      --alsologtostderr                                         log to standard error as well as files
      --authentication-kubeconfig string                        kubeconfig file pointing at the 'core' kubernetes server with enough rights to create tokenaccessreviews.authentication.k8s.io.
      --authentication-skip-lookup                              If false, the authentication-kubeconfig will be used to lookup missing authentication configuration from the cluster.
      --authentication-token-webhook-cache-ttl duration         The duration to cache responses from the webhook token authenticator. (default 10s)
      --authentication-tolerate-lookup-failure                  If true, failures to look up missing authentication configuration from the cluster are not considered fatal. Note that this can result in authentication that treats all requests as anonymous.
      --authorization-always-allow-paths strings                A list of HTTP paths to skip during authorization, i.e. these are authorized without contacting the 'core' kubernetes server.
      --authorization-kubeconfig string                         kubeconfig file pointing at the 'core' kubernetes server with enough rights to create subjectaccessreviews.authorization.k8s.io.
      --authorization-webhook-cache-authorized-ttl duration     The duration to cache 'authorized' responses from the webhook authorizer. (default 10s)
      --authorization-webhook-cache-unauthorized-ttl duration   The duration to cache 'unauthorized' responses from the webhook authorizer. (default 10s)
      --bind-address ip                                         The IP address on which to listen for the --secure-port port. The associated interface(s) must be reachable by the rest of the cluster, and by CLI/web clients. If blank, all interfaces will be used (0.0.0.0 for all IPv4 interfaces and :: for all IPv6 interfaces). (default 0.0.0.0)
      --cert-dir string                                         The directory where the TLS certs are located. If --tls-cert-file and --tls-private-key-file are provided, this flag will be ignored. (default "apiserver.local.config/certificates")
      --client-ca-file string                                   If set, any request presenting a client certificate signed by one of the authorities in the client-ca-file is authenticated with an identity corresponding to the CommonName of the client certificate.
      --contention-profiling                                    Enable lock contention profiling, if profiling is enabled
  -h, --help                                                    help for this command
      --http2-max-streams-per-connection int                    The limit that the server gives to clients for the maximum number of streams in an HTTP/2 connection. Zero means to use golang's default.
      --kubeconfig string                                       The path to the kubeconfig used to connect to the Kubernetes API server and the Kubelets (defaults to in-cluster config)
      --kubelet-certificate-authority string                    Path to the CA to use to validate the Kubelet's serving certificates.
      --kubelet-insecure-tls                                    Do not verify CA of serving certificates presented by Kubelets.  For testing purposes only.
      --kubelet-port int                                        The port to use to connect to Kubelets. (default 10250)
      --kubelet-preferred-address-types strings                 The priority of node address types to use when determining which address to use to connect to a particular node (default [Hostname,InternalDNS,InternalIP,ExternalDNS,ExternalIP])
      --log-flush-frequency duration                            Maximum number of seconds between log flushes (default 5s)
      --log_backtrace_at traceLocation                          when logging hits line file:N, emit a stack trace (default :0)
      --log_dir string                                          If non-empty, write log files in this directory
      --log_file string                                         If non-empty, use this log file
      --logtostderr                                             log to standard error instead of files (default true)
      --metric-resolution duration                              The resolution at which metrics-server will retain metrics. (default 1m0s)
      --profiling                                               Enable profiling via web interface host:port/debug/pprof/ (default true)
      --requestheader-allowed-names strings                     List of client certificate common names to allow to provide usernames in headers specified by --requestheader-username-headers. If empty, any client certificate validated by the authorities in --requestheader-client-ca-file is allowed.
      --requestheader-client-ca-file string                     Root certificate bundle to use to verify client certificates on incoming requests before trusting usernames in headers specified by --requestheader-username-headers. WARNING: generally do not depend on authorization being already done for incoming requests.
      --requestheader-extra-headers-prefix strings              List of request header prefixes to inspect. X-Remote-Extra- is suggested. (default [x-remote-extra-])
      --requestheader-group-headers strings                     List of request headers to inspect for groups. X-Remote-Group is suggested. (default [x-remote-group])
      --requestheader-username-headers strings                  List of request headers to inspect for usernames. X-Remote-User is common. (default [x-remote-user])
      --secure-port int                                         The port on which to serve HTTPS with authentication and authorization.If 0, don't serve HTTPS at all. (default 443)
      --skip_headers                                            If true, avoid header prefixes in the log messages
      --stderrthreshold severity                                logs at or above this threshold go to stderr
      --tls-cert-file string                                    File containing the default x509 Certificate for HTTPS. (CA cert, if any, concatenated after server cert). If HTTPS serving is enabled, and --tls-cert-file and --tls-private-key-file are not provided, a self-signed certificate and key are generated for the public address and saved to the directory specified by --cert-dir.
      --tls-cipher-suites strings                               Comma-separated list of cipher suites for the server. If omitted, the default Go cipher suites will be use.  Possible values: TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA,TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA256,TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA,TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_ECDSA_WITH_RC4_128_SHA,TLS_ECDHE_RSA_WITH_3DES_EDE_CBC_SHA,TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA,TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_RSA_WITH_RC4_128_SHA,TLS_RSA_WITH_3DES_EDE_CBC_SHA,TLS_RSA_WITH_AES_128_CBC_SHA,TLS_RSA_WITH_AES_128_CBC_SHA256,TLS_RSA_WITH_AES_128_GCM_SHA256,TLS_RSA_WITH_AES_256_CBC_SHA,TLS_RSA_WITH_AES_256_GCM_SHA384,TLS_RSA_WITH_RC4_128_SHA
      --tls-min-version string                                  Minimum TLS version supported. Possible values: VersionTLS10, VersionTLS11, VersionTLS12
      --tls-private-key-file string                             File containing the default x509 private key matching --tls-cert-file.
      --tls-sni-cert-key namedCertKey                           A pair of x509 certificate and private key file paths, optionally suffixed with a list of domain patterns which are fully qualified domain names, possibly with prefixed wildcard segments. If no domain patterns are provided, the names of the certificate are extracted. Non-wildcard matches trump over wildcard matches, explicit domain patterns trump over extracted names. For multiple key/certificate pairs, use the --tls-sni-cert-key multiple times. Examples: "example.crt,example.key" or "foo.crt,foo.key:*.foo.com,foo.com". (default [])
  -v, --v Level                                                 number for the log level verbosity
      --vmodule moduleSpec                                      comma-separated list of pattern=N settings for file-filtered logging

Verify that it works

[root@k8s-master01 1.8+]# kubectl top node
error: metrics not available yet
This means the metrics are not ready yet; wait a moment and try again
[root@k8s-master01 1.8+]# kubectl top nodes
NAME           CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%   
k8s-master01   292m         7%     1209Mi          64%       
k8s-master02   291m         7%     1066Mi          56%       
k8s-master03   336m         8%     1212Mi          64%       
k8s-node01     125m         3%     448Mi           23%       
k8s-node02     117m         2%     425Mi           22%       
k8s-node03     118m         2%     464Mi           24%    
[root@k8s-master01 1.8+]# kubectl top pods -n kube-system
NAME                                       CPU(cores)   MEMORY(bytes)   
calico-kube-controllers-7c4d64d599-w24xk   2m           14Mi            
calico-node-9hzdk                          48m          40Mi            
calico-node-w75b8                          43m          71Mi            
coredns-6967fb4995-ztrlt                   5m           14Mi            
etcd-k8s-master01                          81m          109Mi           
etcd-k8s-master03                          54m          90Mi            
kube-apiserver-k8s-master01                53m          368Mi           
kube-apiserver-k8s-master03                42m          331Mi           
kube-controller-manager-k8s-master01       32m          65Mi            
kube-controller-manager-k8s-master03       1m           16Mi            
kube-proxy-dtpvd                           1m           32Mi            
kube-proxy-lscgw                           2m           20Mi            
kube-scheduler-k8s-master01                2m           28Mi            
kube-scheduler-k8s-master03                3m           18Mi   
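
With the Resource Metrics API now working, an HPA can consume it. A minimal sketch against the my-nginx Deployment created earlier (note that the Deployment sets no CPU requests, so the HPA would report <unknown> targets until resources.requests.cpu is added to the pod spec):

kubectl autoscale deployment my-nginx --cpu-percent=50 --min=3 --max=6
kubectl get hpa my-nginx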

Access from a browser

[root@k8s-master01 1.8+]# kubectl cluster-info
Kubernetes master is running at https://20.0.0.250:8443
KubeDNS is running at https://20.0.0.250:8443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy
kubernetes-dashboard is running at https://20.0.0.250:8443/api/v1/namespaces/kube-system/services/https:kubernetes-dashboard:/proxy
Metrics-server is running at https://20.0.0.250:8443/api/v1/namespaces/kube-system/services/https:metrics-server:/proxy

 
To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.

1.5.5 kube-state-metrics plugin

metrics-server, deployed above, can collect most of the metrics about running containers, but it cannot answer questions like these:
-> how many replicas were scheduled, and how many are currently available?
-> how many Pods are in the running/stopped/terminated state?
-> how many times has a Pod restarted?
-> how many jobs are currently running?

This is what kube-state-metrics provides. It is an add-on service for Kubernetes, built on client-go, that polls the Kubernetes API and turns the structured state of Kubernetes objects into metrics. kube-state-metrics can collect data about most built-in k8s resources, such as pods, deployments and services, and it also exposes metrics about itself, mainly counts of collected resources and of collection errors.

kube-state-metrics metric categories include:

CronJob Metrics
DaemonSet Metrics
Deployment Metrics
Job Metrics
LimitRange Metrics
Node Metrics
PersistentVolume Metrics
PersistentVolumeClaim Metrics
Pod Metrics
Pod Disruption Budget Metrics
ReplicaSet Metrics
ReplicationController Metrics
ResourceQuota Metrics
Service Metrics
StatefulSet Metrics
Namespace Metrics
Horizontal Pod Autoscaler Metrics
Endpoint Metrics
Secret Metrics
ConfigMap Metrics

Taking pods as an example, the metrics include:

kube_pod_info
kube_pod_owner
kube_pod_status_running
kube_pod_status_ready
kube_pod_status_scheduled
kube_pod_container_status_waiting
kube_pod_container_status_terminated_reason
..............

kube-state-metrics compared with metrics-server (or Heapster):
1) metrics-server obtains usage metrics such as CPU and memory from the api-server and ships them to a storage backend such as InfluxDB or a cloud provider; its core job today is to supply the decision metrics for components like the HPA.
2) kube-state-metrics focuses on the latest state of the various k8s resources, such as deployments or daemonsets. The reason kube-state-metrics was not folded into metrics-server is that their concerns are fundamentally different: metrics-server only fetches and formats existing data and writes it to a given store, essentially a monitoring system, whereas kube-state-metrics takes an in-memory snapshot of the cluster's running state and derives new metrics, but has no ability to export those metrics anywhere itself.
3) Put another way, kube-state-metrics could itself serve as a data source for metrics-server, although that is not done today.
4) Also, a monitoring system like Prometheus does not consume data from metrics-server; it does its own metric collection and integration (Prometheus covers metrics-server's capabilities). Prometheus can, however, monitor the health of the metrics-server component itself and alert on it in time, and that monitoring can be done through kube-state-metrics, for example the running state of the metrics-server pod.

kube-state-metrics essentially polls the api-server continuously, so performance matters.
Earlier versions of kube-state-metrics exposed two problems:
1) the /metrics endpoint responded slowly (10-20s)
2) memory consumption was too high, so the pod exceeded its limit and was killed
The fix for the first problem is a local cache built on client-go's cache tooling, structured roughly as var cache = map[uuid][]byte{}.
The fix for the second is that time-series strings contain many repeated substrings (prefixes such as the namespace), which can be deduplicated with pointers or by structuring the repeated parts.

kube-state-metrics optimizations and caveats:
1) kube-state-metrics watches add, delete and update events on resources, so does it miss data for resources that were already running before it was deployed? No: using client-go it initializes all existing resource objects, ensuring nothing is left out;
2) kube-state-metrics currently does not emit metadata (such as help and description);
3) the cache is implemented with a golang map; concurrent reads are currently handled with a simple mutex, which should be sufficient, and golang's thread-safe sync.Map may be adopted later;
4) kube-state-metrics compares resource versions to guarantee event ordering;
5) kube-state-metrics does not guarantee that all resources are covered;

Manifest files

https://github.com/kubernetes/kube-state-metrics

[root@k8s-master01 kubernetes]# ll
总用量 20
-rw-r--r-- 1 root root  362 7月  27 08:59 kube-state-metrics-cluster-role-binding.yaml
-rw-r--r-- 1 root root 1269 7月  27 08:59 kube-state-metrics-cluster-role.yaml
-rw-r--r-- 1 root root  800 7月  27 20:12 kube-state-metrics-deployment.yaml
-rw-r--r-- 1 root root   98 7月  27 08:59 kube-state-metrics-service-account.yaml
-rw-r--r-- 1 root root  421 7月  27 20:13 kube-state-metrics-service.yaml

Modify the configuration

[root@k8s-master01 kubernetes]# fgrep -R "image" ./*
./kube-state-metrics-deployment.yaml:        image: quay.io/coreos/kube-state-metrics:v1.7.1

 
[root@k8s-master01 kubernetes]# cat kube-state-metrics-deployment.yaml
......
        image: gcr.azk8s.cn/google_containers/kube-state-metrics:v1.7.1
        imagePullPolicy: IfNotPresent


[root@k8s-master01 kubernetes]# cat kube-state-metrics-service.yaml
......
  type: NodePort
  selector:
    k8s-app: kube-state-metrics
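
For reference, the edited Service ends up looking roughly like the sketch below (port names and numbers follow the upstream kube-state-metrics manifest; only type: NodePort was added):

apiVersion: v1
kind: Service
metadata:
  name: kube-state-metrics
  namespace: kube-system
  labels:
    k8s-app: kube-state-metrics
spec:
  type: NodePort
  ports:
  - name: http-metrics
    port: 8080
    targetPort: http-metrics
  - name: telemetry
    port: 8081
    targetPort: telemetry
  selector:
    k8s-app: kube-state-metrics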

Apply and verify

 

#Apply the manifests
[root@k8s-master01 kubernetes]# kubectl create -f .
clusterrolebinding.rbac.authorization.k8s.io/kube-state-metrics created
clusterrole.rbac.authorization.k8s.io/kube-state-metrics created
deployment.apps/kube-state-metrics created
serviceaccount/kube-state-metrics created
service/kube-state-metrics created
#Check
[root@k8s-master01 kubernetes]# kubectl get pod -n kube-system|grep kube-state-metrics
kube-state-metrics-7d5dfb9596-lds7l        1/1     Running   0          45s
[root@k8s-master01 kubernetes]# kubectl get svc -n kube-system|grep kube-state-metrics
kube-state-metrics     NodePort    10.0.112.70   <none>        8080:32672/TCP,8081:31505/TCP   75s
[root@k8s-master01 kubernetes]# kubectl get pod,svc -n kube-system|grep kube-state-metrics
 
pod/kube-state-metrics-7d5dfb9596-lds7l        1/1     Running   0          82s
service/kube-state-metrics     NodePort    10.0.112.70   <none>        8080:32672/TCP,8081:31505/TCP   83s
[root@k8s-master01 kubernetes]# curl http://20.0.0.201:32672/metrics|head -10
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0# HELP kube_certificatesigningrequest_labels Kubernetes labels converted to Prometheus labels.
# TYPE kube_certificatesigningrequest_labels gauge
# HELP kube_certificatesigningrequest_created Unix creation timestamp
# TYPE kube_certificatesigningrequest_created gauge
# HELP kube_certificatesigningrequest_condition The number of each certificatesigningrequest condition
# TYPE kube_certificatesigningrequest_condition gauge
# HELP kube_certificatesigningrequest_cert_length Length of the issued cert
# TYPE kube_certificatesigningrequest_cert_length gauge
# HELP kube_certificatesigningrequest_annotations Kubernetes annotations converted to Prometheus labels.
# TYPE kube_certificatesigningrequest_annotations gauge
100 13516    0 13516    0     0   793k      0 --:--:-- --:--:-- --:--:--  879k
curl: (23) Failed writing body (1535 != 2048)
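
The "curl: (23) Failed writing body" message above is only curl complaining that head closed the pipe early and can be ignored. To pull out one specific metric family instead, grep works; the NodePort 32672 and node IP 20.0.0.201 are the ones shown above, and kube_deployment_status_replicas_available is one of the Deployment metrics kube-state-metrics exposes:

curl -s http://20.0.0.201:32672/metrics | grep '^kube_deployment_status_replicas_available'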

View in a browser

At this point the kubernetes cluster deployment is complete.

Recommendation: init 6 --> verify --> snapshot

1.5.6 Reflections

I thought about this cluster-deployment part over and over; last time I wrote 114 pages and then threw them all away. This is a beginning, and it shapes the learning that follows. I hope I can write it even better!


Reprinted from www.cnblogs.com/zisefeizhu/p/11258354.html