
Deploying a Highly Available k8s Cluster with kubeasz

kubeasz aims to provide a tool for quickly deploying highly available k8s clusters.

Deployment topology

Type       IP              Hostname   Notes
master01   192.168.3.100   master1    master1, etcd1
master02   192.168.3.101   master2    master2, etcd2
master03   192.168.3.104   master3    master3, etcd3
node01     192.168.3.102   node01     worker node
node02     192.168.3.103   node02     worker node
kubeasz    192.168.3.26    kubeasz    deploy machine

Software list

Operating system: Ubuntu Server 20.04
k8s version 1.30.1
calico 3.26.4
etcd 3.5.12
kubeasz 3.6.4

Deploying highly available Kubernetes with kubeasz

  • Basic configuration of the deploy node

The deploy node here is 192.168.3.26. It serves the following purposes:

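1. Download installation resources from the internet
2. Optionally retag some images and push them to the company's internal image registry
3. Initialize the master nodes
4. Initialize the worker nodes
5. Ongoing cluster maintenance, including adding and removing master nodes, adding and removing worker nodes, and backing up and restoring etcd data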

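Install ansible and set up passwordless SSH login:

Install ansible with apt, then copy the deploy node's public key to the master, node, and etcd nodes.
Note: run these commands on the deploy node.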

root@kuspary:~# apt update && apt install ansible -y
# Generate a key pair; press Enter at every prompt
root@kuspary:~# ssh-keygen

root@kuspary:~# apt install sshpass -y # skip if sshpass is already installed
#!/bin/bash
# Target host list
IP="
192.168.3.100
192.168.3.101
192.168.3.102
192.168.3.103
192.168.3.104
"
REMOTE_PORT="22"
REMOTE_USER="root"
REMOTE_PASS="root"
for REMOTE_HOST in ${IP};do
  # Record the remote host key locally to avoid the interactive prompt
  ssh-keyscan -p "${REMOTE_PORT}" "${REMOTE_HOST}" >> ~/.ssh/known_hosts
  if sshpass -p "${REMOTE_PASS}" ssh-copy-id -p "${REMOTE_PORT}" "${REMOTE_USER}@${REMOTE_HOST}";then
    echo "${REMOTE_HOST}: passwordless login configured!"
    # The hosts file used later sets ansible_python_interpreter=/usr/bin/python
    ssh -p "${REMOTE_PORT}" "${REMOTE_USER}@${REMOTE_HOST}" ln -sv /usr/bin/python3 /usr/bin/python
  else
    echo "${REMOTE_HOST}: passwordless login configuration failed!"
  fi
done


# Verify passwordless login
ssh 192.168.3.100
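
Logging in to a single host by hand only proves that one key was copied. As a rough sketch, the same check can be run against every host in the list; `check_passwordless` is a hypothetical helper name, and it relies on ssh's `BatchMode=yes` option, which makes ssh fail instead of prompting for a password.

```shell
#!/bin/bash
# Report which hosts accept key-based login without prompting.
# BatchMode=yes makes ssh fail instead of asking for a password.
check_passwordless() {
  local user="$1"; shift
  local host failed=0
  for host in "$@"; do
    if ssh -o BatchMode=yes -o ConnectTimeout=5 "${user}@${host}" true; then
      echo "${host}: ok"
    else
      echo "${host}: FAILED"
      failed=1
    fi
  done
  # Non-zero when at least one host refused key-based login
  return "${failed}"
}

# Example, with the hosts from the list above:
# check_passwordless root 192.168.3.100 192.168.3.101 192.168.3.102 192.168.3.103 192.168.3.104
```

Any host reported as FAILED needs another `ssh-copy-id` run before the deployment is started.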
  • Download the kubeasz project and components
root@kubeasz:/data/kubeasz# wget https://github.com/easzlab/kubeasz/releases/download/3.6.4/ezdown

root@kubeasz:/data/kubeasz# chmod +x ezdown
# Download the kubeasz code, binaries, and default container images; this also runs a registry image repository and pushes the downloaded images to it
root@kubeasz:/data/kubeasz# ./ezdown -D

root@kubeasz:/data/kubeasz# ll /etc/kubeasz/  # path holding all kubeasz files and configuration
total 140
drwxrwxr-x  12 root root  4096 Jun 28 07:35 ./
drwxr-xr-x 108 root root  4096 Jun 28 07:40 ../
-rw-rw-r--   1 root root 20304 May 22 15:09 ansible.cfg
drwxr-xr-x   5 root root  4096 Jun 28 07:35 bin/
drwxrwxr-x   8 root root  4096 Jun 23 15:08 docs/
drwxr-xr-x   3 root root  4096 Jun 28 07:45 down/
drwxrwxr-x   2 root root  4096 Jun 23 15:08 example/
-rwxrwxr-x   1 root root 26507 May 22 15:09 ezctl*
-rwxrwxr-x   1 root root 32390 May 22 15:09 ezdown*
drwxrwxr-x   4 root root  4096 Jun 23 15:08 .github/
-rw-rw-r--   1 root root   301 May 22 15:09 .gitignore
drwxrwxr-x  10 root root  4096 Jun 23 15:08 manifests/
drwxrwxr-x   2 root root  4096 Jun 23 15:08 pics/
drwxrwxr-x   2 root root  4096 Jun 23 15:08 playbooks/
-rw-rw-r--   1 root root  6349 May 22 15:09 README.md
drwxrwxr-x  22 root root  4096 Jun 23 15:08 roles/
drwxrwxr-x   2 root root  4096 Jun 23 15:08 tools/
  • Deploy the cluster
root@kubeasz:/data/kubeasz# cd /etc/kubeasz/
root@kubeasz:/etc/kubeasz# ./ezctl new k8s-cluster01    # create a new managed cluster
2024-06-28 07:58:21 DEBUG generate custom cluster files in /etc/kubeasz/clusters/k8s-cluster01 # path of this cluster's configuration
2024-06-28 07:58:21 DEBUG set versions
2024-06-28 07:58:21 DEBUG cluster k8s-cluster01: files successfully created.
2024-06-28 07:58:21 INFO next steps 1: to config '/etc/kubeasz/clusters/k8s-cluster01/hosts'  # ansible hosts file
2024-06-28 07:58:21 INFO next steps 2: to config '/etc/kubeasz/clusters/k8s-cluster01/config.yml' # ansible yaml file

Configure the ansible hosts file used to manage the cluster

root@kubeasz:/etc/kubeasz/clusters/k8s-cluster01# pwd
/etc/kubeasz/clusters/k8s-cluster01
root@kubeasz:/etc/kubeasz/clusters/k8s-cluster01# ll # note: these two files are critical; any small mistake in them will break the cluster
total 20
drwxr-xr-x 2 root root 4096 Jun 28 07:58 ./
drwxr-xr-x 3 root root 4096 Jun 28 07:58 ../
-rw-r--r-- 1 root root 7615 Jun 28 07:58 config.yml
-rw-r--r-- 1 root root 2381 Jun 28 07:58 hosts
root@kubeasz:/etc/kubeasz/clusters/k8s-cluster01#
# Edit the hosts file
vim /etc/kubeasz/clusters/k8s-cluster01/hosts

# 'etcd' cluster should have odd member(s) (1,3,5,...)
[etcd]
192.168.3.100
192.168.3.101
192.168.3.104

# master node(s), set unique 'k8s_nodename' for each node
# CAUTION: 'k8s_nodename' must consist of lower case alphanumeric characters, '-' or '.',
# and must start and end with an alphanumeric character
[kube_master]
192.168.3.100 k8s_nodename='master01'
192.168.3.101 k8s_nodename='master02'
192.168.3.104 k8s_nodename='master03'

# work node(s), set unique 'k8s_nodename' for each node
# CAUTION: 'k8s_nodename' must consist of lower case alphanumeric characters, '-' or '.',
# and must start and end with an alphanumeric character
[kube_node]
192.168.3.102 k8s_nodename='node01'
192.168.3.103 k8s_nodename='node02'

# [optional] harbor server, a private docker registry
# 'NEW_INSTALL': 'true' to install a harbor server; 'false' to integrate with existed one
[harbor]
#192.168.1.8 NEW_INSTALL=false

# [optional] loadbalance for accessing k8s from outside
[ex_lb]
#192.168.1.6 LB_ROLE=backup EX_APISERVER_VIP=192.168.1.250 EX_APISERVER_PORT=8443
#192.168.1.7 LB_ROLE=master EX_APISERVER_VIP=192.168.1.250 EX_APISERVER_PORT=8443

# [optional] ntp server for the cluster
[chrony]
#192.168.1.1


[all:vars]
# --------- Main Variables ---------------
# Secure port for apiservers
SECURE_PORT="6443"

# Cluster container-runtime supported: docker, containerd
# if k8s version >= 1.24, docker is not supported
CONTAINER_RUNTIME="containerd"

# Network plugins supported: calico, flannel, kube-router, cilium, kube-ovn
CLUSTER_NETWORK="calico"

# Service proxy mode of kube-proxy: 'iptables' or 'ipvs'
PROXY_MODE="ipvs"

# K8S Service CIDR, not overlap with node(host) networking
SERVICE_CIDR="10.68.0.0/16"

# Cluster CIDR (Pod CIDR), not overlap with node(host) networking
CLUSTER_CIDR="172.20.0.0/16"

# NodePort Range
NODE_PORT_RANGE="30000-32767"

# Cluster DNS Domain
CLUSTER_DNS_DOMAIN="cluster.local"

# -------- Additional Variables (don't change the default value right now) ---
# Binaries Directory
bin_dir="/opt/kube/bin"


# Deploy Directory (kubeasz workspace)
base_dir="/etc/kubeasz"

# Directory for a specific cluster
cluster_dir="{{ base_dir }}/clusters/k8s-cluster01"

# CA and other components cert/key Directory
ca_dir="/etc/kubernetes/ssl"

# Default 'k8s_nodename' is empty
k8s_nodename=''

# Default python interpreter
ansible_python_interpreter=/usr/bin/python
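
The hosts file itself warns that the etcd cluster should have an odd number of members. A minimal sketch of checking that before deploying is shown here; `group_hosts` and `etcd_count_is_odd` are hypothetical helper names, and the parsing assumes the simple one-host-per-line layout used above.

```shell
# Print the host entries of one inventory group, skipping comments and blanks.
group_hosts() {
  local inventory="$1" group="$2"
  awk -v g="[${group}]" '
    /^\[/ { in_group = ($1 == g); next }        # section header: enter or leave the group
    in_group && NF && $1 !~ /^#/ { print $1 }   # first field is the host address
  ' "${inventory}"
}

# Succeed only if the [etcd] group has an odd number of members.
etcd_count_is_odd() {
  local n
  n="$(group_hosts "$1" etcd | wc -l)"
  [ $(( n % 2 )) -eq 1 ]
}

# Example:
# group_hosts /etc/kubeasz/clusters/k8s-cluster01/hosts kube_master
# etcd_count_is_odd /etc/kubeasz/clusters/k8s-cluster01/hosts && echo "etcd member count is odd"
```

With the inventory above, the etcd group lists three hosts, so the check passes.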

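config.yml holds the detailed configuration of the K8S cluster

vim /etc/kubeasz/clusters/k8s-cluster01/config.yml # change it to the configuration below; pay attention to the master IPs, the DNS-related settings, the images settings, and so on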

############################
# prepare
############################
# optional: install system packages offline (offline|online)
INSTALL_SOURCE: "online"

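# optional: apply OS security hardening, see github.com/dev-sec/ansible-collection-hardening
# (deprecated) the upstream project has not been updated and is unverified against recent k8s installs; enabling it is not recommended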
OS_HARDEN: false


############################
# role:deploy
############################
# default: ca will expire in 100 years
# default: certs issued by the ca will expire in 50 years
CA_EXPIRY: "876000h"
CERT_EXPIRY: "876000h"

# force to recreate CA and other certs, not suggested to set 'true'
CHANGE_CA: false

# kubeconfig parameters
CLUSTER_NAME: "cluster1"
CONTEXT_NAME: "context-{{ CLUSTER_NAME }}"

# k8s version
K8S_VER: "1.30.1"

# set unique 'k8s_nodename' for each node, if not set(default:'') ip add will be used
# CAUTION: 'k8s_nodename' must consist of lower case alphanumeric characters, '-' or '.',
# and must start and end with an alphanumeric character (e.g. 'example.com'),
# regex used for validation is '[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*'
K8S_NODENAME: "{%- if k8s_nodename != '' -%} \
                    {{ k8s_nodename|replace('_', '-')|lower }} \
               {%- else -%} \
                    k8s-{{ inventory_hostname|replace('.', '-') }} \
               {%- endif -%}"

# use 'K8S_NODENAME' to set hostname
ENABLE_SETTING_HOSTNAME: true


############################
# role:etcd
############################
# using a separate wal directory avoids disk I/O contention and improves performance
ETCD_DATA_DIR: "/var/lib/etcd"
ETCD_WAL_DIR: ""


############################
# role:runtime [containerd,docker]
############################
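# [.] enable registry mirrors to speed up image pulls
ENABLE_MIRROR_REGISTRY: true

# [.] add trusted private registries
# must follow the example format below; the 'http://' or 'https://' scheme cannot be omitted
INSECURE_REG:
  - "http://easzlab.io.local:5000"

# [.] base container image
SANDBOX_IMAGE: "easzlab.io.local:5000/easzlab/pause:3.9"

# [containerd] persistent storage directory for containers
CONTAINERD_STORAGE_DIR: "/var/lib/containerd"

# [docker] container storage directory
DOCKER_STORAGE_DIR: "/var/lib/docker"

# [docker] enable the RESTful API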
DOCKER_ENABLE_REMOTE_API: false


############################
# role:kube-master
############################
# certificate hosts for the k8s master nodes; multiple IPs and domain names can be added (e.g. a public IP and domain name)
MASTER_CERT_HOSTS:
  - "192.168.3.100"
  - "192.168.3.101"
  - "192.168.3.104"
  - "master1"
  - "master2"
  - "master3"
  #- "www.test.com"

# pod subnet mask length on each node (determines the maximum number of pod IPs a node can allocate)
# if flannel runs with --kube-subnet-mgr, it reads this setting to assign a pod subnet to each node
# https://github.com/coreos/flannel/issues/847
NODE_CIDR_LEN: 24


############################
# role:kube-node
############################
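# kubelet root directory
KUBELET_ROOT_DIR: "/var/lib/kubelet"

# maximum number of pods per node
MAX_PODS: 110

# resources reserved for kube components (kubelet, kube-proxy, dockerd, etc.)
# see templates/kubelet-config.yaml.j2 for the values
KUBE_RESERVED_ENABLED: "no"

# upstream k8s advises against enabling system-reserved casually, unless long-term monitoring has shown you the system's resource usage;
# the reservation should also grow as system uptime increases; see templates/kubelet-config.yaml.j2 for the values
# the system reservation is sized for a 4c/8g VM with a minimal install of system services; increase it on high-performance physical machines
# also, apiserver and other components briefly use more resources during cluster installation, so reserve at least 1g of memory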
SYS_RESERVED_ENABLED: "no"


############################
# role:network [flannel,calico,cilium,kube-ovn,kube-router]
############################
# ------------------------------------------- flannel
# [flannel] flannel backend: "host-gw", "vxlan", etc.
FLANNEL_BACKEND: "vxlan"
DIRECT_ROUTING: false

# [flannel]
flannel_ver: "v0.22.2"

# ------------------------------------------- calico
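# [calico] IPIP tunnel mode options: [Always, CrossSubnet, Never]. Across subnets use Always or CrossSubnet (on public clouds Always is the simplest; otherwise the cloud's network configuration has to be changed, see each provider's documentation).
# CrossSubnet is a hybrid tunnel + BGP routing mode that can improve network performance; within a single subnet, Never is sufficient.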
CALICO_IPV4POOL_IPIP: "Always"

# [calico] host IP used by calico-node; BGP peers connect through this address. It can be set manually or auto-detected
IP_AUTODETECTION_METHOD: "can-reach={{ groups['kube_master'][0] }}"

# [calico] calico networking backend: bird, vxlan, none
CALICO_NETWORKING_BACKEND: "bird"

# [calico] whether calico uses route reflectors
# recommended when the cluster exceeds 50 nodes
CALICO_RR_ENABLED: false

# CALICO_RR_NODES lists the route reflector nodes; if unset, the cluster master nodes are used by default
# CALICO_RR_NODES: ["192.168.1.1", "192.168.1.2"]
CALICO_RR_NODES: []

# [calico] updated supported calico versions: ["3.19", "3.23"]
calico_ver: "v3.26.4"

# [calico] calico major version
calico_ver_main: "{{ calico_ver.split('.')[0] }}.{{ calico_ver.split('.')[1] }}"

# ------------------------------------------- cilium
# [cilium] image version
cilium_ver: "1.15.5"
cilium_connectivity_check: true
cilium_hubble_enabled: false
cilium_hubble_ui_enabled: false

# ------------------------------------------- kube-ovn
# [kube-ovn] offline image tarball
kube_ovn_ver: "v1.11.5"

# ------------------------------------------- kube-router
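# [kube-router] public clouds impose restrictions and generally need ipinip always on; in your own environment this can be set to "subnet"
OVERLAY_TYPE: "full"

# [kube-router] NetworkPolicy support switch
FIREWALL_ENABLE: true

# [kube-router] kube-router image version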
kube_router_ver: "v1.5.4"


############################
# role:cluster-addon
############################
# install coredns automatically
dns_install: "yes"
corednsVer: "1.11.1"
ENABLE_LOCAL_DNS_CACHE: true
dnsNodeCacheVer: "1.22.28"
# local dns cache address
LOCAL_DNS_CACHE: "169.254.20.10"

# install metrics server automatically
metricsserver_install: "yes"
metricsVer: "v0.7.1"

# install dashboard automatically
dashboard_install: "yes"
dashboardVer: "v2.7.0"
dashboardMetricsScraperVer: "v1.0.8"

# install prometheus automatically
prom_install: "no"
prom_namespace: "monitor"
prom_chart_ver: "45.23.0"

# install kubeapps automatically; if enabled, local-storage is installed too by default (provides storageClass: "local-path")
kubeapps_install: "no"
kubeapps_install_namespace: "kubeapps"
kubeapps_working_namespace: "default"
kubeapps_storage_class: "local-path"
kubeapps_chart_ver: "12.4.3"

# install local-storage (local-path-provisioner) automatically
local_path_provisioner_install: "no"
local_path_provisioner_ver: "v0.0.26"
# default local storage path
local_path_provisioner_dir: "/opt/local-path-provisioner"

# install nfs-provisioner automatically
nfs_provisioner_install: "no"
nfs_provisioner_namespace: "kube-system"
nfs_provisioner_ver: "v4.0.2"
nfs_storage_class: "managed-nfs-storage"
nfs_server: "192.168.1.10"
nfs_path: "/data/nfs"

# install network-check automatically
network_check_enabled: false
network_check_schedule: "*/5 * * * *"

############################
# role:harbor
############################
# harbor version, full version string
HARBOR_VER: "v2.10.2"
HARBOR_DOMAIN: "harbor.easzlab.io.local"
HARBOR_PATH: /var/data
HARBOR_TLS_PORT: 8443
HARBOR_REGISTRY: "{{ HARBOR_DOMAIN }}:{{ HARBOR_TLS_PORT }}"

# if set 'false', you need to put certs named harbor.pem and harbor-key.pem in directory 'down'
HARBOR_SELF_SIGNED_CERT: true

# install extra component
HARBOR_WITH_TRIVY: false
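
The pod addressing values above (CLUSTER_CIDR="172.20.0.0/16" in the hosts file, NODE_CIDR_LEN: 24 and MAX_PODS: 110 in config.yml) are related by simple arithmetic. The sketch below only illustrates that arithmetic; the function names are hypothetical, and whether a given network plugin actually allocates one subnet of NODE_CIDR_LEN per node depends on the plugin (the config comment mentions flannel with --kube-subnet-mgr).

```shell
# Number of per-node subnets that fit in the cluster pod CIDR.
node_subnets() {    # args: cluster prefix length, node prefix length
  echo $(( 1 << ($2 - $1) ))
}

# Number of addresses in one per-node subnet.
addrs_per_node() {  # arg: node prefix length
  echo $(( 1 << (32 - $1) ))
}

# Example with the values from this article:
# node_subnets 16 24     # subnets available for nodes
# addrs_per_node 24      # addresses in each node subnet
```

A /16 split into /24 blocks yields 256 node subnets of 256 addresses each, so a limit of 110 pods per node fits comfortably inside one block.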
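  • Deploy
root@kubeasz:/etc/kubeasz# ./ezctl --help # show the individual installation steps; run them in order to deploy each k8s component. See the official site for what each step does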
Usage: ezctl COMMAND [args]
-------------------------------------------------------------------------------------
Cluster setups:
    list                   to list all of the managed clusters
    checkout    <cluster>            to switch default kubeconfig of the cluster
    new         <cluster>            to start a new k8s deploy with name 'cluster'
    setup       <cluster>  <step>    to setup a cluster, also supporting a step-by-step way
    start       <cluster>            to start all of the k8s services stopped by 'ezctl stop'
    stop        <cluster>            to stop all of the k8s services temporarily
    upgrade     <cluster>            to upgrade the k8s cluster
    destroy     <cluster>            to destroy the k8s cluster
    backup      <cluster>            to backup the cluster state (etcd snapshot)
    restore     <cluster>            to restore the cluster state from backups
    start-aio              to quickly setup an all-in-one cluster with default settings

Cluster ops:
    add-etcd    <cluster>  <ip>      to add a etcd-node to the etcd cluster
    add-master  <cluster>  <ip>      to add a master node to the k8s cluster
    add-node    <cluster>  <ip>      to add a work node to the k8s cluster
    del-etcd    <cluster>  <ip>      to delete a etcd-node from the etcd cluster
    del-master  <cluster>  <ip>      to delete a master node from the k8s cluster
    del-node    <cluster>  <ip>      to delete a work node from the k8s cluster

Extra operation:
    kca-renew   <cluster>            to force renew CA certs and all the other certs (with caution)
    kcfg-adm    <cluster>  <args>    to manage client kubeconfig of the k8s cluster

Use "ezctl help <command>" for more information about a given command.
./ezctl setup k8s-cluster01 all   # run installation steps 1 through 7

Wait for the installation to finish.
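
The help text notes that `setup` also supports a step-by-step way, which is handy when one stage fails and needs a retry. A hedged sketch follows, assuming the step arguments are the two-digit numbers 01 through 07; `run_setup_steps` and the `EZCTL` override variable are hypothetical names introduced for this example.

```shell
# Run the setup steps one at a time and stop at the first failure.
# EZCTL may be overridden; the default is the ezctl path used in this article.
run_setup_steps() {
  local cluster="$1" step
  for step in 01 02 03 04 05 06 07; do
    echo "==> ezctl setup ${cluster} ${step}"
    "${EZCTL:-/etc/kubeasz/ezctl}" setup "${cluster}" "${step}" || {
      echo "step ${step} failed" >&2
      return 1
    }
  done
}

# Example:
# run_setup_steps k8s-cluster01
```

Stopping at the first failed step keeps the error output close to its cause, instead of burying it under the later stages.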

  • Get the cluster node information
# Run on the deploy node
kubectl get nodes
NAME       STATUS                     ROLES    AGE   VERSION
master01   Ready,SchedulingDisabled   master   20m   v1.30.1
master02   Ready,SchedulingDisabled   master   20m   v1.30.1
master03   Ready,SchedulingDisabled   master   20m   v1.30.1
node01     Ready                      node     20m   v1.30.1
node02     Ready                      node     20m   v1.30.1
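
Reading the STATUS column by eye works for five nodes; for scripted checks a small filter is easier. This is a minimal sketch that reads `kubectl get nodes --no-headers` output on stdin; `not_ready_nodes` is a hypothetical helper name.

```shell
# Read `kubectl get nodes --no-headers` output on stdin and print the names
# of nodes whose STATUS column does not start with "Ready".
# "Ready,SchedulingDisabled" (the masters above) still counts as ready.
not_ready_nodes() {
  awk '$2 !~ /^Ready/ { print $1 }'
}

# Example:
# kubectl get nodes --no-headers | not_ready_nodes
```

An empty result means every node reported Ready.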

Copy the files under ~/.kube on the deploy node to the ~/.kube directory on master1, master2, and master3, then verify.

scp -r /root/.kube/* 192.168.3.100:/root/.kube
scp -r /root/.kube/* 192.168.3.101:/root/.kube
scp -r /root/.kube/* 192.168.3.104:/root/.kube
  • master1
root@master01:~# kubectl get nodes
NAME       STATUS                     ROLES    AGE     VERSION
master01   Ready,SchedulingDisabled   master   9m37s   v1.30.1
master02   Ready,SchedulingDisabled   master   9m37s   v1.30.1
master03   Ready,SchedulingDisabled   master   9m37s   v1.30.1
node01     Ready                      node     8m56s   v1.30.1
node02     Ready                      node     8m56s   v1.30.1
  • master2
root@master02:~# kubectl get nodes
NAME       STATUS                     ROLES    AGE     VERSION
master01   Ready,SchedulingDisabled   master   10m     v1.30.1
master02   Ready,SchedulingDisabled   master   10m     v1.30.1
master03   Ready,SchedulingDisabled   master   10m     v1.30.1
node01     Ready                      node     9m45s   v1.30.1
node02     Ready                      node     9m45s   v1.30.1
  • master3
root@master03:~# kubectl get nodes
NAME       STATUS                     ROLES    AGE   VERSION
master01   Ready,SchedulingDisabled   master   11m   v1.30.1
master02   Ready,SchedulingDisabled   master   11m   v1.30.1
master03   Ready,SchedulingDisabled   master   11m   v1.30.1
node01     Ready                      node     10m   v1.30.1
node02     Ready                      node     10m   v1.30.1
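  • Modify the config file

In the config file copied from the deploy node, the server field points at just one of the three master nodes, so kubectl stops working if that master goes down. For high availability, change its value to

https://127.0.0.1:6443

A cluster deployed by kubeasz runs a kube-lb on every node. It listens on 127.0.0.1:6443 and forwards to all three apiservers; see /etc/kube-lb/conf/kube-lb.conf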

cat /etc/kube-lb/conf/kube-lb.conf
user root;
worker_processes 1;

error_log  /etc/kube-lb/logs/error.log warn;

events {
    worker_connections  3000;
}

stream {
    upstream backend {
        server 192.168.3.100:6443    max_fails=2 fail_timeout=3s;
        server 192.168.3.101:6443    max_fails=2 fail_timeout=3s;
        server 192.168.3.104:6443    max_fails=2 fail_timeout=3s;
    }

    server {
        listen 127.0.0.1:6443;
        proxy_connect_timeout 1s;
        proxy_pass backend;
    }
}
# Required on every master node
# Before
server: https://192.168.3.100:6443
# After
server: https://127.0.0.1:6443

# Verify that the API is reachable
root@master01:~/.kube# kubectl get nodes
NAME       STATUS                     ROLES    AGE   VERSION
master01   Ready,SchedulingDisabled   master   22m   v1.30.1
master02   Ready,SchedulingDisabled   master   22m   v1.30.1
master03   Ready,SchedulingDisabled   master   22m   v1.30.1
node01     Ready                      node     22m   v1.30.1
node02     Ready                      node     22m   v1.30.1
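
Editing the server field by hand on three masters is easy to get wrong. One possible way to script the change is sketched below, assuming the copied kubeconfig is the usual /root/.kube/config; `use_local_lb` is a hypothetical helper name.

```shell
# Rewrite every "server:" entry in a kubeconfig to the local kube-lb address.
# A copy of the original file is kept next to it with a .bak suffix.
use_local_lb() {
  local kubeconfig="$1"
  sed -i.bak -E \
    's#^([[:space:]]*server:[[:space:]]*)https://[^[:space:]]+#\1https://127.0.0.1:6443#' \
    "${kubeconfig}"
}

# Example, on each master node:
# use_local_lb /root/.kube/config
```

The substitution keeps the original indentation, so the YAML structure of the kubeconfig is left untouched.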
  • Add a master node. Steps (assuming the node to add is 192.168.3.128 and the cluster name is k8s-cluster01):
# Run on the deploy node
# Passwordless SSH login
$ ssh-copy-id 192.168.3.128

# Add the node
$ ezctl add-master k8s-cluster01 192.168.3.128

# Likewise, repeat the steps above to add another node with a custom nodename
$ ezctl add-master k8s-cluster01 192.168.3.128 k8s_nodename=master-03
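
A custom k8s_nodename has to follow the naming rule quoted in the hosts file and config.yml comments (lower case alphanumerics, '-' or '.', starting and ending with an alphanumeric). As a small sketch, the regex from config.yml can be applied before calling ezctl; `valid_nodename` is a hypothetical helper name.

```shell
# Succeed only if the argument matches the k8s_nodename rule from config.yml.
valid_nodename() {
  printf '%s\n' "$1" | grep -Eq \
    '^[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*$'
}

# Example:
# valid_nodename master-03 && ezctl add-master k8s-cluster01 192.168.3.128 k8s_nodename=master-03
```

Names with upper case letters or underscores, such as Master_03, are rejected by this check.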
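  • Add a kube_node node

Steps
Run the following (assuming the node to add is 192.168.3.129 and the k8s cluster name is k8s-cluster01)
# Passwordless SSH login
# Run on the deploy node
$ ssh-copy-id 192.168.3.129

# Add the node
$ ezctl add-node k8s-cluster01 192.168.3.129

# Likewise, repeat the steps above to add another node with a custom nodename
$ ezctl add-node k8s-cluster01 192.168.3.129 k8s_nodename=worker-03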