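I've already been using docker to wrap some very heavy jobs, such as building firmware. Now I'm trying out kubeadm, moving the level at which I maintain the whole system from docker up to a Kubernetes cluster, so that from here on k8s can manage the compute resources, for example by dynamically allocating compute units. It feels a lot like what I did with autoscaling on AWS more than ten years ago: a familiar stranger.
This post only deals with the startup problems after installing kubeadm on Ubuntu 16.04. It does not cover other usage details, such as setting up node servers or having them connect to and join the master server.
Environment: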
$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 16.04.6 LTS
Release: 16.04
Codename: xenial
$ curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key add -
$ echo "deb http://apt.kubernetes.io/ kubernetes-xenial main" | sudo tee -a /etc/apt/sources.list.d/kubernetes.list
$ sudo apt update
$ sudo apt install kubeadm
$ sudo apt-mark hold kubelet kubeadm kubectl
$ dpkg -l | grep kube
ii kubeadm 1.28.2-00 amd64 Kubernetes Cluster Bootstrapping Tool
ii kubectl 1.28.2-00 amd64 Kubernetes Command Line Tool
ii kubelet 1.28.2-00 amd64 Kubernetes Node Agent
ii kubernetes-cni 1.2.0-00 amd64 Kubernetes CNI
Next:
$ sudo kubeadm init --v=5
...
validating the existence and emptiness of directory /var/lib/etcd
[preflight] Some fatal errors occurred:
[ERROR CRI]: container runtime is not running: output: level=fatal msg="validate service connection: CRI v1 runtime API is not implemented for endpoint \"unix:///var/run/containerd/containerd.sock\": rpc error: code = Unimplemented desc = unknown service runtime.v1.RuntimeService"
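The error names both the endpoint and the missing service, so one way to poke at the runtime outside of kubeadm is to pull the endpoint out of the message and ask it for its version with crictl. This is only a rough sketch: the function names are my own, and it assumes crictl (from the cri-tools package, which is usually installed alongside kubeadm) is available.

```shell
# Read a kubeadm preflight error on stdin and print the first
# unix:// endpoint it mentions.
cri_endpoint_from_error() {
    grep -o 'unix://[^"\\ ]*' | head -n 1
}

# Ask the runtime behind the given endpoint for its version over CRI.
# Assumption: crictl is installed and sudo is available.
probe_cri() {
    sudo crictl --runtime-endpoint "$1" version
}
```

For example, `probe_cri unix:///var/run/containerd/containerd.sock` should either print the runtime version or fail in much the same way as the preflight check.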
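Something was wrong, so I started troubleshooting. Part of what I found suggested the problem was closely tied to the docker and containerd versions, so I first upgraded docker and containerd as far as I could: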
$ dpkg -l | grep containerd
ii containerd 1.2.6-0ubuntu1~16.04.6+esm1 amd64 daemon to control runC
$ dpkg -l | grep docker
rc docker 1.5-1 amd64 System tray for KDE3/GNOME2 docklet applications
ii docker.io 18.09.7-0ubuntu1~16.04.7 amd64 Linux container runtime
$ curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg
$ echo "deb [arch=amd64 signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
$ sudo apt update
$ sudo apt install docker-ce docker-ce-cli containerd.io
$ sudo docker version
Client: Docker Engine - Community
 Version: 20.10.7
 API version: 1.41
 Go version: go1.13.15
 Git commit: f0df350
 Built: Wed Jun 2 11:56:47 2021
 OS/Arch: linux/amd64
 Context: default
 Experimental: true
Server: Docker Engine - Community
 Engine:
  Version: 20.10.7
  API version: 1.41 (minimum version 1.12)
  Go version: go1.13.15
  Git commit: b0f5bc3
  Built: Wed Jun 2 11:54:58 2021
  OS/Arch: linux/amd64
  Experimental: false
 containerd:
  Version: 1.4.6
  GitCommit: d71fcd7d8303cbf684402823e425e9dd2e99285d
 runc:
  Version: 1.0.0-rc95
  GitCommit: b9ee9c6314599f1b4a7f497e1f1f856fe433d3b7
 docker-init:
  Version: 0.19.0
  GitCommit: de40ad0
Next I suspected the cri plugin and tried to rule it out:
$ cat /etc/containerd/config.toml | grep cri
enabled_plugins = ["cri"]
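As far as I know, the config.toml shipped with Docker's containerd.io package lists cri under disabled_plugins, which is why this file is worth a look. A small sketch of that check follows. The function name is made up, and it only handles the single-line form of the setting, not an array spread over several lines.

```shell
# Exit 0 when the given containerd config lists "cri" under
# disabled_plugins on a single line; exit non-zero otherwise.
# Defaults to /etc/containerd/config.toml when no path is given.
cri_disabled() {
    config_file="${1:-/etc/containerd/config.toml}"
    grep -Eq '^[[:space:]]*disabled_plugins[[:space:]]*=.*"cri"' "$config_file"
}
```

Usage: `cri_disabled && echo "cri plugin is disabled in config.toml"`.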
That didn't help, so I kept going:
$ sudo mv /etc/containerd/config.toml /etc/containerd/config.toml.bak
$ containerd config default | sudo tee /etc/containerd/config.toml
$ sudo systemctl restart containerd
$ containerd config default | grep containerd.sock
address = "/run/containerd/containerd.sock"
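Then I tried kubeadm init again and hit the same problem. Digging into the details, it looked quite likely that containerd was still too old. One key piece of information was that versions before 1.6 lack the interface kubeadm talks to, namely the CRI v1 runtime API named in the error message: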
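To make the "is it at least 1.6?" check repeatable, here is a minimal sketch. Both function names are my own, and it relies on GNU `sort -V` and on the version being the third field of `containerd --version`, which matches the output shown in this post.

```shell
# True when version $1 is greater than or equal to version $2.
# Relies on GNU sort -V for version-aware ordering.
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Read a `containerd --version` line on stdin and print the bare
# version number (third field, leading "v" stripped if present).
parse_containerd_version() {
    awk '{print $3}' | sed 's/^v//'
}
```

Usage: `version_ge "$(containerd --version | parse_containerd_version)" 1.6.0 || echo "containerd is older than 1.6"`.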
$ dpkg -L containerd.io | grep bin
/usr/bin
/usr/bin/containerd-shim-runc-v2
/usr/bin/containerd-shim
/usr/bin/containerd
/usr/bin/runc
/usr/bin/ctr
/usr/bin/containerd-shim-runc-v1
So I went straight to the official containerd releases and downloaded the binaries for the latest version, 1.7.11:
$ wget https://github.com/containerd/containerd/releases/download/v1.7.11/containerd-1.7.11-linux-amd64.tar.gz
$ tar xvf containerd-1.7.11-linux-amd64.tar.gz
$ tar -tzvf containerd-1.7.11-linux-amd64.tar.gz
drwxr-xr-x root/root 0 2023-12-09 07:41 bin/
-rwxr-xr-x root/root 12185600 2023-12-09 07:41 bin/containerd-shim-runc-v2
-rwxr-xr-x root/root 28330360 2023-12-09 07:41 bin/ctr
-rwxr-xr-x root/root 7061504 2023-12-09 07:41 bin/containerd-shim
-rwxr-xr-x root/root 8761344 2023-12-09 07:41 bin/containerd-shim-runc-v1
-rwxr-xr-x root/root 26184312 2023-12-09 07:41 bin/containerd-stress
-rwxr-xr-x root/root 55551616 2023-12-09 07:41 bin/containerd
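Since these binaries end up in /usr/bin, it seems worth verifying the download first. The sketch below assumes the release page also publishes a matching `.sha256sum` file next to the tarball and that it has been downloaded into the same directory. The function name is hypothetical.

```shell
# Verify a tarball against the "<tarball>.sha256sum" file sitting
# next to it. sha256sum -c resolves file names relative to the
# current directory, hence the cd inside a subshell.
verify_download() {
    tarball="$1"
    ( cd "$(dirname "$tarball")" && sha256sum -c "$(basename "$tarball").sha256sum" )
}
```

Usage: `verify_download ~/containerd-1.7.11-linux-amd64.tar.gz`, which exits non-zero when the checksum does not match.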
Then move the binaries already installed on the system out of the way:
$ sudo systemctl stop containerd
$ sudo mkdir -p /usr/bin/containerd-1.4.6
$ sudo mv /usr/bin/containerd* /usr/bin/containerd-1.4.6/
$ sudo mv /usr/bin/ctr /usr/bin/containerd-1.4.6/
$ tree /usr/bin/containerd-1.4.6/
/usr/bin/containerd-1.4.6/
├── containerd
├── containerd-shim
├── containerd-shim-runc-v1
├── containerd-shim-runc-v2
└── ctr
0 directories, 5 files
$ sudo cp ~/bin/c* /usr/bin/
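One wrinkle with the glob above is that `/usr/bin/containerd*` also matches the backup directory itself, which mv then refuses to move into itself. A slightly more careful variant is to name each binary explicitly. This is just a sketch with a made-up function name, and for /usr/bin it has to run as root.

```shell
# Move the named files from src_dir into backup_dir, creating the
# backup directory if needed. Names that do not exist are skipped.
# Usage: backup_binaries SRC_DIR BACKUP_DIR NAME...
backup_binaries() {
    src_dir="$1"
    backup_dir="$2"
    shift 2
    mkdir -p "$backup_dir" || return 1
    for name in "$@"; do
        if [ -f "$src_dir/$name" ]; then
            mv "$src_dir/$name" "$backup_dir/" || return 1
        fi
    done
}
```

Usage (in a root shell): `backup_binaries /usr/bin /usr/bin/containerd-1.4.6 containerd containerd-shim containerd-shim-runc-v1 containerd-shim-runc-v2 ctr`.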
Ready to restart:
$ containerd --version
containerd github.com/containerd/containerd v1.7.11 64b8a811b07ba6288238eefc14d898ee0b5b99ba
$ containerd config default | sudo tee /etc/containerd/config.toml
$ sudo systemctl stop containerd
$ sudo systemctl start containerd
$ sudo systemctl status containerd
● containerd.service - containerd container runtime
   Loaded: loaded (/lib/systemd/system/containerd.service; enabled; vendor preset: enabled)
   Active: active (running); 14min ago
   Docs: https://containerd.io
   Process: 19396 ExecStartPre=/sbin/modprobe overlay (code=exited, status=0/SUCCESS)
   Main PID: 19406 (containerd)
   Tasks: 32
   Memory: 24.5M
   CPU: 187ms
   CGroup: /system.slice/containerd.service
           └─19406 /usr/bin/containerd
$ sudo systemctl stop docker
$ sudo systemctl start docker
$ sudo docker version
Client: Docker Engine - Community
 Version: 20.10.7
 API version: 1.41
 Go version: go1.13.15
 Git commit: f0df350
 Built: Wed Jun 2 11:56:47 2021
 OS/Arch: linux/amd64
 Context: default
 Experimental: true
Server: Docker Engine - Community
 Engine:
  Version: 20.10.7
  API version: 1.41 (minimum version 1.12)
  Go version: go1.13.15
  Git commit: b0f5bc3
  Built: Wed Jun 2 11:54:58 2021
  OS/Arch: linux/amd64
  Experimental: false
 containerd:
  Version: v1.7.11
  GitCommit: 64b8a811b07ba6288238eefc14d898ee0b5b99ba
 runc:
  Version: 1.0.0-rc95
  GitCommit: b9ee9c6314599f1b4a7f497e1f1f856fe433d3b7
 docker-init:
  Version: 0.19.0
  GitCommit: de40ad0
At last docker version also reports containerd v1.7.11, so it's back to kubeadm:
$ sudo kubeadm init --v=5
....
Your Kubernetes control-plane has initialized successfully!
To start using your cluster, you need to run the following as a regular user:
  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config
Alternatively, if you are the root user, you can run:
  export KUBECONFIG=/etc/kubernetes/admin.conf
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/
Then you can join any number of worker nodes by running the following on each as root:
kubeadm join ip:6443 --token ###### --discovery-token-ca-cert-hash sha256:######
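In addition, Kubernetes itself recommends turning swap off to protect overall performance. I'm running on a machine that already relies on swap and can't turn it off, so I had to find a way to skip the swap check instead (by adding --fail-swap-on=false):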
$ cat /etc/systemd/system/kubelet.service.d/10-kubeadm.conf | grep ExecStart
ExecStart=
ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_ARGS $KUBELET_EXTRA_ARGS --fail-swap-on=false
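Since that ExecStart line already expands $KUBELET_EXTRA_ARGS, another option that avoids editing the packaged drop-in is to set the variable in an environment file. The sketch below assumes the deb-packaged 10-kubeadm.conf loads /etc/default/kubelet through an EnvironmentFile line, which is worth confirming on your own machine. The function name is my own, and it overwrites whatever the target file contained.

```shell
# Write a KUBELET_EXTRA_ARGS line into the given environment file,
# replacing any previous content of that file.
# Usage: write_kubelet_extra_args ENV_FILE ARG...
write_kubelet_extra_args() {
    env_file="$1"
    shift
    printf 'KUBELET_EXTRA_ARGS=%s\n' "$*" > "$env_file"
}
```

Usage (in a root shell): `write_kubelet_extra_args /etc/default/kubelet --fail-swap-on=false`, followed by `systemctl daemon-reload` and `systemctl restart kubelet`.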