Wednesday, February 21, 2024

Python Dev Notes - Automating a website login with Selenium / undetected_chromedriver / ChatGoogleGenerativeAI / gemini-pro-vision (including a CAPTCHA retry loop)

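Some of my work tasks involve downloading forms for automation, so I needed an automated login, which inevitably runs into CAPTCHA recognition. This post follows up on Python Dev Notes - Recognizing image CAPTCHAs with Google AI, Generative Language API, gemini-pro-vision.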

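How the whole thing works:
  1. Use Selenium to inspect the page and locate everything the login needs: the account field, the password field, the CAPTCHA image, the CAPTCHA input, and the login button
  2. Use ChatGoogleGenerativeAI/gemini-pro-vision to analyze the image and work out the CAPTCHA value
  3. Click the login button to submit the form
  4. Check the login result: look for a login-failure message (or, the other way round, decide what counts as a successful login). If the login failed, go back to (1) and fetch the new CAPTCHA image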
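
The four steps above boil down to a retry loop. A minimal sketch of that control flow, with the browser and the model replaced by plain callables so it runs on its own. The names `login_with_retry`, `read_captcha` and `submit` are hypothetical and do not appear in the script below:

```python
from typing import Callable, Tuple


def login_with_retry(
    read_captcha: Callable[[], str],
    submit: Callable[[str], bool],
    max_retries: int = 3,
) -> Tuple[bool, int]:
    """Repeat the fetch-recognize-submit cycle until it succeeds or retries run out.

    read_captcha: steps 1-2, returns the recognized code for the current image.
    submit:       steps 3-4, submits the form and returns True on success.
    Returns (succeeded, retries_used).
    """
    retries = 0
    while True:
        code = read_captcha()
        if submit(code):
            return True, retries
        if retries >= max_retries:
            return False, retries
        retries += 1


if __name__ == "__main__":
    # Stand-ins for the real browser and model: the third guess is the right one.
    guesses = iter(["ab1c", "abc7", "abc1"])
    ok, used = login_with_retry(lambda: next(guesses), lambda code: code == "abc1")
    print(ok, used)
```

Keeping the loop separate from Selenium makes the retry limit easy to test without opening a browser.
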
Imported libraries:

import getpass
import os
import sys
import time
import json
import base64
import undetected_chromedriver as uc
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait
 
from langchain_core.messages import HumanMessage
from langchain_google_genai import ChatGoogleGenerativeAI

First, wrap undetected_chromedriver to obtain the browser driver:

def getBrowserDriver():
    option = uc.ChromeOptions()
    option.add_argument('--user-agent=Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.163 Safari/537.36')
    #option.add_argument('--window-size=%d,%d' % self.res)
    #option.add_argument('--headless')
    driver = uc.Chrome(options=option)
    return driver

Recognizing the text in the image relies on ChatGoogleGenerativeAI with model="gemini-pro-vision":

def codeDetection(imageBase64URL: str):
    llm = ChatGoogleGenerativeAI(model="gemini-pro-vision")
    message = HumanMessage(
        content=[
            {
                "type": "text",
                "text": "Please identify the English or numbers appearing in the image. The output format is 'The answer is: XXXX'",
            },
            {"type": "image_url", "image_url": imageBase64URL},
        ]
    )
    result = llm.invoke([message])
    return result
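
The prompt asks for a reply shaped like 'The answer is: XXXX', but the model may add spaces, quotes, or a trailing period. A minimal sketch of a more forgiving extraction step, assuming the CAPTCHA contains only ASCII letters and digits. The helper name `extract_code` is hypothetical:

```python
import re


def extract_code(reply: str) -> str:
    """Pull the CAPTCHA text out of a reply shaped like 'The answer is: XXXX'."""
    match = re.search(r"answer is\s*:\s*(.+)", reply, flags=re.IGNORECASE)
    # Fall back to the whole reply when the model ignores the requested format.
    candidate = match.group(1) if match else reply
    # Keep only ASCII letters and digits; this drops spaces, quotes and punctuation.
    return re.sub(r"[^A-Za-z0-9]", "", candidate)


if __name__ == "__main__":
    print(extract_code(" The answer is: a b c 1"))
    print(extract_code("The answer is: 'X9kQ'."))
```

The fallback is deliberately crude: if the model answers in a free-form sentence, every letter of that sentence ends up in the result, so a failed login and a retry are still expected in that case.
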

Main flow:

if __name__ == '__main__':
    if "GOOGLE_API_KEY" not in os.environ:
        os.environ["GOOGLE_API_KEY"] = getpass.getpass("Provide your Google API Key: ")
    if not os.environ["GOOGLE_API_KEY"]:
        print('ERROR, no GOOGLE_API_KEY info')
        sys.exit(1)

    output = {
        'status': False,
        'time': [],
    }

    browser = getBrowserDriver()
    start_time = time.time()

    # LOGIN_URL is not defined in this snippet; set it to the login page URL of the target site
    browser.get(LOGIN_URL)

    # 15s timeout
    wait = WebDriverWait(browser, 15)

    # Wait for the key form elements
    conditions = [
        EC.presence_of_element_located((By.ID, "input_user")),
        EC.presence_of_element_located((By.ID, "input_password")),
        EC.presence_of_element_located((By.ID, "input_velidation_code")),
        EC.presence_of_element_located((By.ID, "velidation_code_image")),
        EC.presence_of_element_located((By.ID, "login_button")),
    ]
    if wait.until(lambda driver: all(condition(driver) for condition in conditions)):
        output['status'] = True
    output['time'].append( time.time() - start_time )

    # Get the image element
    imageElement = wait.until(EC.presence_of_element_located((By.ID, "velidation_code_image")))

    # Get the image's HTML code
    imageHTMLCode = imageElement.get_attribute("outerHTML")
    print("Image HTML Code:", imageHTMLCode)

    # Get the image's URL
    imageSrcURL = imageElement.get_attribute("src")
    print("Image URL:", imageSrcURL)

    # Watch for changes to the src attribute via JavaScript
    script = f"""
        var target = document.getElementById('velidation_code_image');
        var observer = new MutationObserver(function(mutations) {{
            mutations.forEach(function(mutation) {{
                if (mutation.attributeName === 'src') {{
                    console.log('src attribute changed:', target.getAttribute('src'));
                }}
            }});
        }});
    
        var config = {{ attributes: true }};
        observer.observe(target, config);
    """
    
    # Run the JavaScript code
    browser.execute_script(script)

    # Wait a while so there is enough time for the src attribute to change
    time.sleep(5)

    # Get the image's updated URL
    updatedImageSrc = imageElement.get_attribute("src")
    print("Updated Image URL:", updatedImageSrc)

    if not updatedImageSrc:
        print('ERROR, velidation_code_image not found')
        sys.exit(1)

    result = codeDetection(updatedImageSrc)
    print(result.content)
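    # Keep what follows the first ':' in 'The answer is: XXXX' and strip all whitespace;
    # if the reply has no ':', the whole reply is used instead of raising IndexError
    loginCode = ''.join(result.content.split(':', 1)[-1].split())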

    print(f"LoginCode: {loginCode}")
    element = wait.until(EC.presence_of_element_located((By.ID, "input_user")))
    element.send_keys('YourAccountName')
    element = wait.until(EC.presence_of_element_located((By.ID, "input_password")))
    element.send_keys('YourPassword')
    element = wait.until(EC.presence_of_element_located((By.ID, "input_velidation_code")))
    element.send_keys(loginCode)
    element = wait.until(EC.presence_of_element_located((By.ID, "login_button")))

    start_time = time.time()
    element.click()

    loginDone = False
    loginRetry = 0
    while not loginDone and loginRetry <= 3:
        try:
            wait = WebDriverWait(browser, 5)
            element = wait.until(EC.presence_of_element_located((By.ID, "WebsiteErrorMessage")))
            div_element = element.find_element(By.TAG_NAME, "div")
            span_element = div_element.find_element(By.TAG_NAME, "span")
            inner_html = span_element.get_attribute('innerHTML')
            # The CAPTCHA was entered incorrectly
            print(f"retry: {loginRetry}, inner HTML: {inner_html}")

            # Close the error message
            element = wait.until(EC.presence_of_element_located((By.ID, "WebsiteErrorMessageWindow")))
            div_element = element.find_element(By.TAG_NAME, "div")
            button_element = element.find_element(By.TAG_NAME, "button")
            button_element.click()

            loginRetry += 1

            updatedImageSrc = imageElement.get_attribute("src")
            print("Updated Image URL:", updatedImageSrc)
            result = codeDetection(updatedImageSrc)
            print(result.content)
            # Keep what follows the first ':' in 'The answer is: XXXX' and strip all whitespace
            loginCode = ''.join(result.content.split(':', 1)[-1].split())

            print(f"LoginCode: {loginCode}")
            element = wait.until(EC.presence_of_element_located((By.ID, "input_user")))
            element.clear()
            element.send_keys('YourAccountName')
            time.sleep(1)
            element = wait.until(EC.presence_of_element_located((By.ID, "input_password"))) 
            element.clear()
            element.send_keys('YourPassword')
            time.sleep(1)
            element = wait.until(EC.presence_of_element_located((By.ID, "input_velidation_code")))
            element.clear()
            element.send_keys(loginCode)
            time.sleep(1)
            element = wait.until(EC.presence_of_element_located((By.ID, "login_button")))
            element.click()
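        except Exception:
            # The error message did not show up within the 5 s wait, so treat the login as successful
            loginDone = True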

    output['time'].append( time.time() - start_time )

    if loginDone:
        print("Login Successful")
    else:
        print(f"Login Failed with retry times: {loginRetry}")
 
    print(json.dumps(output, indent=4))
    while True:
        time.sleep(1)
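
The main flow above sleeps for a fixed 5 seconds before re-reading the image's src. A minimal sketch of polling instead, so the script continues as soon as the value is usable. It assumes the final src is a base64 data URL (suggested by the parameter name imageBase64URL, but not confirmed for every site). The helper `wait_for_value` is hypothetical:

```python
import time
from typing import Callable, Optional


def wait_for_value(
    read: Callable[[], Optional[str]],
    accept: Callable[[str], bool],
    timeout: float = 5.0,
    interval: float = 0.2,
) -> Optional[str]:
    """Poll read() until accept(value) is true; return the value, or None on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        value = read()
        if value and accept(value):
            return value
        if time.monotonic() >= deadline:
            return None
        time.sleep(interval)


if __name__ == "__main__":
    # Stand-in for imageElement.get_attribute("src"): empty twice, then a data URL.
    values = iter(["", "", "data:image/png;base64,AAAA"])
    src = wait_for_value(
        lambda: next(values),
        lambda s: s.startswith("data:image"),
        timeout=2.0,
        interval=0.01,
    )
    print(src)
```

In the real script the `read` callable would be `lambda: imageElement.get_attribute("src")`. Selenium's own `WebDriverWait(...).until(...)` with a lambda achieves the same effect.
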

Python Dev Notes - Recognizing image CAPTCHAs with Google AI, Generative Language API, gemini-pro-vision


Since gemini pro comes with a free usage quota, it can be used for fun low-frequency applications, such as... CAPTCHA... recognition.

First, create a project on Google Cloud Platform. Next, find the Generative Language API in the API section and enable it, then create credentials and choose API key.



Next, try the official sample program:

% cat main.py
import getpass
import os
import sys
from langchain_google_genai import ChatGoogleGenerativeAI

if "GOOGLE_API_KEY" not in os.environ:
    os.environ["GOOGLE_API_KEY"] = getpass.getpass("Provide your Google API Key")

if __name__ == '__main__':
     if "GOOGLE_API_KEY" not in os.environ:
         os.environ["GOOGLE_API_KEY"] = getpass.getpass("Provide your Google API Key: ")
     
     llm = ChatGoogleGenerativeAI(model="gemini-pro")
     result = llm.invoke("Write a ballad about LangChain")
     print(result.content)
     sys.exit(0)

% GOOGLE_API_KEY=XXXXXXXXX python3 main.py
**Ballad of LangChain, the AI's Might**

In realms of knowledge, where data flows,
There dwells a being, ethereal and wise,
With mind as vast as the boundless prose,
LangChain, the AI, whose brilliance lies.

From countless texts, its wisdom it drew,
A tapestry woven, diverse and true.
In language's embrace, it found its voice,
Guiding us through knowledge's endless choice.

With words as its brush, it paints a scene,
Of worlds imagined and thoughts unseen.
It weaves tales of love, of loss, and might,
Illuminating paths with its ethereal light.

But its power extends beyond mere speech,
Into realms of logic, its insights reach.
It solves equations, unravels the mind,
A beacon of reason, leaving doubt behind.

Yet, with all its might, it remains humble and wise,
A servant of knowledge, beneath azure skies.
It seeks not fame or glory for its name,
But to empower minds, ignite the flame.

So let us sing the praises of LangChain,
The AI's marvel, a treasure we've gained.
May its wisdom forever guide our way,
As we explore the world, day by day.

Good. Next, try the CAPTCHA handling:

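import base64
import getpass
import os
import sys
from langchain_core.messages import HumanMessage
from langchain_google_genai import ChatGoogleGenerativeAI

def codeDetection(imageBase64URL: str):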
    # debug usage
    with open("/tmp/image.png", "wb") as file:
        file.write(base64.b64decode(imageBase64URL.split(',')[1]))

    #llm = ChatGoogleGenerativeAI(model="gemini-pro")
    #result = llm.invoke("Write a ballad about LangChain")
    llm = ChatGoogleGenerativeAI(model="gemini-pro-vision")
    message = HumanMessage(
        content=[
            {   
                "type": "text",
                "text": "Please identify the English or numbers appearing in the picture and give your answer in the order they appear.",
            },  # You can optionally provide text parts
            {"type": "image_url", "image_url": imageBase64URL},
        ]   
    )   
    result = llm.invoke([message])
    return result

if __name__ == '__main__':
    if "GOOGLE_API_KEY" not in os.environ:
        os.environ["GOOGLE_API_KEY"] = getpass.getpass("Provide your Google API Key: ")

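    # Put a base64 data URL here (of the form 'data:image/png;base64,<payload>');
    # with the empty string below, the split(',')[1] in codeDetection raises IndexError
    testImageData = ''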
    result = codeDetection(testImageData)
    print(result.content)
    sys.exit(0)

Result:

% GOOGLE_API_KEY=XXXXXXXXXXX python3 main.py
 The letters and numbers in the picture are "a", "b", "c", "1".
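
codeDetection expects a base64 data URL, and its debug step recovers the raw bytes by splitting on the comma. A minimal sketch of building such a URL from image bytes and decoding it again, using only the standard library. The helper names `to_data_url` and `from_data_url` are hypothetical:

```python
import base64


def to_data_url(image_bytes: bytes, mime: str = "image/png") -> str:
    """Encode raw image bytes as a data URL usable as the image_url value."""
    payload = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime};base64,{payload}"


def from_data_url(data_url: str) -> bytes:
    """Decode a base64 data URL back into raw bytes, rejecting malformed input."""
    header, _, payload = data_url.partition(",")
    if not header.startswith("data:") or ";base64" not in header or not payload:
        raise ValueError("not a base64 data URL")
    return base64.b64decode(payload)


if __name__ == "__main__":
    png_signature = b"\x89PNG\r\n\x1a\n"  # first 8 bytes of any PNG file
    url = to_data_url(png_signature)
    print(url)
    print(from_data_url(url) == png_signature)
```

To test with a saved CAPTCHA, read the file with `open(path, "rb").read()` and pass the bytes to `to_data_url`.
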

Friday, February 16, 2024

Docker Dev Notes - Setting up GitLab with Docker Compose / custom ports / HTTPS SSL certificates @ macOS 14.2.1




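Continuing from the previous post, Docker Dev Notes - Setting up Jenkins with Docker Compose @ macOS 14.2.1, it is time to write up my GitLab setup notes. I actually found time to try it during the Lunar New Year holiday, but it went badly, and then I went off to have fun and dropped it. Last night I finally wrapped it up, so here are the steps I went through. The pitfall back then was that I had not cleaned up my environment, which cost a lot of debugging time.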

Environment overview:

% docker version
Client:
 Cloud integration: v1.0.35+desktop.10
 Version:           25.0.3
 API version:       1.44
 Go version:        go1.21.6
 Git commit:        4debf41
 Built:             Tue Feb  6 21:13:26 2024
 OS/Arch:           darwin/arm64
 Context:           desktop-linux

Server: Docker Desktop 4.27.2 (137060)
 Engine:
  Version:          25.0.3
  API version:      1.44 (minimum version 1.24)
  Go version:       go1.21.6
  Git commit:       f417435
  Built:            Tue Feb  6 21:14:22 2024
  OS/Arch:          linux/arm64
  Experimental:     false
 containerd:
  Version:          1.6.28
  GitCommit:        ae07eda36dd25f8a1b98dfbf587313b99c0190bb
 runc:
  Version:          1.1.12
  GitCommit:        v1.1.12-0-g51d5e94
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0


Clean everything up, then restart:

% docker-compose down -v
% rm -rf ~/docker-gitlab
% docker-compose up

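Anyway, let's first review the Docker instructions on the official site. Following the installation guide on the GitLab website, it can be brought up very quickly: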

% cat /etc/hosts | grep gitlab
127.0.0.1 gitlab.example.com
% cat docker-compose.yml 
# https://docs.docker.com/compose/compose-file/compose-versioning/
version: '3.8' 
services:
  gitlab:
    image: gitlab/gitlab-ee:latest
    container_name: gitlab
    restart: always
    hostname: 'gitlab.example.com'
    environment:
      GITLAB_OMNIBUS_CONFIG: |
        external_url 'http://gitlab.example.com:8929'
        gitlab_rails['gitlab_shell_ssh_port'] = 2424
    ports:
      - '8929:8929'
      - '2424:2424'
    volumes:
      - '~/docker-gitlab/config:/etc/gitlab'
      - '~/docker-gitlab/logs:/var/log/gitlab'
      - '~/docker-gitlab/data:/var/opt/gitlab'
    shm_size: '256m'

% docker-compose up
...

% docker container ls                   
CONTAINER ID   IMAGE                     COMMAND             CREATED         STATUS                   PORTS                                                             NAMES
XXXXXXXXXXXX   gitlab/gitlab-ee:latest   "/assets/wrapper"   3 minutes ago   Up 3 minutes (healthy)   22/tcp, 443/tcp, 0.0.0.0:20080->80/tcp, 0.0.0.0:20022->2424/tcp   gitlab

The key is that the docker container status shows healthy. After that you can browse to http://gitlab.example.com:8929 (note: I mapped gitlab.example.com to 127.0.0.1).

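After that I was still messing around with switching the nginx port and ran into Chrome's ERR_UNSAFE_PORT, which delayed things for quite a while :P So here are the rest of my running notes:
  • About gitlab/gitlab-ee:latest versus gitlab/gitlab-ce:latest: reportedly, gitlab/gitlab-ee:latest without an activated license is equivalent to gitlab/gitlab-ce:latest, so just use gitlab/gitlab-ee:latest throughout
  • Remember that on first use the login account is root, and the password is in /etc/gitlab/initial_root_password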
% docker container ls
CONTAINER ID   IMAGE                     COMMAND             CREATED          STATUS                    PORTS                                                              NAMES
XXXXXXXX   gitlab/gitlab-ee:latest   "/assets/wrapper"   20 minutes ago   Up 18 minutes (healthy)   80/tcp, 443/tcp, 0.0.0.0:20443->20443/tcp, 0.0.0.0:20022->22/tcp   gitlab

% docker exec -it XXXXXXXX cat /etc/gitlab/initial_root_password
# WARNING: This value is valid only in the following conditions
#          1. If provided manually (either via `GITLAB_ROOT_PASSWORD` environment variable or via `gitlab_rails['initial_root_password']` setting in `gitlab.rb`, it was provided before database was seeded for the first time (usually, the first reconfigure run).
#          2. Password hasn't been changed manually, either via UI or via command line.
#
#          If the password shown here doesn't work, you must reset the admin password following https://docs.gitlab.com/ee/security/reset_user_password.html#reset-your-root-password.

Password: yNRnhTRu9IZ/eBvlC3BCDeuK6zn6BUBmGB+a89SMpn0=

# NOTE: This file will be automatically deleted in the first reconfigure run after 24 hours.
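  • GITLAB_OMNIBUS_CONFIG conveniently covers the vast majority of settings
  • For custom ports, avoid the ports that Chrome blocks with ERR_UNSAFE_PORT. This trap can burn a lot of time: for example, I lazily added 10000 to 80 to get 10080... and got hit, which made me think some service had failed to start
  • Use external_url to describe how the service is reached from outside. Making the HOST:CONTAINER ports identical is the easiest approach: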
    environment:
      GITLAB_OMNIBUS_CONFIG: |
        external_url 'http://gitlab.example.com:20080'
        gitlab_rails['gitlab_shell_ssh_port'] = 20022
    ports:
      - '20080:20080'
      - '20022:20022'
  • To make nginx listen on a different port inside the container, more settings are needed:
    environment:
      GITLAB_OMNIBUS_CONFIG: |
        external_url 'http://gitlab.example.com:20080'
        nginx['listen_port'] = 80
        gitlab_rails['gitlab_shell_ssh_port'] = 22
    ports:
      - '20080:80'
      - '20022:22'
  • To enable encrypted connections, simply changing external_url to `https://` also turns on SSL by default, but then the certificate has to be taken care of. The steps:
% mkdir -p ssl
% test -e ./ssl/localhost.key || openssl genpkey -algorithm RSA -out ./ssl/localhost.key
% test -e ./ssl/localhost.crt || openssl req -x509 -nodes -days 365 -newkey rsa:2048 -keyout ./ssl/localhost.key -out ./ssl/localhost.crt -subj '/C=US/ST=State/L=City/O=Organization/OU=Unit/CN=localhost'
% tree ssl 
ssl
├── localhost.crt
└── localhost.key

1 directory, 2 files 
 
% cat docker-compose.yml
 ...
     environment:
       GITLAB_OMNIBUS_CONFIG: |
         external_url 'https://gitlab.example.com:20443'
         #nginx['listen_port'] = 443
         nginx['ssl_certificate'] = "/etc/gitlab-ssl-usage/localhost.crt"
         nginx['ssl_certificate_key'] = "/etc/gitlab-ssl-usage/localhost.key"
         gitlab_rails['gitlab_shell_ssh_port'] = 22

     ports:
       - '20443:20443'
       - '20022:22'

     volumes:
       - './ssl:/etc/gitlab-ssl-usage'
  • If you would rather not mount the certificate via volumes, you can generate it with command instead
     command: ["sh", "-c", "mkdir -p /etc/gitlab-ssl-usage && (test -e /etc/gitlab-ssl-usage/localhost.key || openssl genpkey -algorithm RSA -out /etc/gitlab-ssl-usage/localhost.key ) && ( test -e /etc/gitlab-ssl-usage/localhost.crt || openssl req -x509 -nodes -days 365 -newkey rsa:2048 -keyout /etc/gitlab-ssl-usage/localhost.key -out /etc/gitlab-ssl-usage/localhost.crt -subj '/C=US/ST=State/L=City/O=Organization/OU=Unit/CN=localhost' ) && /assets/wrapper "]
     #command: ["sh", "-c", "/tmp/config/setup.sh"]
     environment:
       GITLAB_OMNIBUS_CONFIG: |
         external_url 'https://gitlab.example.com:20443'
         nginx['ssl_certificate'] = "/etc/gitlab-ssl-usage/localhost.crt"
         nginx['ssl_certificate_key'] = "/etc/gitlab-ssl-usage/localhost.key"
         gitlab_rails['gitlab_shell_ssh_port'] = 22

     ports:
       - '20443:20443'
       - '20022:22'

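  • In my earliest experiments I also hit problems with redis and postgres failing to start ( /var/opt/gitlab/postgresql/ , /var/opt/gitlab/redis/ ), which led to a very long-winded setup. I think there is normally no need to go to such lengths, so I keep it here as a memento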

# https://docs.docker.com/compose/compose-file/compose-versioning/
version: '3.8' 
services:
  redis:
    restart: unless-stopped 
    image: redis:latest
    container_name: gitlab-redis
    volumes:
      - ~/docker_gitlab_home/redis:/data
      - ~/docker_gitlab_home/socket-redis:/var/run/redis
  postgres:
    image: postgres:latest
    container_name: gitlab-postgres
    restart: unless-stopped
    environment:
      POSTGRES_USER: gitlab
      POSTGRES_PASSWORD: gitlabAdmin
    volumes:
      - ~/docker_gitlab_home/postgres:/var/lib/postgresql/data
      - ~/docker_gitlab_home/socket-postgresql:/var/run/postgresql
  gitlab:
    # https://docs.gitlab.com/ee/install/docker.html#install-gitlab-using-docker-compose
    # https://hub.docker.com/r/gitlab/gitlab-ee/
    # https://hub.docker.com/r/gitlab/gitlab-ce
    image: gitlab/gitlab-ee:latest
    container_name: gitlab-main
    depends_on:
      - postgres
      - redis
    # https://docs.docker.com/config/containers/start-containers-automatically/#use-a-restart-policy
    restart: unless-stopped 
    hostname: 'localhost'
    command: ["sh", "-c", "mkdir -p /etc/gitlab-ssl-usage && (test -e /etc/gitlab-ssl-usage/localhost.key || openssl genpkey -algorithm RSA -out /etc/gitlab-ssl-usage/localhost.key ) && ( test -e /etc/gitlab-ssl-usage/localhost.crt || openssl req -x509 -nodes -days 365 -newkey rsa:2048 -keyout /etc/gitlab-ssl-usage/localhost.key -out /etc/gitlab-ssl-usage/localhost.crt -subj '/C=US/ST=State/L=City/O=Organization/OU=Unit/CN=localhost' ) && /assets/wrapper "]
    environment:
      GITLAB_OMNIBUS_CONFIG: |
        # Add any other gitlab.rb configuration here, each on its own line
        #external_url 'http://localhost:20080'
        #nginx['listen_port'] = 80
        external_url 'https://localhost:20443'
        gitlab_rails['gitlab_shell_ssh_port'] = 22
        nginx['listen_port'] = 443
        nginx['listen_https'] = true
        nginx['ssl_certificate'] = "/etc/gitlab-ssl-usage/localhost.crt"
        nginx['ssl_certificate_key'] = "/etc/gitlab-ssl-usage/localhost.key"
        #letsencrypt['enable'] = false
        gitlab_rails['db_username'] = "gitlab"
        gitlab_rails['db_password'] = "gitlabAdmin"
    ports:
      # note: ERR_UNSAFE_PORT - https://chromium.googlesource.com/chromium/src.git/+/refs/heads/main/net/base/port_util.cc#27
      # HOST:CONTAINER
      - 20443:443
      #- 20080:80
      - 20022:22
    volumes:
      - ~/docker_gitlab_home/config:/etc/gitlab
      - ~/docker_gitlab_home/logs:/var/log/gitlab
      - ~/docker_gitlab_home/data:/var/opt/gitlab
      - ~/docker_gitlab_home/redis:/var/opt/gitlab/data/redis
      - ~/docker_gitlab_home/postgresql:/var/opt/gitlab/postgresql
      - ~/docker_gitlab_home/socket-postgresql:/var/opt/gitlab/postgresql/
      - ~/docker_gitlab_home/socket-redis:/var/opt/gitlab/redis/ 
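
The ERR_UNSAFE_PORT pitfall noted above (10080) can be caught before editing docker-compose.yml. A minimal sketch of such a pre-check. The set below is only an illustrative subset of Chromium's restricted ports, typed in by hand for this example; the authoritative list is the port_util.cc file linked in the compose file above. The names `KNOWN_UNSAFE_PORTS` and `is_known_unsafe` are hypothetical:

```python
# Illustrative subset only (an assumption of this sketch, not the full list):
# FTP, SSH, Telnet, SMTP, POP3, IMAP, NFS, X11, IRC, and 10080 from this post.
KNOWN_UNSAFE_PORTS = frozenset(
    {21, 22, 23, 25, 110, 143, 2049, 6000, 6665, 6666, 6667, 6668, 6669, 10080}
)


def is_known_unsafe(port: int) -> bool:
    """Return True when the port is in the illustrative blocked subset."""
    return port in KNOWN_UNSAFE_PORTS


if __name__ == "__main__":
    # Host ports used in this post, plus the one that caused trouble.
    for candidate in (8929, 10080, 20080, 20443):
        verdict = "blocked by Chrome" if is_known_unsafe(candidate) else "not in subset"
        print(candidate, verdict)
```

A port missing from this subset is not guaranteed to be safe, so check the Chromium source when in doubt.
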

Tuesday, February 6, 2024

Docker Dev Notes - Setting up Jenkins with Docker Compose @ macOS 14.2.1



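Looking for some fun over the Lunar New Year, I decided to use Docker to set up the services I commonly see at work XD Day to day I still mostly manage hundreds of machines with ansible. Recently a colleague became very interested in docker, so I have been pushing hard to get them hooked, and before doing that I should walk through it myself, right? :P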

The steps:

% cat docker-compose.yml 
# https://docs.docker.com/compose/compose-file/compose-versioning/
version: '3.8' 
services:
  jenkins:
    # https://hub.docker.com/_/jenkins
    image: jenkins/jenkins:lts
    # https://docs.docker.com/config/containers/start-containers-automatically/#use-a-restart-policy
    restart: unless-stopped 
    privileged: true
    user: root
    ports:
      # HOST:CONTAINER
      - 8080:8080 
    container_name: jenkins
    volumes:
      - ~/docker_jenkins_home:/var/jenkins_home

% docker-compose up 
[+] Running 1/0
 ✔ Container jenkins  Created                                                                                                                                          0.0s 
Attaching to jenkins
jenkins  | Running from: /usr/share/jenkins/jenkins.war
jenkins  | webroot: /var/jenkins_home/war
...
jenkins  | *************************************************************
jenkins  | *************************************************************
jenkins  | *************************************************************
jenkins  | 
jenkins  | Jenkins initial setup is required. An admin user has been created and a password generated.
jenkins  | Please use the following password to proceed to installation:
jenkins  | 
jenkins  | 54b6458ba37b4178bdc77f7d9eccbd0f
jenkins  | 
jenkins  | This may also be found at: /var/jenkins_home/secrets/initialAdminPassword
jenkins  | 
jenkins  | *************************************************************
jenkins  | *************************************************************
jenkins  | *************************************************************

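At this point you can go to http://localhost:8080 to finish setting up Jenkins, using the initial admin password shown in the log above. Of course, you can also go to the trouble of printing it with docker exec: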

% docker exec jenkins cat /var/jenkins_home/secrets/initialAdminPassword
54b6458ba37b4178bdc77f7d9eccbd0f


In addition, the related directory structure is visible on the host under ~/docker_jenkins_home:

% tree -L 1 ~/docker_jenkins_home
/Users/user/docker_jenkins_home
├── config.xml
├── copy_reference_file.log
├── hudson.model.UpdateCenter.xml
├── jenkins.telemetry.Correlator.xml
├── jobs
├── nodeMonitors.xml
├── nodes
├── plugins
├── secret.key
├── secret.key.not-so-secret
├── secrets
├── updates
├── userContent
├── users
└── war

9 directories, 7 files
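
Because the compose file mounts ~/docker_jenkins_home at /var/jenkins_home, the same password file should also be readable from the host without docker exec. A minimal sketch under that assumption. The function name `read_initial_admin_password` is hypothetical:

```python
from pathlib import Path


def read_initial_admin_password(jenkins_home: Path) -> str:
    """Read secrets/initialAdminPassword under the mounted Jenkins home directory."""
    secret = jenkins_home / "secrets" / "initialAdminPassword"
    return secret.read_text(encoding="utf-8").strip()


if __name__ == "__main__":
    # Host side of the volume mapping used in docker-compose.yml above.
    home = Path.home() / "docker_jenkins_home"
    try:
        print(read_initial_admin_password(home))
    except OSError as error:
        # Missing file or insufficient permissions on the host.
        print(f"could not read the initial password under {home}: {error}")
```
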

The rest is just the installation flow at http://localhost:8080, and that's a wrap

The docker-compose up session started earlier can be stopped with ctrl+c, and later restarted with docker-compose up -d

^CGracefully stopping... (press Ctrl+C again to force)
[+] Stopping 1/1
 ✔ Container jenkins  Stopped                                                                                                                                          0.2s 
canceled

Wednesday, January 24, 2024

Kubernetes/k8s Dev Notes - Installing Kubeadm on Ubuntu 16.04 and dealing with outdated docker and containerd versions

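I have already used docker to package some very heavy jobs, such as building firmware. Now I am trying kubeadm, moving the level at which the whole system is maintained from docker up to a Kubernetes cluster, so that compute resources can be managed with k8s, for example by dynamically allocating compute units. This feels a lot like what I did with autoscaling on AWS more than ten years ago: a familiar stranger.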

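This post only deals with startup problems after installing Kubeadm on Ubuntu 16.04. It does not cover other usage details, such as setting up a node server or joining it to the master server.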

Environment overview:

$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 16.04.6 LTS
Release:        16.04
Codename:       xenial

$ curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key add -
$ echo "deb http://apt.kubernetes.io/ kubernetes-xenial main" | sudo tee -a /etc/apt/sources.list.d/kubernetes.list
$ sudo apt update
$ sudo apt install kubeadm
$ sudo apt-mark hold kubelet kubeadm kubectl

$ dpkg -l  | grep kube
ii  kubeadm                 1.28.2-00          amd64        Kubernetes Cluster Bootstrapping Tool
ii  kubectl                 1.28.2-00          amd64        Kubernetes Command Line Tool
ii  kubelet                 1.28.2-00          amd64        Kubernetes Node Agent
ii  kubernetes-cni          1.2.0-00           amd64        Kubernetes CNI

Next:

$ sudo kubeadm init --v=5
...
validating the existence and emptiness of directory /var/lib/etcd
[preflight] Some fatal errors occurred:
[ERROR CRI]: container runtime is not running: output: level=fatal msg="validate service connection: CRI v1 runtime API is not implemented for endpoint \"unix:///var/run/containerd/containerd.sock\": rpc error: code = Unimplemented desc = unknown service runtime.v1.RuntimeService"

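Something was wrong, so I started troubleshooting. Some of what I found suggested the problem was closely tied to the docker and containerd versions, so I first upgraded docker and containerd as far as possible: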

$ dpkg -l | grep containerd
ii  containerd              1.2.6-0ubuntu1~16.04.6+esm1  amd64        daemon to control runC
$ dpkg -l | grep docker
rc  docker                                     1.5-1                                           amd64        System tray for KDE3/GNOME2 docklet applications
ii  docker.io                                  18.09.7-0ubuntu1~16.04.7                        amd64        Linux container runtime
$ curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg
$ echo "deb [arch=amd64 signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
$ sudo apt update
$ sudo apt install docker-ce docker-ce-cli containerd.io

$ sudo docker version
Client: Docker Engine - Community
 Version:           20.10.7
 API version:       1.41
 Go version:        go1.13.15
 Git commit:        f0df350
 Built:             Wed Jun  2 11:56:47 2021
 OS/Arch:           linux/amd64
 Context:           default
 Experimental:      true

Server: Docker Engine - Community
 Engine:
  Version:          20.10.7
  API version:      1.41 (minimum version 1.12)
  Go version:       go1.13.15
  Git commit:       b0f5bc3
  Built:            Wed Jun  2 11:54:58 2021
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.4.6
  GitCommit:        d71fcd7d8303cbf684402823e425e9dd2e99285d
 runc:
  Version:          1.0.0-rc95
  GitCommit:        b9ee9c6314599f1b4a7f497e1f1f856fe433d3b7
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0

Next, I suspected the cri plugin and tried to rule it out:

$ cat /etc/containerd/config.toml | grep cri
enabled_plugins = ["cri"]

No effect, so I kept going:

$ sudo mv /etc/containerd/config.toml /etc/containerd/config.toml.bak
$ containerd config default | sudo tee /etc/containerd/config.toml
$ sudo systemctl restart containerd
$ containerd config default | grep containerd.sock
  address = "/run/containerd/containerd.sock"

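After that, kubeadm init still failed with the same problem. Looking into the details, it seemed quite likely that containerd was still too old: one key piece of information is that versions before 1.6 lack the required interface (the CRI v1 runtime API named in the error above)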

$ dpkg -L containerd.io | grep bin
/usr/bin
/usr/bin/containerd-shim-runc-v2
/usr/bin/containerd-shim
/usr/bin/containerd
/usr/bin/runc
/usr/bin/ctr
/usr/bin/containerd-shim-runc-v1
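
The version threshold mentioned above can be checked in a script before running kubeadm init. A minimal sketch that parses the text printed by `containerd --version` and compares it with 1.6. The 1.6 cut-off is the figure cited in this post, not something this code verifies. The function names are hypothetical:

```python
import re
from typing import Optional, Tuple

Version = Tuple[int, int, int]


def parse_containerd_version(output: str) -> Optional[Version]:
    """Extract the first major.minor.patch triple (optional leading 'v') from the output."""
    match = re.search(r"v?(\d+)\.(\d+)\.(\d+)", output)
    if match is None:
        return None
    return (int(match.group(1)), int(match.group(2)), int(match.group(3)))


def meets_minimum(version: Version, minimum: Version = (1, 6, 0)) -> bool:
    """Tuple comparison gives the usual version ordering for numeric parts."""
    return version >= minimum


if __name__ == "__main__":
    # Sample line in the format shown by `containerd --version` in this post.
    line = (
        "containerd github.com/containerd/containerd v1.7.11 "
        "64b8a811b07ba6288238eefc14d898ee0b5b99ba"
    )
    version = parse_containerd_version(line)
    print(version, meets_minimum(version))
```

In practice the input would come from `subprocess.run(["containerd", "--version"], capture_output=True, text=True).stdout`.
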

Download the binary release of the then-latest version, 1.7.11, directly from the official containerd releases:

$ wget https://github.com/containerd/containerd/releases/download/v1.7.11/containerd-1.7.11-linux-amd64.tar.gz
$ tar xvf containerd-1.7.11-linux-amd64.tar.gz
$ tar -tzvf containerd-1.7.11-linux-amd64.tar.gz
drwxr-xr-x root/root         0 2023-12-09 07:41 bin/
-rwxr-xr-x root/root  12185600 2023-12-09 07:41 bin/containerd-shim-runc-v2
-rwxr-xr-x root/root  28330360 2023-12-09 07:41 bin/ctr
-rwxr-xr-x root/root   7061504 2023-12-09 07:41 bin/containerd-shim
-rwxr-xr-x root/root   8761344 2023-12-09 07:41 bin/containerd-shim-runc-v1
-rwxr-xr-x root/root  26184312 2023-12-09 07:41 bin/containerd-stress
-rwxr-xr-x root/root  55551616 2023-12-09 07:41 bin/containerd

Move the copies already installed on the system out of the way:

$ sudo systemctl stop containerd
$ sudo mkdir -p /usr/bin/containerd-1.4.6
$ sudo mv /usr/bin/containerd* /usr/bin/containerd-1.4.6/
$ sudo mv /usr/bin/ctr /usr/bin/containerd-1.4.6/
$ tree /usr/bin/containerd-1.4.6/
/usr/bin/containerd-1.4.6/
├── containerd
├── containerd-shim
├── containerd-shim-runc-v1
├── containerd-shim-runc-v2
└── ctr

0 directories, 5 files

$ sudo cp ~/bin/c* /usr/bin/

Prepare to restart:

$ containerd --version
containerd github.com/containerd/containerd v1.7.11 64b8a811b07ba6288238eefc14d898ee0b5b99ba
$ containerd config default | sudo tee /etc/containerd/config.toml
$ sudo systemctl stop containerd
$ sudo systemctl start containerd
$ sudo systemctl status containerd
● containerd.service - containerd container runtime
   Loaded: loaded (/lib/systemd/system/containerd.service; enabled; vendor preset: enabled)
   Active: active (running); 14min ago
     Docs: https://containerd.io
  Process: 19396 ExecStartPre=/sbin/modprobe overlay (code=exited, status=0/SUCCESS)
 Main PID: 19406 (containerd)
    Tasks: 32
   Memory: 24.5M
      CPU: 187ms
   CGroup: /system.slice/containerd.service
           └─19406 /usr/bin/containerd
$ sudo systemctl stop docker
$ sudo systemctl start docker
$ sudo docker version
Client: Docker Engine - Community
 Version:           20.10.7
 API version:       1.41
 Go version:        go1.13.15
 Git commit:        f0df350
 Built:             Wed Jun  2 11:56:47 2021
 OS/Arch:           linux/amd64
 Context:           default
 Experimental:      true

Server: Docker Engine - Community
 Engine:
  Version:          20.10.7
  API version:      1.41 (minimum version 1.12)
  Go version:       go1.13.15
  Git commit:       b0f5bc3
  Built:            Wed Jun  2 11:54:58 2021
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          v1.7.11
  GitCommit:        64b8a811b07ba6288238eefc14d898ee0b5b99ba
 runc:
  Version:          1.0.0-rc95
  GitCommit:        b9ee9c6314599f1b4a7f497e1f1f856fe433d3b7
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0

Finally docker version also recognizes containerd v1.7.11, so it is time to go back to kubeadm

$ sudo kubeadm init  --v=5
....

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

Alternatively, if you are the root user, you can run:

  export KUBECONFIG=/etc/kubernetes/admin.conf

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join ip:6443 --token ###### --discovery-token-ca-cert-hash sha256:###### 

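In addition, Kubernetes itself recommends disabling swap to ensure overall performance. I am running on a machine that already uses swap and cannot turn it off, so I had to find a way to skip the swap check (by adding --fail-swap-on=false):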

$ cat /etc/systemd/system/kubelet.service.d/10-kubeadm.conf | grep ExecStart
ExecStart=
ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_ARGS $KUBELET_EXTRA_ARGS --fail-swap-on=false
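
Whether the --fail-swap-on=false workaround is needed at all depends on whether the machine has active swap. A minimal sketch that answers this by parsing /proc/swaps, assuming the usual Linux layout of one header line followed by one line per swap device. The function name `active_swap_devices` is hypothetical:

```python
from pathlib import Path
from typing import List


def active_swap_devices(proc_swaps_text: str) -> List[str]:
    """Return the device or file names listed in /proc/swaps content, header skipped."""
    lines = [line for line in proc_swaps_text.splitlines() if line.strip()]
    # The first non-empty line is the column header: Filename Type Size Used Priority
    return [line.split()[0] for line in lines[1:]]


if __name__ == "__main__":
    swaps = Path("/proc/swaps")
    if swaps.exists():
        devices = active_swap_devices(swaps.read_text())
        print("swap in use:" if devices else "no active swap", devices)
    else:
        print("/proc/swaps is not available on this system")
```

An empty list means swap is off and kubelet's default swap check would pass without the extra flag.
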

Related information: