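In short, it costs less than AWS S3, but accessing AWS Glacier feels more like working with a traditional tape archive: many operations first hand you a Job ID, and you then have to use that ID yourself to check the result. For example, if you want to see how many files are stored, you get a Job ID for the request rather than an immediate answer.
Overall, uploading data has to go through the AWS API, which means creating a user in AWS IAM and granting it the following permissions: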
- Amazon Glacier Full Access (the alternative is Amazon Glacier Read Only Access)
- Amazon SQS Full Access (used when downloading files, not needed for upload-only work; error message when it is missing: Access to the resource https://sqs.ap-northeast-1.amazonaws.com/ is denied)
- Amazon SNS Full Access (used when downloading files, not needed for upload-only work; error message when it is missing: User: arn:aws:iam::####:user/#### is not authorized to perform: SNS:CreateTopic on resource)
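If Full Access feels too broad for an upload-only user, a narrower policy scoped to a single vault is one option. The following is only a sketch: the account ID 123456789012 is a placeholder, the helper name `upload_only_policy` is made up for illustration, and whether these four actions are enough for the uploader tool used in this post was not tested.

```python
import json


def upload_only_policy(region: str, account_id: str, vault: str) -> dict:
    """Build an IAM policy document limited to upload actions on one vault."""
    # Same ARN shape as the VaultARN printed when a vault is created.
    vault_arn = f"arn:aws:glacier:{region}:{account_id}:vaults/{vault}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "glacier:UploadArchive",
                    "glacier:InitiateMultipartUpload",
                    "glacier:UploadMultipartPart",
                    "glacier:CompleteMultipartUpload",
                ],
                "Resource": vault_arn,
            }
        ],
    }


if __name__ == "__main__":
    # 123456789012 is a placeholder account ID, not a real one.
    policy = upload_only_policy("ap-northeast-1", "123456789012", "changyy-vault")
    print(json.dumps(policy, indent=2))
```

The printed JSON can be pasted into IAM as a custom policy in place of the Full Access managed policy.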
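Also, before using AWS Glacier, it helps to understand a bit about how it operates:
- Storage prices differ between Data Centers, for reasons such as electricity costs.
- Before uploading data, besides choosing a Data Center, you also need to create a directory-like storage unit (Vault).
- Apart from Create Vault and Delete Vault, which the web GUI can do, everything else goes through the API. Delete Vault additionally requires that the vault contains no files.
- When operating through the API, the basic parameters you need are the API KEY and the Data Center (region/endpoint).
Create a Vault (this can also be done through the AWS Web Console):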
$ java -jar uploader-0.0.8-SNAPSHOT-jar-with-dependencies.jar -e "https://glacier.ap-northeast-1.amazonaws.com" -v changyy-vault -c
INFO Using end point: https://glacier.ap-northeast-1.amazonaws.com
INFO Creating vault changyy-vault...
INFO Vault changyy-vault created. {Location: /##########/vaults/changyy-vault}
LastInventoryDate: null
NumberOfArchives: 0
SizeInBytes: 0
VaultARN: arn:aws:glacier:ap-northeast-1:##########:vaults/changyy-vault
VaultName: changyy-vault
Upload a file (the value after "archive" on the last line is the ID of that piece of data):
$ java -jar uploader-0.0.8-SNAPSHOT-jar-with-dependencies.jar -e "https://glacier.ap-northeast-1.amazonaws.com" -v changyy-vault --upload ~/uploader-0.0.8-SNAPSHOT-jar-with-dependencies.jar
INFO Using end point: https://glacier.ap-northeast-1.amazonaws.com
INFO Starting to upload $HOME/uploader-0.0.8-SNAPSHOT-jar-with-dependencies.jar to vault changyy-vault...
INFO Uploaded archive ########################################
Split (multipart) upload, here in 128MB parts, which suits very large files (the value after "Archive ID" is the ID of that piece of data):
$ java -jar uploader-0.0.8-SNAPSHOT-jar-with-dependencies.jar -e "https://glacier.ap-northeast-1.amazonaws.com" -v changyy-vault --multipartupload ~/TargetBigFile --partsize 134217728
INFO Using end point: https://glacier.ap-northeast-1.amazonaws.com/
INFO Multipart uploading TargetBigFile to vault changyy-vault with part size 134217728 (128.00MB).
INFO Upload ID (token): ################################################
INFO Part 1/187 (bytes 0-134217727/*) uploaded, checksum: e19319be5e3c5d3f45a1ce7ef9ab3644b6933ec01c0754285babf45eb46b5b0b
...
INFO Part 187/187 (bytes 24964497408-24993715534/*) uploaded, checksum: d0219f53b4f54431495211bfd8880fe52597354acc51aa077dcb67cacee69f53
INFO Uploaded Archive ID: ################################################
INFO Local Checksum: a1500723e11892cc2bb297d5d6f97a08035e30810ae0e3342184fbed4e2c2d5b
INFO Remote Checksum: a1500723e11892cc2bb297d5d6f97a08035e30810ae0e3342184fbed4e2c2d5b
INFO Checksums are identical, upload succeeded.
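The part numbers and byte ranges in that log can be reproduced with a little arithmetic. Here is a minimal sketch; the function names are my own, and the total size of 24993715535 bytes is inferred from the last byte offset in the log rather than stated anywhere. It also checks the Glacier rule that a part size must be 1MB times a power of two, up to 4GB.

```python
MIB = 1024 * 1024


def is_valid_part_size(part_size: int) -> bool:
    """Glacier part sizes are 1 MiB times a power of two, from 1 MiB to 4 GiB."""
    if part_size < MIB or part_size > 4096 * MIB or part_size % MIB:
        return False
    mib = part_size // MIB
    return mib & (mib - 1) == 0  # true only for powers of two


def part_ranges(total_size: int, part_size: int):
    """Yield (part_number, first_byte, last_byte) with inclusive byte offsets."""
    if not is_valid_part_size(part_size):
        raise ValueError(f"unsupported part size: {part_size}")
    number = 0
    for start in range(0, total_size, part_size):
        number += 1
        end = min(start + part_size, total_size) - 1
        yield number, start, end


if __name__ == "__main__":
    # Total size inferred from the log above (last byte offset + 1).
    parts = list(part_ranges(24993715535, 134217728))
    print("parts:", len(parts))
    print("first:", parts[0])
    print("last: ", parts[-1])
```

With those inputs the sketch yields 187 parts, the first covering bytes 0-134217727 and the last covering bytes 24964497408-24993715534, which matches the log.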
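However, you cannot download right after uploading :P The same goes for listing files: you first have to submit a "list inventory" job (which gives you a Job ID), and only after the job finishes can you fetch the result (the file list).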
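Since every slow operation follows this "submit, wait, fetch" pattern, the waiting part can be scripted. Below is a minimal sketch of a polling loop. `wait_for_job` and its parameters are names I made up, and the `is_completed` callable is a stand-in: in practice it might rerun the query command with the Job ID or call an SDK status check, neither of which is shown here.

```python
import time
from typing import Callable


def wait_for_job(
    is_completed: Callable[[], bool],
    interval_seconds: float = 900.0,
    max_checks: int = 48,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until is_completed() returns True; return False if we give up."""
    for attempt in range(max_checks):
        if is_completed():
            return True
        # Do not sleep after the final check, we are giving up anyway.
        if attempt < max_checks - 1:
            sleep(interval_seconds)
    return False


if __name__ == "__main__":
    # Stand-in for a real status check: reports "done" on the third call.
    answers = iter([False, False, True])
    done = wait_for_job(lambda: next(answers), interval_seconds=0.0)
    print("job finished:", done)
```

The defaults (one check every 15 minutes, 48 checks, so about 12 hours) are arbitrary guesses for jobs that take hours; adjust them to taste.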
Submit a "list Vault files" request:
$ java -jar uploader-0.0.8-SNAPSHOT-jar-with-dependencies.jar -e "https://glacier.ap-northeast-1.amazonaws.com" -v changyy-vault -l
INFO Using end point: https://glacier.ap-northeast-1.amazonaws.com
INFO Starting inventory listing for vault changyy-vault...
INFO Inventory Job created with ID
8KDBk2AS_9bYC8dIOBHxjitqxaLhEklXPfU6jZO-t-su3cp1k3NaHFIUpFaBiJDXDGFyzYyqaw-3MboGNlJ2W6kKDzmt
If the Vault was only just created, you will also see an error like: vaults/changyy-vault cannot be initiated yet, as Amazon Glacier has not yet generated an initial inventory for this vault.
Fetch the result of the "list Vault files" request:
$ java -jar uploader-0.0.8-SNAPSHOT-jar-with-dependencies.jar -e "https://glacier.ap-northeast-1.amazonaws.com" -v changyy-vault -l 8KDBk2AS_9bYC8dIOBHxjitqxaLhEklXPfU6jZO-t-su3cp1k3NaHFIUpFaBiJDXDGFyzYyqaw-3MboGNlJ2W6kKDzmt
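If the job has not finished, you get the error: ERROR The job is not currently available for download. Once it has finished, the list is shown. With this Java tool, Description is filled in automatically with the uploaded file name: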
ARN: arn:aws:glacier:ap-northeast-1:##############:vaults/changyy-vault
------------------------------------------------------------------------------
Description: uploader-0.0.8-SNAPSHOT-jar-with-dependencies.jar
Archive ID: q_SfW7MNmTE1_9xBbmzP5MvnEGYYmF8wCIe2aYs4_7NAXjn8fEO4nl97QZ-deJ_hDsKni7n5z0avn8gEdAnFfzMV4xE9FlF2Fr3UualyZj0b4LNSq9cENWYWoueSma9Kq8zGuwA9IA
CreationDate: 2015-02-07T11:17:36Z
Size: 19496656 (18.60MB)
SHA: d07cddbcbe3a83dba2b4ca654760bba4b77f92ae1ecc9f1fbffad337730fece0
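The SHA value here, like the Local/Remote Checksum lines in the multipart upload log, is presumably Glacier's SHA-256 tree hash rather than a plain SHA-256 of the file. A minimal sketch of that algorithm follows, assuming the whole file fits in memory (a real tool would stream it): hash each 1MB chunk, then hash pairs of digests together until one remains.

```python
import hashlib

MIB = 1024 * 1024


def tree_hash(data: bytes) -> str:
    """SHA-256 tree hash: hash 1 MiB chunks, then combine digests pairwise."""
    # Empty input is treated as a single empty chunk.
    chunks = [data[i:i + MIB] for i in range(0, len(data), MIB)] or [b""]
    level = [hashlib.sha256(chunk).digest() for chunk in chunks]
    while len(level) > 1:
        next_level = []
        for i in range(0, len(level), 2):
            if i + 1 < len(level):
                pair = level[i] + level[i + 1]
                next_level.append(hashlib.sha256(pair).digest())
            else:
                # An odd digest at the end is promoted unchanged.
                next_level.append(level[i])
        level = next_level
    return level[0].hex()


if __name__ == "__main__":
    # For data of 1 MiB or less the tree hash equals the plain SHA-256.
    print(tree_hash(b"hello glacier"))
```

Running it on a downloaded file and comparing against the SHA in the inventory list would be one way to verify a download, though I have not checked it against this particular archive.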
Download a file:
$ java -jar uploader-0.0.8-SNAPSHOT-jar-with-dependencies.jar -e "https://glacier.ap-northeast-1.amazonaws.com" -v changyy-vault --download q_SfW7MNmTE1_9xBbmzP5MvnEGYYmF8wCIe2aYs4_7NAXjn8fEO4nl97QZ-deJ_hDsKni7n5z0avn8gEdAnFfzMV4xE9FlF2Fr3UualyZj0b4LNSq9cENWYWoueSma9Kq8zGuwA9IA --target /tmp/test.jar
INFO Using end point: https://glacier.ap-northeast-1.amazonaws.com
INFO Downloading archive q_SfW7MNmTE1_9xBbmzP5MvnEGYYmF8wCIe2aYs4_7NAXjn8fEO4nl97QZ-deJ_hDsKni7n5z0avn8gEdAnFfzMV4xE9FlF2Fr3UualyZj0b4LNSq9cENWYWoueSma9Kq8zGuwA9IA from vault changyy-vault...
INFO Archive downloaded to /tmp/test.jar
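The download does not start right away Orz For example, when I downloaded a 52MB file, the whole process took 245 minutes... It is definitely not that the transfer speed was slow; the preparation step simply makes you wait a good while.
Delete a file: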
$ java -jar uploader-0.0.8-SNAPSHOT-jar-with-dependencies.jar -e "https://glacier.ap-northeast-1.amazonaws.com" -v changyy-vault --delete q_SfW7MNmTE1_9xBbmzP5MvnEGYYmF8wCIe2aYs4_7NAXjn8fEO4nl97QZ-deJ_hDsKni7n5z0avn8gEdAnFfzMV4xE9FlF2Fr3UualyZj0b4LNSq9cENWYWoueSma9Kq8zGuwA9IA
INFO Using end point: https://glacier.ap-northeast-1.amazonaws.com
INFO Deleting archive q_SfW7MNmTE1_9xBbmzP5MvnEGYYmF8wCIe2aYs4_7NAXjn8fEO4nl97QZ-deJ_hDsKni7n5z0avn8gEdAnFfzMV4xE9FlF2Fr3UualyZj0b4LNSq9cENWYWoueSma9Kq8zGuwA9IA from vault changyy-vault...
INFO Archive q_SfW7MNmTE1_9xBbmzP5MvnEGYYmF8wCIe2aYs4_7NAXjn8fEO4nl97QZ-deJ_hDsKni7n5z0avn8gEdAnFfzMV4xE9FlF2Fr3UualyZj0b4LNSq9cENWYWoueSma9Kq8zGuwA9IA deletion started from vault changyy-vault.
Delete the Vault:
$ java -jar uploader-0.0.8-SNAPSHOT-jar-with-dependencies.jar -e "https://glacier.ap-northeast-1.amazonaws.com" -v changyy-vault --delete-vault
INFO Using end point: https://glacier.ap-northeast-1.amazonaws.com
INFO Deleting vault changyy-vault...
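If the vault still contains files, you get the error: Vault not empty or recently written to: arn:aws:glacier:ap-northeast-1:############:vaults/changyy-vault. Moreover, deleting the files first and then running this right away hits the same problem, because this is a glacier after all XD Deleting files is slow too.
Overall impression: AWS Glacier is really tedious to operate because it is so slow. You also have to keep track of every file's Archive ID and every job's Job ID, or you cannot continue the work later. If you are interested, you can also try a GUI client such as CrossFTP. When you list files it tells you the result will take several hours (>5 hours); the one advantage is probably that CrossFTP remembers some of the Job IDs for you.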
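Because losing an Archive ID means waiting hours for an inventory job, it may be worth writing each one down the moment an upload finishes. Here is a minimal sketch of a local ledger kept as a JSON Lines file; the file name, field names, and both helper functions are my own invention, not part of any tool mentioned in this post.

```python
import json
import tempfile
from pathlib import Path


def record_archive(ledger: Path, vault: str, archive_id: str, description: str) -> None:
    """Append one upload record to a JSON Lines ledger file."""
    entry = {"vault": vault, "archive_id": archive_id, "description": description}
    with ledger.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(entry, ensure_ascii=False) + "\n")


def find_archives(ledger: Path, description: str) -> list:
    """Return the archive IDs whose description matches exactly."""
    if not ledger.exists():
        return []
    with ledger.open(encoding="utf-8") as fh:
        entries = [json.loads(line) for line in fh if line.strip()]
    return [e["archive_id"] for e in entries if e["description"] == description]


if __name__ == "__main__":
    # Demo in a temporary directory; "EXAMPLE-ARCHIVE-ID" is a placeholder.
    with tempfile.TemporaryDirectory() as tmp:
        ledger = Path(tmp) / "glacier-ledger.jsonl"
        record_archive(ledger, "changyy-vault", "EXAMPLE-ARCHIVE-ID", "TargetBigFile")
        print(find_archives(ledger, "TargetBigFile"))
```

Appending one line per upload keeps the file easy to grep and hard to corrupt, which seemed a reasonable trade-off for something this small.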