Deploying AI21 Labs' Jurassic-2 Ultra on SageMaker

June 22, 2023 · about 3 min read

moritalous

Maintainer of this blog

This post walks through deploying AI21 Jurassic-2 Ultra to a SageMaker endpoint.

Subscribing to AI21 Jurassic-2 Ultra

Go to https://aws.amazon.com/jp/sagemaker/jumpstart and click "Amazon SageMaker JumpStart の使用を開始する" (Get started with Amazon SageMaker JumpStart).
Search for "AI21" and click AI21 Jurassic-2 Ultra.
Click Continue to Subscribe and activate the subscription.

Creating the inference endpoint

With the model subscription in place, the next step is to create an inference endpoint.

Using SageMaker Studio is probably the standard approach, but I ran everything from a notebook in VS Code on my local machine.

Reference: https://github.com/AI21Labs/SageMaker

Installing the library

Python
! pip install -U "ai21[AWS]"
import ai21

Importing the libraries

Python
import json
from sagemaker import ModelPackage
import boto3

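# Note: despite the variable name, "sagemaker" is the control-plane client.
# It is not used by the code in this post.
runtime_sm_client = boto3.client("sagemaker")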

Specifying the region, model ARN, and other settings

Python
region = boto3.Session().region_name
model_package_arn = 'arn:aws:sagemaker:us-east-1:865070037744:model-package/j2-ultra-v1-1-053-65756ea489973147b387b960b7f5b02d'

endpoint_name = "j2-ultra"

content_type = "application/json"

real_time_inference_instance_type = (
    # "ml.p4d.24xlarge"    # Recommended instance
    "ml.g5.48xlarge"   # Cheaper and faster - recommended for relatively short inputs/outputs
    # "ml.p4de.24xlarge" # Recommended for long inputs/outputs and faster performance
)
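
The settings above read the session region into `region`, but the model package ARN is hard-coded to us-east-1. A minimal sketch of a sanity check, assuming the standard ARN layout `arn:partition:service:region:account:resource` and assuming the endpoint should be created in the same region as the model package. The helper names `arn_region` and `check_same_region` are hypothetical and not part of the AI21 or SageMaker libraries.

```python
def arn_region(arn: str) -> str:
    """Return the region field of an ARN (arn:partition:service:region:account:resource)."""
    parts = arn.split(":", 5)
    if len(parts) < 6 or parts[0] != "arn":
        raise ValueError(f"not a valid ARN: {arn!r}")
    return parts[3]


def check_same_region(arn: str, session_region: str) -> None:
    """Raise if the ARN's region differs from the session region (hypothetical helper)."""
    found = arn_region(arn)
    if found != session_region:
        raise RuntimeError(
            f"model package is in {found}, but the session region is {session_region}"
        )


model_package_arn = (
    "arn:aws:sagemaker:us-east-1:865070037744:model-package/"
    "j2-ultra-v1-1-053-65756ea489973147b387b960b7f5b02d"
)

# In the notebook, pass boto3.Session().region_name instead of the literal.
check_same_region(model_package_arn, "us-east-1")
```

If the session is configured for another region, the check fails early, before any instance is launched.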

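Note

ml.p4d.24xlarge appears to be the recommended instance type.

For reference, the specs are listed below. These are monster-class machines, in a league of their own!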

Instance type	vCPU	Memory (GiB)	GPUs	GPU memory (GB)	Price (/h)	Price (/720 h)
ml.g5.48xlarge	192	768	8	192	$20.3600	$14,659.2000
ml.p4d.24xlarge	96	1152	8	320	$37.6885	$27,135.7200
ml.p4de.24xlarge	96	1152	8	640	$47.1106	$33,919.6320

At 120 yen to the dollar, ml.p4de.24xlarge comes to more than 4 million yen per month 😂
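
The monthly figures can be reproduced with a few lines of arithmetic. This is a small sketch using the hourly prices from the table, a 720-hour month, and the 120 yen per dollar rate assumed in this post. The function name `monthly_cost` is made up for illustration.

```python
HOURS_PER_MONTH = 720  # 30 days x 24 hours, as in the table
YEN_PER_DOLLAR = 120   # exchange rate assumed in this post

# Hourly on-demand prices in USD, copied from the table
hourly_usd = {
    "ml.g5.48xlarge": 20.3600,
    "ml.p4d.24xlarge": 37.6885,
    "ml.p4de.24xlarge": 47.1106,
}


def monthly_cost(price_per_hour, hours=HOURS_PER_MONTH, yen_per_dollar=YEN_PER_DOLLAR):
    """Return (monthly cost in USD, monthly cost in JPY) for one instance."""
    usd = price_per_hour * hours
    return usd, usd * yen_per_dollar


for name, price in hourly_usd.items():
    usd, jpy = monthly_cost(price)
    print(f"{name}: ${usd:,.2f} per month, about {jpy:,.0f} yen")
```

For ml.p4de.24xlarge this works out to 47.1106 × 720 = $33,919.632, or roughly 4.07 million yen.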

Creating the endpoint

Python
# Create a deployable model from the model package.
model = ModelPackage(
    role='arn:aws:iam::xxxxx:role/service-role/AmazonSageMaker-ExecutionRole-20180403T230664',
    model_package_arn=model_package_arn,
)

# Deploy the model to a real-time endpoint.
predictor = model.deploy(
    initial_instance_count=1,
    instance_type=real_time_inference_instance_type,
    endpoint_name=endpoint_name,
    model_data_download_timeout=3600,            # seconds allowed for downloading the model data
    container_startup_health_check_timeout=600,  # seconds allowed for the container health check
)

Billing starts as soon as the endpoint is created!

Running inference

The API call goes through the AI21 library.

Python
response = ai21.Completion.execute(destination=ai21.SageMakerDestination(endpoint_name),
                                   prompt="To be, or",
                                   maxTokens=4,
                                   temperature=0,
                                   numResults=1)

print(response['completions'][0]['data']['text'])

Output

 not to be
That is the question
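
The settings section defines `content_type` and imports `json`, neither of which the `ai21` call needs. They hint at the other route: calling the endpoint directly through the SageMaker runtime API. The following is only a sketch under stated assumptions. It assumes the endpoint accepts a JSON body with the same parameter names passed to `ai21.Completion.execute` above and returns the same response structure. The helper name `complete` is hypothetical.

```python
import json


def complete(runtime_client, endpoint_name, prompt,
             max_tokens=4, temperature=0, num_results=1):
    """Send one completion request via invoke_endpoint and return the first text."""
    # Assumption: the request body mirrors the ai21.Completion.execute parameters.
    payload = {
        "prompt": prompt,
        "maxTokens": max_tokens,
        "temperature": temperature,
        "numResults": num_results,
    }
    response = runtime_client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",
        Accept="application/json",
        Body=json.dumps(payload),
    )
    # The response body is a stream; read it and parse the JSON.
    result = json.loads(response["Body"].read())
    # Assumption: same response layout as the ai21 library call.
    return result["completions"][0]["data"]["text"]


# Usage in the notebook (requires AWS credentials and a running endpoint):
#   import boto3
#   runtime = boto3.client("sagemaker-runtime")
#   print(complete(runtime, "j2-ultra", "To be, or"))
```

The client is passed in as an argument so the function can be exercised with a stub before any billable endpoint exists.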

Delete the endpoint when you're done!
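
One way to clean up is with the boto3 SageMaker client, sketched below. The helper name `delete_endpoint_resources` is hypothetical. It looks up the endpoint configuration name first, then deletes the endpoint and that configuration. It does not delete the model resource or cancel the Marketplace subscription.

```python
def delete_endpoint_resources(sm_client, endpoint_name):
    """Delete an endpoint and its endpoint configuration; return the config name."""
    # Look up the config name before the endpoint is gone.
    description = sm_client.describe_endpoint(EndpointName=endpoint_name)
    config_name = description["EndpointConfigName"]

    # Deleting the endpoint is what releases the instance.
    sm_client.delete_endpoint(EndpointName=endpoint_name)
    # The configuration is a separate resource and is removed separately.
    sm_client.delete_endpoint_config(EndpointConfigName=config_name)
    return config_name


# Usage in the notebook (requires AWS credentials):
#   import boto3
#   sm = boto3.client("sagemaker")
#   delete_endpoint_resources(sm, "j2-ultra")
```

Afterwards, it is worth confirming in the SageMaker console that no endpoint remains listed.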
