{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "9fb44f6d",
   "metadata": {
    "tags": []
   },
   "source": [
    "# Introduction to SageMaker JumpStart - Text Generation with Mistral models\n",
    "\n",
    "---\n",
    "In this demo notebook, we demonstrate how to use the SageMaker Python SDK to fine-tune and deploy [Mistral 7B](https://huggingface.co/mistralai/Mistral-7B-v0.1) models for text generation. We cover two types of fine-tuning: instruction fine-tuning and domain adaptation fine-tuning.\n",
    "\n",
    "---"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e5c76151-390e-4592-a192-a857292e4e53",
   "metadata": {},
   "source": [
    "This notebook is organized as follows.\n",
    "\n",
    "1. [Instruction fine-tuning](#1.-Instruction-fine-tuning)\n",
    "   * [1.1. Preparing training data](#1.1.-Preparing-training-data)\n",
    "   * [1.2. Prepare training parameters](#1.2.-Prepare-training-parameters)\n",
    "   * [1.3. Starting training](#1.3.-Starting-training)\n",
    "   * [1.4. Deploying inference endpoints](#1.4.-Deploying-inference-endpoints)\n",
    "   * [1.6. Clean up the endpoint](#1.6.-Clean-up-the-endpoint)\n",
    "2. [Domain adaptation fine-tuning](#2.-Domain-adaptation-fine-tuning)\n",
    "   * [2.1. Preparing training data](#2.1.-Preparing-training-data)\n",
    "   * [2.2. Prepare training parameters](#2.2.-Prepare-training-parameters)\n",
    "   * [2.3. Starting training](#2.3.-Starting-training)\n",
    "   * [2.4. Deploying inference endpoints](#2.4.-Deploying-inference-endpoints)\n",
    "   * [2.5. Running inference queries and compare model performances](#2.5.-Running-inference-queries-and-compare-model-performances)\n",
    "   * [2.6. Clean up endpoint](#2.6.-Clean-up-the-endpoint)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "066d7593-7300-4ed6-8f78-63c4b889da72",
   "metadata": {},
   "source": [
    "Install the latest SageMaker Python SDK and the other dependencies."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "9b05b931-992e-4526-978d-f03196874a3b",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "!pip install sagemaker --quiet --upgrade --force-reinstall\n",
    "!pip install ipywidgets==7.0.0 --quiet\n",
    "!pip install datasets --quiet"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "3460e23d-7011-4639-94a2-c55f53a035c9",
   "metadata": {},
   "source": [
    "## 1. Instruction fine-tuning\n",
    "\n",
    "Now, we demonstrate how to instruction-tune the `huggingface-llm-mistral-7b` model for a new task. Mistral-7B-v0.1 is a pretrained generative text model with 7 billion parameters. According to its model card, Mistral-7B-v0.1 outperforms Llama 2 13B on all benchmarks its authors tested. For details, see its [Hugging Face model page](https://huggingface.co/mistralai/Mistral-7B-v0.1)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "19d90321-54c7-4572-a3f8-bf1810cfd112",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "model_id, model_version = \"huggingface-llm-mistral-7b\", \"*\""
   ]
  },
  {
   "cell_type": "markdown",
   "id": "aba27120-e9ca-4408-b0f0-80ef5a1a5531",
   "metadata": {},
   "source": [
    "### 1.1. Preparing training data\n",
    "\n",
    "You can fine-tune on a dataset in either the domain adaptation format or the instruction tuning format. In this section, we use a subset of the [Dolly dataset](https://huggingface.co/datasets/databricks/databricks-dolly-15k) in the instruction tuning format. The Dolly dataset contains roughly 15,000 instruction-following records in categories such as question answering, summarization, and information extraction. It is available under the CC BY-SA 3.0 license. We select the summarization examples for fine-tuning.\n",
    "\n",
    "Training data is stored in JSON Lines (.jsonl) format, where each line is a dictionary representing a single data sample. All training data must be in a single folder, but it can be split across multiple .jsonl files. The training folder can also contain a `template.json` file describing the input and output formats."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "976cb8a9-be2f-436b-9444-a52fd0caf518",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "import boto3\n",
    "import sagemaker\n",
    "import json\n",
    "\n",
    "# Get current region, role, and default bucket\n",
    "aws_region = boto3.Session().region_name\n",
    "aws_role = sagemaker.session.Session().get_caller_identity_arn()\n",
    "output_bucket = sagemaker.Session().default_bucket()\n",
    "\n",
    "# This will be useful for printing\n",
    "newline, bold, unbold = \"\\n\", \"\\033[1m\", \"\\033[0m\"\n",
    "\n",
    "print(f\"{bold}aws_region:{unbold} {aws_region}\")\n",
    "print(f\"{bold}aws_role:{unbold} {aws_role}\")\n",
    "print(f\"{bold}output_bucket:{unbold} {output_bucket}\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "01e61aa5-e2cd-4cec-902e-5a4b94bb5817",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "from datasets import load_dataset\n",
    "\n",
    "dolly_dataset = load_dataset(\"databricks/databricks-dolly-15k\", split=\"train\")\n",
    "\n",
    "# To train for question answering or information extraction instead, change the filter condition\n",
    "# on the next line to example[\"category\"] == \"closed_qa\" or \"information_extraction\".\n",
    "summarization_dataset = dolly_dataset.filter(lambda example: example[\"category\"] == \"summarization\")\n",
    "summarization_dataset = summarization_dataset.remove_columns(\"category\")\n",
    "\n",
    "# Split the dataset in two; the test split is held out for evaluation at the end.\n",
    "train_and_test_dataset = summarization_dataset.train_test_split(test_size=0.1)\n",
    "\n",
    "# Dumping the training data to a local file to be used for training.\n",
    "train_and_test_dataset[\"train\"].to_json(\"train.jsonl\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "8c7609e4-71f6-477d-9ef8-e653185850b4",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "train_and_test_dataset[\"train\"][0]"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "2c061fc5-e11b-4a72-8726-00ab32452769",
   "metadata": {},
   "source": [
    "The training data must be in JSON Lines (.jsonl) format, where each line is a dictionary representing a single data sample. All training data must be in a single folder, but it can be split across multiple .jsonl files. The .jsonl file extension is mandatory. The training folder can also contain a `template.json` file describing the input and output formats.\n",
    "\n",
    "If no template file is given, the following default template will be used:\n",
    "\n",
    "```json\n",
    "{\n",
    "    \"prompt\": \"Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.\\n\\n### Instruction:\\n{instruction}\\n\\n### Input:\\n{context}\",\n",
    "    \"completion\": \"{response}\"\n",
    "}\n",
    "```\n",
    "\n",
    "In this case, the data in the JSON lines entries must include `instruction`, `context`, and `response` fields.\n",
    "\n",
    "Instead of relying on the default prompt template, this demo uses a custom template (see below)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "30df5174-667c-44ea-ad20-47c44ca0e628",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "import json\n",
    "\n",
    "template = {\n",
    "    \"prompt\": \"Below is an instruction that describes a task, paired with an input that provides further context. \"\n",
    "    \"Write a response that appropriately completes the request.\\n\\n\"\n",
    "    \"### Instruction:\\n{instruction}\\n\\n### Input:\\n{context}\\n\\n\",\n",
    "    \"completion\": \" {response}\",\n",
    "}\n",
    "with open(\"template.json\", \"w\") as f:\n",
    "    json.dump(template, f)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "8cbdecf3-cb15-4ef9-ab6c-ebe9c43aab1d",
   "metadata": {},
   "source": [
    "Given the prompt template defined in the cell above, each entry in the `train.jsonl` file must include **`instruction`**, **`context`**, and **`response`** fields. The Dolly records written to `train.jsonl` earlier already contain these fields, so no further reformatting is needed."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "d28b4216-cd79-4e6a-87a8-0fbcdb4260e2",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "from sagemaker.s3 import S3Uploader\n",
    "import sagemaker\n",
    "import random\n",
    "\n",
    "output_bucket = sagemaker.Session().default_bucket()\n",
    "default_bucket_prefix = sagemaker.Session().default_bucket_prefix\n",
    "\n",
    "# If a default bucket prefix is specified, append it to the s3 path\n",
    "if default_bucket_prefix:\n",
    "    train_data_location = f\"s3://{output_bucket}/{default_bucket_prefix}/dolly_dataset_mistral\"\n",
    "else:\n",
    "    train_data_location = f\"s3://{output_bucket}/dolly_dataset_mistral\"\n",
    "\n",
    "local_data_file = \"train.jsonl\"\n",
    "S3Uploader.upload(local_data_file, train_data_location)\n",
    "S3Uploader.upload(\"template.json\", train_data_location)\n",
    "print(f\"Training data: {train_data_location}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "12a170ae-b6d4-45a2-ade7-4595e97a5ad8",
   "metadata": {},
   "source": [
    "The cell above uploads the prompt template (`template.json`) and the training data (`train.jsonl`) to the S3 bucket."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c8f9bf21-1d0f-4ea4-a71a-57b2a341219a",
   "metadata": {},
   "source": [
    "### 1.2. Prepare training parameters"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "56aa35d5-4211-4976-929a-11c52edd62f6",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "from sagemaker import hyperparameters\n",
    "\n",
    "my_hyperparameters = hyperparameters.retrieve_default(\n",
    "    model_id=model_id, model_version=model_version\n",
    ")\n",
    "print(my_hyperparameters)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "12c74a6e-b971-4a41-a8a5-ffd5fa9b2e29",
   "metadata": {},
   "source": [
    "Override the default hyperparameters. **Note: you can select the LoRA method for fine-tuning by setting `peft_type` to `lora` in the hyperparameters.**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "8698cf73-7fe4-4d50-9fe1-296a43a8592a",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "my_hyperparameters[\"epoch\"] = \"1\"\n",
    "my_hyperparameters[\"per_device_train_batch_size\"] = \"2\"\n",
    "my_hyperparameters[\"gradient_accumulation_steps\"] = \"2\"\n",
    "my_hyperparameters[\"instruction_tuned\"] = \"True\"\n",
    "print(my_hyperparameters)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "6d930ae7",
   "metadata": {},
   "source": [
    "Validate hyperparameters"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "d244d372",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "hyperparameters.validate(\n",
    "    model_id=model_id, model_version=model_version, hyperparameters=my_hyperparameters\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "9d147475-329c-4caf-90c7-da6eb622c021",
   "metadata": {},
   "source": [
    "### 1.3. Starting training"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "29866d1a-5e3f-4867-aae6-6d024cd608ea",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "from sagemaker.jumpstart.estimator import JumpStartEstimator\n",
    "\n",
    "instruction_tuned_estimator = JumpStartEstimator(\n",
    "    model_id=model_id,\n",
    "    hyperparameters=my_hyperparameters,\n",
    "    instance_type=\"ml.g5.12xlarge\",\n",
    ")\n",
    "instruction_tuned_estimator.fit({\"train\": train_data_location}, logs=True)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "6bddbc6e-1512-4a71-9247-76cf03e9bf4b",
   "metadata": {},
   "source": [
    "Extract the training performance metrics. Metrics such as training loss and validation accuracy/loss can be monitored in CloudWatch while the training job runs. We can also fetch these metrics and analyze them within the notebook."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "516c2488-b3cf-4b53-954c-aa409969c9fd",
   "metadata": {},
   "outputs": [],
   "source": [
    "from sagemaker import TrainingJobAnalytics\n",
    "\n",
    "training_job_name = instruction_tuned_estimator.latest_training_job.job_name\n",
    "\n",
    "df = TrainingJobAnalytics(training_job_name=training_job_name).dataframe()\n",
    "df.head(10)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "340745b6-ae16-4c7e-83fc-7be0ffd338e3",
   "metadata": {},
   "source": [
    "### 1.4. Deploying inference endpoints"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "3e0011f1-aec4-4ca2-8e7d-0a66798713c1",
   "metadata": {},
   "outputs": [],
   "source": [
    "instruction_tuned_predictor = instruction_tuned_estimator.deploy()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f52a29dc-a10b-40e9-a77d-3f71858aa589",
   "metadata": {},
   "source": [
    "Note: for the Dolly dataset, we observe that the fine-tuned model performs about as well as the pre-trained model. This is likely because Mistral 7B has already learned knowledge in this domain. The code above simply demonstrates how to instruction-tune such a model. For your own use case, replace the example Dolly dataset with your own data."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "97e51104-2f8d-45d1-827d-4b955546d383",
   "metadata": {},
   "source": [
    "### 1.6. Clean up the endpoint"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ac39f216-5866-47dd-a264-53f677e7eb9c",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Delete the SageMaker endpoint\n",
    "instruction_tuned_predictor.delete_model()\n",
    "instruction_tuned_predictor.delete_endpoint()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "aaed71e3-8fa9-4992-97d4-e038f834bf41",
   "metadata": {},
   "source": [
    "## 2. Domain adaptation fine-tuning\n",
    "\n",
    "Domain adaptation fine-tuning is also enabled for Mistral models. Unlike instruction fine-tuning, it does not require an instruction-formatted dataset: you can directly use unstructured text documents, as demonstrated below. However, a domain-adaptation fine-tuned model may not give responses as concise as those of an instruction-tuned model, because the requirements on the training data format are less restrictive."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "1cf488a2-956a-4e70-8bdd-d0fd05646603",
   "metadata": {},
   "source": [
    "We will use financial text from SEC filings to fine-tune the Mistral 7B model for financial applications.\n",
    "\n",
    "Here are the requirements for the train and validation data.\n",
    "\n",
    "- **Input**: A train and an optional validation directory. Each directory contains a CSV/JSON/TXT file.\n",
    "    - For CSV/JSON files, the training or validation data is read from the column named 'text', or from the first column if no column named 'text' is found.\n",
    "    - Each of the train and validation (if provided) directories must contain exactly one file.\n",
    "- **Output**: A trained model that can be deployed for inference.\n",
    "\n",
    "Below is an example of a TXT file for fine-tuning the text generation model. The TXT file contains Amazon's SEC filings from 2021 to 2022.\n",
    "\n",
    "---\n",
    "```\n",
    "This report includes estimates, projections, statements relating to our\n",
    "business plans, objectives, and expected operating results that are “forward-\n",
    "looking statements” within the meaning of the Private Securities Litigation\n",
    "Reform Act of 1995, Section 27A of the Securities Act of 1933, and Section 21E\n",
    "of the Securities Exchange Act of 1934. Forward-looking statements may appear\n",
    "throughout this report, including the following sections: “Business” (Part I,\n",
    "Item 1 of this Form 10-K), “Risk Factors” (Part I, Item 1A of this Form 10-K),\n",
    "and “Management’s Discussion and Analysis of Financial Condition and Results\n",
    "of Operations” (Part II, Item 7 of this Form 10-K). These forward-looking\n",
    "statements generally are identified by the words “believe,” “project,”\n",
    "“expect,” “anticipate,” “estimate,” “intend,” “strategy,” “future,”\n",
    "“opportunity,” “plan,” “may,” “should,” “will,” “would,” “will be,” “will\n",
    "continue,” “will likely result,” and similar expressions. Forward-looking\n",
    "statements are based on current expectations and assumptions that are subject\n",
    "to risks and uncertainties that may cause actual results to differ materially.\n",
    "We describe risks and uncertainties that could cause actual results and events\n",
    "to differ materially in “Risk Factors,” “Management’s Discussion and Analysis\n",
    "of Financial Condition and Results of Operations,” and “Quantitative and\n",
    "Qualitative Disclosures about Market Risk” (Part II, Item 7A of this Form\n",
    "10-K). Readers are cautioned not to place undue reliance on forward-looking\n",
    "statements, which speak only as of the date they are made. We undertake no\n",
    "obligation to update or revise publicly any forward-looking statements,\n",
    "whether because of new information, future events, or otherwise.\n",
    "\n",
    "...\n",
    "```\n",
    "---\n",
    "The SEC filings data for Amazon was downloaded from the publicly available [EDGAR](https://www.sec.gov/edgar/searchedgar/companysearch) database. Instructions for accessing the data are available [here](https://www.sec.gov/os/accessing-edgar-data)."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "95d136eb-47d1-4555-bc4b-3a39d1ef2b6c",
   "metadata": {},
   "source": [
    "### 2.1. Preparing training data\n",
    "\n",
    "The Amazon SEC filing training data is already stored in the JumpStart content S3 bucket."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b3c71845-58e1-42ca-891f-b22e44c911d2",
   "metadata": {},
   "outputs": [],
   "source": [
    "from sagemaker.jumpstart.utils import get_jumpstart_content_bucket\n",
    "\n",
    "# Sample training data is available in this bucket\n",
    "data_bucket = get_jumpstart_content_bucket(aws_region)\n",
    "data_prefix = \"training-datasets/sec_data\"\n",
    "\n",
    "training_dataset_s3_path = f\"s3://{data_bucket}/{data_prefix}/train/\"\n",
    "validation_dataset_s3_path = f\"s3://{data_bucket}/{data_prefix}/validation/\""
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4c322e99-d9d3-47b2-8489-0cfff08a1f9d",
   "metadata": {},
   "source": [
    "### 2.2. Prepare training parameters\n",
    "\n",
    "We set `max_input_length` to 2048 for the `ml.g5.12xlarge` instance. You can use a larger input length on a larger instance type."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "11534a56-ae96-49e2-a8e2-2852a6dfdbf1",
   "metadata": {},
   "outputs": [],
   "source": [
    "from sagemaker import hyperparameters\n",
    "\n",
    "my_hyperparameters = hyperparameters.retrieve_default(\n",
    "    model_id=model_id, model_version=model_version\n",
    ")\n",
    "\n",
    "my_hyperparameters[\"epoch\"] = \"3\"\n",
    "my_hyperparameters[\"per_device_train_batch_size\"] = \"2\"\n",
    "my_hyperparameters[\"instruction_tuned\"] = \"False\"\n",
    "my_hyperparameters[\"max_input_length\"] = \"2048\"\n",
    "print(my_hyperparameters)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "0b262c88",
   "metadata": {},
   "source": [
    "Validate hyperparameters"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e9877c3a",
   "metadata": {},
   "outputs": [],
   "source": [
    "hyperparameters.validate(\n",
    "    model_id=model_id, model_version=model_version, hyperparameters=my_hyperparameters\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "cd26bc12-d008-4ad6-9193-57e002ab9d67",
   "metadata": {},
   "source": [
    "### 2.3. Starting training"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b64e682d-586e-464c-b28f-451d036d562e",
   "metadata": {},
   "outputs": [],
   "source": [
    "from sagemaker.jumpstart.estimator import JumpStartEstimator\n",
    "\n",
    "domain_adaptation_estimator = JumpStartEstimator(\n",
    "    model_id=model_id,\n",
    "    hyperparameters=my_hyperparameters,\n",
    "    instance_type=\"ml.g5.12xlarge\",\n",
    ")\n",
    "domain_adaptation_estimator.fit(\n",
    "    {\"train\": training_dataset_s3_path, \"validation\": validation_dataset_s3_path}, logs=True\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "03751984-5ca5-473a-bf34-8d66295c0e39",
   "metadata": {},
   "source": [
    "Extract the training performance metrics. Metrics such as training loss and validation accuracy/loss can be monitored in CloudWatch while the training job runs. We can also fetch these metrics and analyze them within the notebook."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "0960e861-635a-43c0-a60d-6f4a41c105bb",
   "metadata": {},
   "outputs": [],
   "source": [
    "from sagemaker import TrainingJobAnalytics\n",
    "\n",
    "training_job_name = domain_adaptation_estimator.latest_training_job.job_name\n",
    "\n",
    "df = TrainingJobAnalytics(training_job_name=training_job_name).dataframe()\n",
    "df.head(10)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "6f3ca2e1-d968-434e-85f4-8605350c0ed4",
   "metadata": {},
   "source": [
    "### 2.4. Deploying inference endpoints\n",
    "\n",
    "We deploy the domain-adaptation fine-tuned model and the pre-trained model separately, and then compare their outputs.\n",
    "\n",
    "First, we deploy the domain-adaptation fine-tuned model."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "da369cd9-e3ad-4981-88c9-ac68022c3251",
   "metadata": {},
   "outputs": [],
   "source": [
    "domain_adaptation_predictor = domain_adaptation_estimator.deploy()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d86815ad-ef6e-47a3-bfc3-b2bb7ab1886a",
   "metadata": {},
   "source": [
    "Next, we deploy the pre-trained `huggingface-llm-mistral-7b` model."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "70f1bf93-34d4-4a47-8d8e-76403759445a",
   "metadata": {},
   "outputs": [],
   "source": [
    "from sagemaker.jumpstart.model import JumpStartModel\n",
    "\n",
    "my_model = JumpStartModel(model_id=model_id)\n",
    "pretrained_predictor = my_model.deploy()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "16cc6024-87be-495a-a541-4f0132c306e2",
   "metadata": {},
   "source": [
    "### 2.5. Running inference queries and compare model performances"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "4bf1c732-636f-4473-8c28-3b1fcbb44453",
   "metadata": {},
   "outputs": [],
   "source": [
    "parameters = {\n",
    "    \"max_new_tokens\": 300,\n",
    "    \"top_k\": 50,\n",
    "    \"top_p\": 0.8,\n",
    "    \"do_sample\": True,\n",
    "    \"temperature\": 1,\n",
    "}\n",
    "\n",
    "\n",
    "def query_endpoint_with_json_payload(encoded_json, endpoint_name):\n",
    "    client = boto3.client(\"runtime.sagemaker\")\n",
    "    response = client.invoke_endpoint(\n",
    "        EndpointName=endpoint_name, ContentType=\"application/json\", Body=encoded_json\n",
    "    )\n",
    "    return response\n",
    "\n",
    "\n",
    "def parse_response(query_response):\n",
    "    model_predictions = json.loads(query_response[\"Body\"].read())\n",
    "    return model_predictions[0][\"generated_text\"]\n",
    "\n",
    "\n",
    "def generate_response(endpoint_name, text):\n",
    "    payload = {\"inputs\": f\"{text}:\", \"parameters\": parameters}\n",
    "    query_response = query_endpoint_with_json_payload(\n",
    "        json.dumps(payload).encode(\"utf-8\"), endpoint_name=endpoint_name\n",
    "    )\n",
    "    generated_texts = parse_response(query_response)\n",
    "    print(f\"Response: {generated_texts}{newline}\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "2db1353e-1afb-4a5a-b6c5-f406ea994476",
   "metadata": {},
   "outputs": [],
   "source": [
    "newline, bold, unbold = \"\\n\", \"\\033[1m\", \"\\033[0m\"\n",
    "\n",
    "test_paragraph_domain_adaption = [\n",
    "    \"This Form 10-K report shows that\",\n",
    "    \"We serve consumers through\",\n",
    "    \"Our vision is\",\n",
    "]\n",
    "\n",
    "\n",
    "for paragraph in test_paragraph_domain_adaption:\n",
    "    print(\"-\" * 80)\n",
    "    print(paragraph)\n",
    "    print(\"-\" * 80)\n",
    "    print(f\"{bold}pre-trained{unbold}\")\n",
    "    generate_response(pretrained_predictor.endpoint_name, paragraph)\n",
    "    print(f\"{bold}fine-tuned{unbold}\")\n",
    "    generate_response(domain_adaptation_predictor.endpoint_name, paragraph)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "9dca5bdd",
   "metadata": {},
   "source": [
    "As you can see, the fine-tuned model starts to generate responses that are more specific to the domain of the fine-tuning data, which consists of Amazon's SEC filings."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b04d99e2",
   "metadata": {},
   "source": [
    "### 2.6. Clean up the endpoint"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "6c3b60c2",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "# Delete the SageMaker endpoint\n",
    "pretrained_predictor.delete_model()\n",
    "pretrained_predictor.delete_endpoint()\n",
    "domain_adaptation_predictor.delete_model()\n",
    "domain_adaptation_predictor.delete_endpoint()"
   ]
  }
 ],
 "metadata": {
  "availableInstances": [
   {
    "_defaultOrder": 0,
    "_isFastLaunch": true,
    "category": "General purpose",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 4,
    "name": "ml.t3.medium",
    "vcpuNum": 2
   },
   {
    "_defaultOrder": 1,
    "_isFastLaunch": false,
    "category": "General purpose",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 8,
    "name": "ml.t3.large",
    "vcpuNum": 2
   },
   {
    "_defaultOrder": 2,
    "_isFastLaunch": false,
    "category": "General purpose",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 16,
    "name": "ml.t3.xlarge",
    "vcpuNum": 4
   },
   {
    "_defaultOrder": 3,
    "_isFastLaunch": false,
    "category": "General purpose",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 32,
    "name": "ml.t3.2xlarge",
    "vcpuNum": 8
   },
   {
    "_defaultOrder": 4,
    "_isFastLaunch": true,
    "category": "General purpose",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 8,
    "name": "ml.m5.large",
    "vcpuNum": 2
   },
   {
    "_defaultOrder": 5,
    "_isFastLaunch": false,
    "category": "General purpose",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 16,
    "name": "ml.m5.xlarge",
    "vcpuNum": 4
   },
   {
    "_defaultOrder": 6,
    "_isFastLaunch": false,
    "category": "General purpose",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 32,
    "name": "ml.m5.2xlarge",
    "vcpuNum": 8
   },
   {
    "_defaultOrder": 7,
    "_isFastLaunch": false,
    "category": "General purpose",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 64,
    "name": "ml.m5.4xlarge",
    "vcpuNum": 16
   },
   {
    "_defaultOrder": 8,
    "_isFastLaunch": false,
    "category": "General purpose",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 128,
    "name": "ml.m5.8xlarge",
    "vcpuNum": 32
   },
   {
    "_defaultOrder": 9,
    "_isFastLaunch": false,
    "category": "General purpose",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 192,
    "name": "ml.m5.12xlarge",
    "vcpuNum": 48
   },
   {
    "_defaultOrder": 10,
    "_isFastLaunch": false,
    "category": "General purpose",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 256,
    "name": "ml.m5.16xlarge",
    "vcpuNum": 64
   },
   {
    "_defaultOrder": 11,
    "_isFastLaunch": false,
    "category": "General purpose",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 384,
    "name": "ml.m5.24xlarge",
    "vcpuNum": 96
   },
   {
    "_defaultOrder": 12,
    "_isFastLaunch": false,
    "category": "General purpose",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 8,
    "name": "ml.m5d.large",
    "vcpuNum": 2
   },
   {
    "_defaultOrder": 13,
    "_isFastLaunch": false,
    "category": "General purpose",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 16,
    "name": "ml.m5d.xlarge",
    "vcpuNum": 4
   },
   {
    "_defaultOrder": 14,
    "_isFastLaunch": false,
    "category": "General purpose",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 32,
    "name": "ml.m5d.2xlarge",
    "vcpuNum": 8
   },
   {
    "_defaultOrder": 15,
    "_isFastLaunch": false,
    "category": "General purpose",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 64,
    "name": "ml.m5d.4xlarge",
    "vcpuNum": 16
   },
   {
    "_defaultOrder": 16,
    "_isFastLaunch": false,
    "category": "General purpose",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 128,
    "name": "ml.m5d.8xlarge",
    "vcpuNum": 32
   },
   {
    "_defaultOrder": 17,
    "_isFastLaunch": false,
    "category": "General purpose",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 192,
    "name": "ml.m5d.12xlarge",
    "vcpuNum": 48
   },
   {
    "_defaultOrder": 18,
    "_isFastLaunch": false,
    "category": "General purpose",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 256,
    "name": "ml.m5d.16xlarge",
    "vcpuNum": 64
   },
   {
    "_defaultOrder": 19,
    "_isFastLaunch": false,
    "category": "General purpose",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 384,
    "name": "ml.m5d.24xlarge",
    "vcpuNum": 96
   },
   {
    "_defaultOrder": 20,
    "_isFastLaunch": false,
    "category": "General purpose",
    "gpuNum": 0,
    "hideHardwareSpecs": true,
    "memoryGiB": 0,
    "name": "ml.geospatial.interactive",
    "supportedImageNames": [
     "sagemaker-geospatial-v1-0"
    ],
    "vcpuNum": 0
   },
   {
    "_defaultOrder": 21,
    "_isFastLaunch": true,
    "category": "Compute optimized",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 4,
    "name": "ml.c5.large",
    "vcpuNum": 2
   },
   {
    "_defaultOrder": 22,
    "_isFastLaunch": false,
    "category": "Compute optimized",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 8,
    "name": "ml.c5.xlarge",
    "vcpuNum": 4
   },
   {
    "_defaultOrder": 23,
    "_isFastLaunch": false,
    "category": "Compute optimized",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 16,
    "name": "ml.c5.2xlarge",
    "vcpuNum": 8
   },
   {
    "_defaultOrder": 24,
    "_isFastLaunch": false,
    "category": "Compute optimized",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 32,
    "name": "ml.c5.4xlarge",
    "vcpuNum": 16
   },
   {
    "_defaultOrder": 25,
    "_isFastLaunch": false,
    "category": "Compute optimized",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 72,
    "name": "ml.c5.9xlarge",
    "vcpuNum": 36
   },
   {
    "_defaultOrder": 26,
    "_isFastLaunch": false,
    "category": "Compute optimized",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 96,
    "name": "ml.c5.12xlarge",
    "vcpuNum": 48
   },
   {
    "_defaultOrder": 27,
    "_isFastLaunch": false,
    "category": "Compute optimized",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 144,
    "name": "ml.c5.18xlarge",
    "vcpuNum": 72
   },
   {
    "_defaultOrder": 28,
    "_isFastLaunch": false,
    "category": "Compute optimized",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 192,
    "name": "ml.c5.24xlarge",
    "vcpuNum": 96
   },
   {
    "_defaultOrder": 29,
    "_isFastLaunch": true,
    "category": "Accelerated computing",
    "gpuNum": 1,
    "hideHardwareSpecs": false,
    "memoryGiB": 16,
    "name": "ml.g4dn.xlarge",
    "vcpuNum": 4
   },
   {
    "_defaultOrder": 30,
    "_isFastLaunch": false,
    "category": "Accelerated computing",
    "gpuNum": 1,
    "hideHardwareSpecs": false,
    "memoryGiB": 32,
    "name": "ml.g4dn.2xlarge",
    "vcpuNum": 8
   },
   {
    "_defaultOrder": 31,
    "_isFastLaunch": false,
    "category": "Accelerated computing",
    "gpuNum": 1,
    "hideHardwareSpecs": false,
    "memoryGiB": 64,
    "name": "ml.g4dn.4xlarge",
    "vcpuNum": 16
   },
   {
    "_defaultOrder": 32,
    "_isFastLaunch": false,
    "category": "Accelerated computing",
    "gpuNum": 1,
    "hideHardwareSpecs": false,
    "memoryGiB": 128,
    "name": "ml.g4dn.8xlarge",
    "vcpuNum": 32
   },
   {
    "_defaultOrder": 33,
    "_isFastLaunch": false,
    "category": "Accelerated computing",
    "gpuNum": 4,
    "hideHardwareSpecs": false,
    "memoryGiB": 192,
    "name": "ml.g4dn.12xlarge",
    "vcpuNum": 48
   },
   {
    "_defaultOrder": 34,
    "_isFastLaunch": false,
    "category": "Accelerated computing",
    "gpuNum": 1,
    "hideHardwareSpecs": false,
    "memoryGiB": 256,
    "name": "ml.g4dn.16xlarge",
    "vcpuNum": 64
   },
   {
    "_defaultOrder": 35,
    "_isFastLaunch": false,
    "category": "Accelerated computing",
    "gpuNum": 1,
    "hideHardwareSpecs": false,
    "memoryGiB": 61,
    "name": "ml.p3.2xlarge",
    "vcpuNum": 8
   },
   {
    "_defaultOrder": 36,
    "_isFastLaunch": false,
    "category": "Accelerated computing",
    "gpuNum": 4,
    "hideHardwareSpecs": false,
    "memoryGiB": 244,
    "name": "ml.p3.8xlarge",
    "vcpuNum": 32
   },
   {
    "_defaultOrder": 37,
    "_isFastLaunch": false,
    "category": "Accelerated computing",
    "gpuNum": 8,
    "hideHardwareSpecs": false,
    "memoryGiB": 488,
    "name": "ml.p3.16xlarge",
    "vcpuNum": 64
   },
   {
    "_defaultOrder": 38,
    "_isFastLaunch": false,
    "category": "Accelerated computing",
    "gpuNum": 8,
    "hideHardwareSpecs": false,
    "memoryGiB": 768,
    "name": "ml.p3dn.24xlarge",
    "vcpuNum": 96
   },
   {
    "_defaultOrder": 39,
    "_isFastLaunch": false,
    "category": "Memory Optimized",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 16,
    "name": "ml.r5.large",
    "vcpuNum": 2
   },
   {
    "_defaultOrder": 40,
    "_isFastLaunch": false,
    "category": "Memory Optimized",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 32,
    "name": "ml.r5.xlarge",
    "vcpuNum": 4
   },
   {
    "_defaultOrder": 41,
    "_isFastLaunch": false,
    "category": "Memory Optimized",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 64,
    "name": "ml.r5.2xlarge",
    "vcpuNum": 8
   },
   {
    "_defaultOrder": 42,
    "_isFastLaunch": false,
    "category": "Memory Optimized",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 128,
    "name": "ml.r5.4xlarge",
    "vcpuNum": 16
   },
   {
    "_defaultOrder": 43,
    "_isFastLaunch": false,
    "category": "Memory Optimized",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 256,
    "name": "ml.r5.8xlarge",
    "vcpuNum": 32
   },
   {
    "_defaultOrder": 44,
    "_isFastLaunch": false,
    "category": "Memory Optimized",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 384,
    "name": "ml.r5.12xlarge",
    "vcpuNum": 48
   },
   {
    "_defaultOrder": 45,
    "_isFastLaunch": false,
    "category": "Memory Optimized",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 512,
    "name": "ml.r5.16xlarge",
    "vcpuNum": 64
   },
   {
    "_defaultOrder": 46,
    "_isFastLaunch": false,
    "category": "Memory Optimized",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 768,
    "name": "ml.r5.24xlarge",
    "vcpuNum": 96
   },
   {
    "_defaultOrder": 47,
    "_isFastLaunch": false,
    "category": "Accelerated computing",
    "gpuNum": 1,
    "hideHardwareSpecs": false,
    "memoryGiB": 16,
    "name": "ml.g5.xlarge",
    "vcpuNum": 4
   },
   {
    "_defaultOrder": 48,
    "_isFastLaunch": false,
    "category": "Accelerated computing",
    "gpuNum": 1,
    "hideHardwareSpecs": false,
    "memoryGiB": 32,
    "name": "ml.g5.2xlarge",
    "vcpuNum": 8
   },
   {
    "_defaultOrder": 49,
    "_isFastLaunch": false,
    "category": "Accelerated computing",
    "gpuNum": 1,
    "hideHardwareSpecs": false,
    "memoryGiB": 64,
    "name": "ml.g5.4xlarge",
    "vcpuNum": 16
   },
   {
    "_defaultOrder": 50,
    "_isFastLaunch": false,
    "category": "Accelerated computing",
    "gpuNum": 1,
    "hideHardwareSpecs": false,
    "memoryGiB": 128,
    "name": "ml.g5.8xlarge",
    "vcpuNum": 32
   },
   {
    "_defaultOrder": 51,
    "_isFastLaunch": false,
    "category": "Accelerated computing",
    "gpuNum": 1,
    "hideHardwareSpecs": false,
    "memoryGiB": 256,
    "name": "ml.g5.16xlarge",
    "vcpuNum": 64
   },
   {
    "_defaultOrder": 52,
    "_isFastLaunch": false,
    "category": "Accelerated computing",
    "gpuNum": 4,
    "hideHardwareSpecs": false,
    "memoryGiB": 192,
    "name": "ml.g5.12xlarge",
    "vcpuNum": 48
   },
   {
    "_defaultOrder": 53,
    "_isFastLaunch": false,
    "category": "Accelerated computing",
    "gpuNum": 4,
    "hideHardwareSpecs": false,
    "memoryGiB": 384,
    "name": "ml.g5.24xlarge",
    "vcpuNum": 96
   },
   {
    "_defaultOrder": 54,
    "_isFastLaunch": false,
    "category": "Accelerated computing",
    "gpuNum": 8,
    "hideHardwareSpecs": false,
    "memoryGiB": 768,
    "name": "ml.g5.48xlarge",
    "vcpuNum": 192
   },
   {
    "_defaultOrder": 55,
    "_isFastLaunch": false,
    "category": "Accelerated computing",
    "gpuNum": 8,
    "hideHardwareSpecs": false,
    "memoryGiB": 1152,
    "name": "ml.p4d.24xlarge",
    "vcpuNum": 96
   },
   {
    "_defaultOrder": 56,
    "_isFastLaunch": false,
    "category": "Accelerated computing",
    "gpuNum": 8,
    "hideHardwareSpecs": false,
    "memoryGiB": 1152,
    "name": "ml.p4de.24xlarge",
    "vcpuNum": 96
   }
  ],
  "instance_type": "ml.t3.medium",
  "kernelspec": {
   "display_name": "Python 3 (Data Science 3.0)",
   "language": "python",
   "name": "python3__SAGEMAKER_INTERNAL__arn:aws:sagemaker:us-east-1:081325390199:image/sagemaker-data-science-310-v1"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.6"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}