Skip to main content

AWS Bedrock

ALL Bedrock models (Anthropic, Meta, Mistral, Amazon, etc.) are Supported

PropertyDetails
DescriptionAmazon Bedrock is a fully managed service that offers a choice of high-performing foundation models (FMs).
Provider Route on LiteLLMbedrock/, bedrock/converse/, bedrock/invoke/, bedrock/converse_like/
Provider DocAmazon Bedrock ↗
Supported OpenAI Endpoints/chat/completions, /completions, /embeddings, /images/generations
Pass-through EndpointSupported

LiteLLM requires boto3 to be installed on your system for Bedrock requests

pip install boto3>=1.28.57
info

For Amazon Nova Models: Bump to v1.53.5+

info

LiteLLM uses boto3 to handle authentication. All these options are supported - https://boto3.amazonaws.com/v1/documentation/api/latest/guide/credentials.html#credentials.

Usage​

Open In Colab
import os
from litellm import completion

os.environ["AWS_ACCESS_KEY_ID"] = ""
os.environ["AWS_SECRET_ACCESS_KEY"] = ""
os.environ["AWS_REGION_NAME"] = ""

response = completion(
model="bedrock/anthropic.claude-3-sonnet-20240229-v1:0",
messages=[{ "content": "Hello, how are you?","role": "user"}]
)

LiteLLM Proxy Usage​

Here's how to call Bedrock with the LiteLLM Proxy Server

1. Setup config.yaml​

model_list:
- model_name: bedrock-claude-v1
litellm_params:
model: bedrock/anthropic.claude-instant-v1
aws_access_key_id: os.environ/CUSTOM_AWS_ACCESS_KEY_ID
aws_secret_access_key: os.environ/CUSTOM_AWS_SECRET_ACCESS_KEY
aws_region_name: os.environ/CUSTOM_AWS_REGION_NAME

All possible auth params:

aws_access_key_id: Optional[str],
aws_secret_access_key: Optional[str],
aws_session_token: Optional[str],
aws_region_name: Optional[str],
aws_session_name: Optional[str],
aws_profile_name: Optional[str],
aws_role_name: Optional[str],
aws_web_identity_token: Optional[str],

2. Start the proxy​

litellm --config /path/to/config.yaml

3. Test it​

curl --location 'http://0.0.0.0:4000/chat/completions' \
--header 'Content-Type: application/json' \
--data ' {
"model": "bedrock-claude-v1",
"messages": [
{
"role": "user",
"content": "what llm are you"
}
]
}
'

Set temperature, top p, etc.​

import os
from litellm import completion

os.environ["AWS_ACCESS_KEY_ID"] = ""
os.environ["AWS_SECRET_ACCESS_KEY"] = ""
os.environ["AWS_REGION_NAME"] = ""

response = completion(
model="bedrock/anthropic.claude-3-sonnet-20240229-v1:0",
messages=[{ "content": "Hello, how are you?","role": "user"}],
temperature=0.7,
top_p=1
)

Pass provider-specific params​

If you pass a non-openai param to litellm, we'll assume it's provider-specific and send it as a kwarg in the request body. See more

import os
from litellm import completion

os.environ["AWS_ACCESS_KEY_ID"] = ""
os.environ["AWS_SECRET_ACCESS_KEY"] = ""
os.environ["AWS_REGION_NAME"] = ""

response = completion(
model="bedrock/anthropic.claude-3-sonnet-20240229-v1:0",
messages=[{ "content": "Hello, how are you?","role": "user"}],
top_k=1 # 👈 PROVIDER-SPECIFIC PARAM
)

Usage - Function Calling​

LiteLLM uses Bedrock's Converse API for making tool calls

from litellm import completion

# set env
os.environ["AWS_ACCESS_KEY_ID"] = ""
os.environ["AWS_SECRET_ACCESS_KEY"] = ""
os.environ["AWS_REGION_NAME"] = ""

tools = [
{
"type": "function",
"function": {
"name": "get_current_weather",
"description": "Get the current weather in a given location",
"parameters": {
"type": "object",
"properties": {
"location": {
"type": "string",
"description": "The city and state, e.g. San Francisco, CA",
},
"unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
},
"required": ["location"],
},
},
}
]
messages = [{"role": "user", "content": "What's the weather like in Boston today?"}]

response = completion(
model="bedrock/anthropic.claude-3-sonnet-20240229-v1:0",
messages=messages,
tools=tools,
tool_choice="auto",
)
# Add any assertions, here to check response args
print(response)
assert isinstance(response.choices[0].message.tool_calls[0].function.name, str)
assert isinstance(
response.choices[0].message.tool_calls[0].function.arguments, str
)

Usage - Vision​

from litellm import completion

# set env
os.environ["AWS_ACCESS_KEY_ID"] = ""
os.environ["AWS_SECRET_ACCESS_KEY"] = ""
os.environ["AWS_REGION_NAME"] = ""


def encode_image(image_path):
import base64

with open(image_path, "rb") as image_file:
return base64.b64encode(image_file.read()).decode("utf-8")


image_path = "../proxy/cached_logo.jpg"
# Getting the base64 string
base64_image = encode_image(image_path)
resp = litellm.completion(
model="bedrock/anthropic.claude-3-sonnet-20240229-v1:0",
messages=[
{
"role": "user",
"content": [
{"type": "text", "text": "Whats in this image?"},
{
"type": "image_url",
"image_url": {
"url": "data:image/jpeg;base64," + base64_image
},
},
],
}
],
)
print(f"\nResponse: {resp}")

Usage - Bedrock Guardrails​

Example of using Bedrock Guardrails with LiteLLM

from litellm import completion

# set env
os.environ["AWS_ACCESS_KEY_ID"] = ""
os.environ["AWS_SECRET_ACCESS_KEY"] = ""
os.environ["AWS_REGION_NAME"] = ""

response = completion(
model="anthropic.claude-v2",
messages=[
{
"content": "where do i buy coffee from? ",
"role": "user",
}
],
max_tokens=10,
guardrailConfig={
"guardrailIdentifier": "ff6ujrregl1q", # The identifier (ID) for the guardrail.
"guardrailVersion": "DRAFT", # The version of the guardrail.
"trace": "disabled", # The trace behavior for the guardrail. Can either be "disabled" or "enabled"
},
)

Usage - "Assistant Pre-fill"​

If you're using Anthropic's Claude with Bedrock, you can "put words in Claude's mouth" by including an assistant role message as the last item in the messages array.

[!IMPORTANT] The returned completion will not include your "pre-fill" text, since it is part of the prompt itself. Make sure to prefix Claude's completion with your pre-fill.

import os
from litellm import completion

os.environ["AWS_ACCESS_KEY_ID"] = ""
os.environ["AWS_SECRET_ACCESS_KEY"] = ""
os.environ["AWS_REGION_NAME"] = ""

messages = [
{"role": "user", "content": "How do you say 'Hello' in German? Return your answer as a JSON object, like this:\n\n{ \"Hello\": \"Hallo\" }"},
{"role": "assistant", "content": "{"},
]
response = completion(model="bedrock/anthropic.claude-v2", messages=messages)

Example prompt sent to Claude​


Human: How do you say 'Hello' in German? Return your answer as a JSON object, like this:

{ "Hello": "Hallo" }

Assistant: {

Usage - "System" messages​

If you're using Anthropic's Claude 2.1 with Bedrock, system role messages are properly formatted for you.

import os
from litellm import completion

os.environ["AWS_ACCESS_KEY_ID"] = ""
os.environ["AWS_SECRET_ACCESS_KEY"] = ""
os.environ["AWS_REGION_NAME"] = ""

messages = [
{"role": "system", "content": "You are a snarky assistant."},
{"role": "user", "content": "How do I boil water?"},
]
response = completion(model="bedrock/anthropic.claude-v2:1", messages=messages)

Example prompt sent to Claude​

You are a snarky assistant.

Human: How do I boil water?

Assistant:

Usage - Streaming​

import os
from litellm import completion

os.environ["AWS_ACCESS_KEY_ID"] = ""
os.environ["AWS_SECRET_ACCESS_KEY"] = ""
os.environ["AWS_REGION_NAME"] = ""

response = completion(
model="bedrock/anthropic.claude-instant-v1",
messages=[{ "content": "Hello, how are you?","role": "user"}],
stream=True
)
for chunk in response:
print(chunk)

Example Streaming Output Chunk​

{
"choices": [
{
"finish_reason": null,
"index": 0,
"delta": {
"content": "ase can appeal the case to a higher federal court. If a higher federal court rules in a way that conflicts with a ruling from a lower federal court or conflicts with a ruling from a higher state court, the parties involved in the case can appeal the case to the Supreme Court. In order to appeal a case to the Sup"
}
}
],
"created": null,
"model": "anthropic.claude-instant-v1",
"usage": {
"prompt_tokens": null,
"completion_tokens": null,
"total_tokens": null
}
}

Cross-region inferencing​

LiteLLM supports Bedrock cross-region inferencing across all supported bedrock models.

from litellm import completion 
import os


os.environ["AWS_ACCESS_KEY_ID"] = ""
os.environ["AWS_SECRET_ACCESS_KEY"] = ""
os.environ["AWS_REGION_NAME"] = ""


litellm.set_verbose = True # 👈 SEE RAW REQUEST

response = completion(
model="bedrock/us.anthropic.claude-3-haiku-20240307-v1:0",
messages=messages,
max_tokens=10,
temperature=0.1,
)

print("Final Response: {}".format(response))

Set 'converse' / 'invoke' route​

info

Supported from LiteLLM Version v1.53.5

LiteLLM defaults to the invoke route. LiteLLM uses the converse route for Bedrock models that support it.

To explicitly set the route, do bedrock/converse/<model> or bedrock/invoke/<model>.

E.g.

from litellm import completion

completion(model="bedrock/converse/us.amazon.nova-pro-v1:0")

Alternate user/assistant messages​

Use user_continue_message to add a default user message, for cases (e.g. Autogen) where the client might not follow alternating user/assistant messages starting and ending with a user message.

model_list:
- model_name: "bedrock-claude"
litellm_params:
model: "bedrock/anthropic.claude-instant-v1"
user_continue_message: {"role": "user", "content": "Please continue"}

OR

just set litellm.modify_params=True and LiteLLM will automatically handle this with a default user_continue_message.

model_list:
- model_name: "bedrock-claude"
litellm_params:
model: "bedrock/anthropic.claude-instant-v1"

litellm_settings:
modify_params: true

Test it!

curl -X POST 'http://0.0.0.0:4000/chat/completions' \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer sk-1234' \
-d '{
"model": "bedrock-claude",
"messages": [{"role": "assistant", "content": "Hey, how's it going?"}]
}'

Usage - PDF / Document Understanding​

LiteLLM supports Document Understanding for Bedrock models - AWS Bedrock Docs.

info

LiteLLM supports ALL Bedrock document types -

E.g.: "pdf", "csv", "doc", "docx", "xls", "xlsx", "html", "txt", "md"

You can also pass these as either image_url or base64

url​

from litellm.utils import supports_pdf_input, completion

# set aws credentials
os.environ["AWS_ACCESS_KEY_ID"] = ""
os.environ["AWS_SECRET_ACCESS_KEY"] = ""
os.environ["AWS_REGION_NAME"] = ""


# pdf url
image_url = "https://www.w3.org/WAI/ER/tests/xhtml/testfiles/resources/pdf/dummy.pdf"

# model
model = "bedrock/anthropic.claude-3-5-sonnet-20240620-v1:0"

image_content = [
{"type": "text", "text": "What's this file about?"},
{
"type": "image_url",
"image_url": image_url, # OR {"url": image_url}
},
]


if not supports_pdf_input(model, None):
print("Model does not support image input")

response = completion(
model=model,
messages=[{"role": "user", "content": image_content}],
)
assert response is not None

base64​

from litellm.utils import supports_pdf_input, completion

# set aws credentials
os.environ["AWS_ACCESS_KEY_ID"] = ""
os.environ["AWS_SECRET_ACCESS_KEY"] = ""
os.environ["AWS_REGION_NAME"] = ""


# pdf url
image_url = "https://www.w3.org/WAI/ER/tests/xhtml/testfiles/resources/pdf/dummy.pdf"
response = requests.get(url)
file_data = response.content

encoded_file = base64.b64encode(file_data).decode("utf-8")
base64_url = f"data:application/pdf;base64,{encoded_file}"

# model
model = "bedrock/anthropic.claude-3-5-sonnet-20240620-v1:0"

image_content = [
{"type": "text", "text": "What's this file about?"},
{
"type": "image_url",
"image_url": base64_url, # OR {"url": base64_url}
},
]


if not supports_pdf_input(model, None):
print("Model does not support image input")

response = completion(
model=model,
messages=[{"role": "user", "content": image_content}],
)
assert response is not None

Boto3 - Authentication​

Passing credentials as parameters - Completion()​

Pass AWS credentials as parameters to litellm.completion

import os
from litellm import completion

response = completion(
model="bedrock/anthropic.claude-instant-v1",
messages=[{ "content": "Hello, how are you?","role": "user"}],
aws_access_key_id="",
aws_secret_access_key="",
aws_region_name="",
)

Passing extra headers + Custom API Endpoints​

This can be used to override existing headers (e.g. Authorization) when calling custom api endpoints

import os
import litellm
from litellm import completion

litellm.set_verbose = True # 👈 SEE RAW REQUEST

response = completion(
model="bedrock/anthropic.claude-instant-v1",
messages=[{ "content": "Hello, how are you?","role": "user"}],
aws_access_key_id="",
aws_secret_access_key="",
aws_region_name="",
aws_bedrock_runtime_endpoint="https://my-fake-endpoint.com",
extra_headers={"key": "value"}
)

SSO Login (AWS Profile)​

  • Set AWS_PROFILE environment variable
  • Make bedrock completion call
import os
from litellm import completion

response = completion(
model="bedrock/anthropic.claude-instant-v1",
messages=[{ "content": "Hello, how are you?","role": "user"}]
)

or pass aws_profile_name:

import os
from litellm import completion

response = completion(
model="bedrock/anthropic.claude-instant-v1",
messages=[{ "content": "Hello, how are you?","role": "user"}],
aws_profile_name="dev-profile",
)

STS (Role-based Auth)​

  • Set aws_role_name and aws_session_name
LiteLLM ParameterBoto3 ParameterDescriptionBoto3 Documentation
aws_access_key_idaws_access_key_idAWS access key associated with an IAM user or roleCredentials
aws_secret_access_keyaws_secret_access_keyAWS secret key associated with the access keyCredentials
aws_role_nameRoleArnThe Amazon Resource Name (ARN) of the role to assumeAssumeRole API
aws_session_nameRoleSessionNameAn identifier for the assumed role sessionAssumeRole API

Make the bedrock completion call

from litellm import completion

response = completion(
model="bedrock/anthropic.claude-instant-v1",
messages=messages,
max_tokens=10,
temperature=0.1,
aws_role_name=aws_role_name,
aws_session_name="my-test-session",
)

If you also need to dynamically set the aws user accessing the role, add the additional args in the completion()/embedding() function

from litellm import completion

response = completion(
model="bedrock/anthropic.claude-instant-v1",
messages=messages,
max_tokens=10,
temperature=0.1,
aws_region_name=aws_region_name,
aws_access_key_id=aws_access_key_id,
aws_secret_access_key=aws_secret_access_key,
aws_role_name=aws_role_name,
aws_session_name="my-test-session",
)

Passing an external BedrockRuntime.Client as a parameter - Completion()​

danger

This is a deprecated flow. Boto3 is not async. And boto3.client does not let us make the http call through httpx. Pass in your aws params through the method above 👆. See Auth Code Add new auth flow

Experimental - 2024-Jun-23: aws_access_key_id, aws_secret_access_key, and aws_session_token will be extracted from boto3.client and be passed into the httpx client

Pass an external BedrockRuntime.Client object as a parameter to litellm.completion. Useful when using an AWS credentials profile, SSO session, assumed role session, or if environment variables are not available for auth.

Create a client from session credentials:

import boto3
from litellm import completion

bedrock = boto3.client(
service_name="bedrock-runtime",
region_name="us-east-1",
aws_access_key_id="",
aws_secret_access_key="",
aws_session_token="",
)

response = completion(
model="bedrock/anthropic.claude-instant-v1",
messages=[{ "content": "Hello, how are you?","role": "user"}],
aws_bedrock_client=bedrock,
)

Create a client from AWS profile in ~/.aws/config:

import boto3
from litellm import completion

dev_session = boto3.Session(profile_name="dev-profile")
bedrock = dev_session.client(
service_name="bedrock-runtime",
region_name="us-east-1",
)

response = completion(
model="bedrock/anthropic.claude-instant-v1",
messages=[{ "content": "Hello, how are you?","role": "user"}],
aws_bedrock_client=bedrock,
)

Calling via Internal Proxy​

Use the bedrock/converse_like/model endpoint to call bedrock converse model via your internal proxy.

from litellm import completion

response = completion(
model="bedrock/converse_like/some-model",
messages=[{"role": "user", "content": "What's AWS?"}],
api_key="sk-1234",
api_base="https://some-api-url/models",
extra_headers={"test": "hello world"},
)

Expected Output URL

https://some-api-url/models

Provisioned throughput models​

To use provisioned throughput Bedrock models pass

  • model=bedrock/<base-model>, example model=bedrock/anthropic.claude-v2. Set model to any of the Supported AWS models
  • model_id=provisioned-model-arn

Completion

import litellm
response = litellm.completion(
model="bedrock/anthropic.claude-instant-v1",
model_id="provisioned-model-arn",
messages=[{"content": "Hello, how are you?", "role": "user"}]
)

Embedding

import litellm
response = litellm.embedding(
model="bedrock/amazon.titan-embed-text-v1",
model_id="provisioned-model-arn",
input=["hi"],
)

Supported AWS Bedrock Models​

Here's an example of using a bedrock model with LiteLLM. For a complete list, refer to the model cost map

Model NameCommand
Anthropic Claude-V3.5 Sonnetcompletion(model='bedrock/anthropic.claude-3-5-sonnet-20240620-v1:0', messages=messages)
Anthropic Claude-V3 sonnetcompletion(model='bedrock/anthropic.claude-3-sonnet-20240229-v1:0', messages=messages)
Anthropic Claude-V3 Haikucompletion(model='bedrock/anthropic.claude-3-haiku-20240307-v1:0', messages=messages)
Anthropic Claude-V3 Opuscompletion(model='bedrock/anthropic.claude-3-opus-20240229-v1:0', messages=messages)
Anthropic Claude-V2.1completion(model='bedrock/anthropic.claude-v2:1', messages=messages)
Anthropic Claude-V2completion(model='bedrock/anthropic.claude-v2', messages=messages)
Anthropic Claude-Instant V1completion(model='bedrock/anthropic.claude-instant-v1', messages=messages)
Meta llama3-1-405bcompletion(model='bedrock/meta.llama3-1-405b-instruct-v1:0', messages=messages)
Meta llama3-1-70bcompletion(model='bedrock/meta.llama3-1-70b-instruct-v1:0', messages=messages)
Meta llama3-1-8bcompletion(model='bedrock/meta.llama3-1-8b-instruct-v1:0', messages=messages)
Meta llama3-70bcompletion(model='bedrock/meta.llama3-70b-instruct-v1:0', messages=messages)
Meta llama3-8bcompletion(model='bedrock/meta.llama3-8b-instruct-v1:0', messages=messages)
Amazon Titan Litecompletion(model='bedrock/amazon.titan-text-lite-v1', messages=messages)
Amazon Titan Expresscompletion(model='bedrock/amazon.titan-text-express-v1', messages=messages)
Cohere Commandcompletion(model='bedrock/cohere.command-text-v14', messages=messages)
AI21 J2-Midcompletion(model='bedrock/ai21.j2-mid-v1', messages=messages)
AI21 J2-Ultracompletion(model='bedrock/ai21.j2-ultra-v1', messages=messages)
AI21 Jamba-Instructcompletion(model='bedrock/ai21.jamba-instruct-v1:0', messages=messages)
Meta Llama 2 Chat 13bcompletion(model='bedrock/meta.llama2-13b-chat-v1', messages=messages)
Meta Llama 2 Chat 70bcompletion(model='bedrock/meta.llama2-70b-chat-v1', messages=messages)
Mistral 7B Instructcompletion(model='bedrock/mistral.mistral-7b-instruct-v0:2', messages=messages)
Mixtral 8x7B Instructcompletion(model='bedrock/mistral.mixtral-8x7b-instruct-v0:1', messages=messages)

Bedrock Embedding​

API keys​

This can be set as env variables or passed as params to litellm.embedding()

import os
os.environ["AWS_ACCESS_KEY_ID"] = "" # Access key
os.environ["AWS_SECRET_ACCESS_KEY"] = "" # Secret access key
os.environ["AWS_REGION_NAME"] = "" # us-east-1, us-east-2, us-west-1, us-west-2

Usage​

from litellm import embedding
response = embedding(
model="bedrock/amazon.titan-embed-text-v1",
input=["good morning from litellm"],
)
print(response)

Supported AWS Bedrock Embedding Models​

Model NameUsageSupported Additional OpenAI params
Titan Embeddings V2embedding(model="bedrock/amazon.titan-embed-text-v2:0", input=input)here
Titan Embeddings - V1embedding(model="bedrock/amazon.titan-embed-text-v1", input=input)here
Titan Multimodal Embeddingsembedding(model="bedrock/amazon.titan-embed-image-v1", input=input)here
Cohere Embeddings - Englishembedding(model="bedrock/cohere.embed-english-v3", input=input)here
Cohere Embeddings - Multilingualembedding(model="bedrock/cohere.embed-multilingual-v3", input=input)here

Advanced - Drop Unsupported Params​

Advanced - Pass model/provider-specific Params​

Image Generation​

Use this for stable diffusion on bedrock

Usage​

import os
from litellm import image_generation

os.environ["AWS_ACCESS_KEY_ID"] = ""
os.environ["AWS_SECRET_ACCESS_KEY"] = ""
os.environ["AWS_REGION_NAME"] = ""

response = image_generation(
prompt="A cute baby sea otter",
model="bedrock/stability.stable-diffusion-xl-v0",
)
print(f"response: {response}")

Set optional params

import os
from litellm import image_generation

os.environ["AWS_ACCESS_KEY_ID"] = ""
os.environ["AWS_SECRET_ACCESS_KEY"] = ""
os.environ["AWS_REGION_NAME"] = ""

response = image_generation(
prompt="A cute baby sea otter",
model="bedrock/stability.stable-diffusion-xl-v0",
### OPENAI-COMPATIBLE ###
size="128x512", # width=128, height=512
### PROVIDER-SPECIFIC ### see `AmazonStabilityConfig` in bedrock.py for all params
seed=30
)
print(f"response: {response}")

Supported AWS Bedrock Image Generation Models​

Model NameFunction Call
Stable Diffusion 3 - v0embedding(model="bedrock/stability.stability.sd3-large-v1:0", prompt=prompt)
Stable Diffusion - v0embedding(model="bedrock/stability.stable-diffusion-xl-v0", prompt=prompt)
Stable Diffusion - v0embedding(model="bedrock/stability.stable-diffusion-xl-v1", prompt=prompt)

Rerank API​

Use Bedrock's Rerank API in the Cohere /rerank format.

Supported Cohere Rerank Params

  • model - the foundation model ARN
  • query - the query to rerank against
  • documents - the list of documents to rerank
  • top_n - the number of results to return
from litellm import rerank
import os

os.environ["AWS_ACCESS_KEY_ID"] = ""
os.environ["AWS_SECRET_ACCESS_KEY"] = ""
os.environ["AWS_REGION_NAME"] = ""

response = rerank(
model="bedrock/arn:aws:bedrock:us-west-2::foundation-model/amazon.rerank-v1:0", # provide the model ARN - get this here https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/bedrock/client/list_foundation_models.html
query="hello",
documents=["hello", "world"],
top_n=2,
)

print(response)