

LiteLLM is a unified interface for calling 100+ LLM APIs using the OpenAI format. Braintrust automatically traces LiteLLM calls across all providers including OpenAI, Azure, Anthropic, Cohere, Replicate, and more.
This guide covers manual instrumentation. For quicker setup, use auto-instrumentation.

Setup

Install LiteLLM alongside the Braintrust SDK:
# uv
uv add braintrust litellm

# pip
pip install braintrust litellm
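Before running any traced code, the Braintrust SDK needs an API key, and LiteLLM needs credentials for whichever provider you call. A minimal setup, assuming the standard environment variables (the placeholder values are yours to fill in):

```shell
# Braintrust SDK reads the API key from this variable
export BRAINTRUST_API_KEY="<your Braintrust API key>"

# LiteLLM picks up provider keys from their usual variables,
# e.g. for OpenAI models:
export OPENAI_API_KEY="<your OpenAI API key>"
```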

Trace with LiteLLM

Braintrust provides a patch function that instruments LiteLLM to capture all model interactions. Call patch_litellm() before importing LiteLLM so the patch takes effect. Alternatively, braintrust.auto_instrument() patches LiteLLM automatically; see Trace LLM calls for details about auto-instrumentation.
trace-litellm.py
from braintrust.wrappers.litellm import patch_litellm

patch_litellm()

import litellm
from braintrust import init_logger

# Initialize Braintrust
logger = init_logger(project="litellm-example")

# Use LiteLLM as normal - all calls are automatically traced
response = litellm.completion(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "What is the capital of France?"}]
)
This will automatically send all LiteLLM interactions to Braintrust, including:
  • Chat and text completion / acompletion calls across different providers
  • Audio speech (speech / aspeech) calls, with the generated audio captured as an Attachment
  • Audio transcription (transcription / atranscription) calls
  • Image generation (image_generation / aimage_generation) calls
  • Request and response data
  • Token usage and costs
  • Latency metrics
  • Error tracking

Resources