Prompts

Prompt engineering is a core activity in AI engineering. Braintrust allows you to create prompts, test them out in the playground, use them in your code, update them, and track their performance over time. Our goal is to provide a world-class authoring experience in Braintrust, to integrate prompts into your code seamlessly, securely, and reliably, and to help you debug issues as they arise.

Creating a prompt

To create a prompt, navigate to your Library in the top menu bar and select Prompts, then Create prompt. Pick a name and unique slug for your prompt. The slug is an identifier that you can use to reference it in your code. As you change the prompt's name, description, or contents, its slug stays constant.

Create a prompt

Prompts can use mustache templating syntax to refer to variables. These variables are substituted automatically in the API, in the playground, and when you call the .build() function in your code. More on that below.
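To make the substitution concrete, here is a minimal, dependency-free sketch of what replacing template variables looks like. It is a simplified stand-in for mustache, not Braintrust's renderer: the lookup and renderTemplate helpers are hypothetical names, and the sketch handles only plain {{name}} and {{{name}}} tags with dotted paths (no sections, partials, or escaping).

```typescript
// Simplified stand-in for mustache variable substitution (illustration only).
// Handles {{name}} and {{{name}}} with dotted paths; no sections, no escaping.
type TemplateVars = { [key: string]: unknown };

// Walk a dotted path such as "limits.words" through nested objects.
function lookup(vars: TemplateVars, path: string): unknown {
  let current: unknown = vars;
  for (const key of path.split(".")) {
    if (typeof current !== "object" || current === null) {
      return undefined;
    }
    current = (current as { [key: string]: unknown })[key];
  }
  return current;
}

function renderTemplate(template: string, vars: TemplateVars): string {
  // The triple-brace alternative comes first so "{{{text}}}" is matched whole.
  return template.replace(
    /\{\{\{\s*([\w.]+)\s*\}\}\}|\{\{\s*([\w.]+)\s*\}\}/g,
    (_match, triple: string | undefined, double: string | undefined) => {
      const value = lookup(vars, (triple ?? double) as string);
      // Missing variables render as an empty string in this sketch.
      return value === undefined || value === null ? "" : String(value);
    },
  );
}

const rendered = renderTemplate(
  "Summarize: {{{text}}} (max {{limits.words}} words)",
  {
    text: "Braintrust lets you version prompts.",
    limits: { words: 20 },
  },
);
console.log(rendered);
// Summarize: Braintrust lets you version prompts. (max 20 words)
```

In real code you would not write this yourself; the API, the playground, and .build() perform the substitution for you.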

In code

To create a prompt in code, you can write a script and push it to Braintrust:

summarizer.ts
import * as braintrust from "braintrust";
 
const project = braintrust.projects.create({
  name: "Summarizer",
});
 
export const summarizer = project.prompts.create({
  name: "Summarizer",
  slug: "summarizer",
  description: "Summarize text",
  model: "claude-3-5-sonnet-latest",
  messages: [
    {
      role: "system",
      content: "You are a helpful assistant that can summarize text.",
    },
    {
      role: "user",
      content: "{{{text}}}",
    },
  ],
});

$ npx braintrust push summarizer.ts

Each change to a prompt creates a new version with its own identifier, e.g. 5878bd218351fb8e. You can use this identifier to pin a specific version of the prompt in your code.

Update a prompt

Adding few-shot examples to a prompt

You can also use mustache syntax to add few-shot examples to your prompt. For example:

Use the following few shots:
 
{{#input.few_shots}}
input: {{input}}
output: {{output}}
{{/input.few_shots}}
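As a sketch of the data this template expects, the snippet below shows a context object with an input.few_shots array and a hand-written expansion that mirrors what the section does: the body between the opening and closing tags is emitted once per array element. The FewShot and TemplateContext types and the renderFewShots helper are hypothetical names used only for illustration; in practice the template engine performs this expansion.

```typescript
// Hypothetical data shape for the few-shot template above: the template
// context has an "input" object whose "few_shots" array holds the examples.
interface FewShot {
  input: string;
  output: string;
}

interface TemplateContext {
  input: { few_shots: FewShot[] };
}

// Hand-written equivalent of the {{#input.few_shots}}...{{/input.few_shots}}
// section: the body is emitted once per array element.
function renderFewShots(context: TemplateContext): string {
  const body = context.input.few_shots
    .map((shot) => `input: ${shot.input}\noutput: ${shot.output}\n`)
    .join("");
  return `Use the following few shots:\n\n${body}`;
}

const fewShotText = renderFewShots({
  input: {
    few_shots: [
      { input: "2+2", output: "4" },
      { input: "3*3", output: "9" },
    ],
  },
});
console.log(fewShotText);
// Use the following few shots:
//
// input: 2+2
// output: 4
// input: 3*3
// output: 9
```

Inside the section, {{input}} and {{output}} resolve against the current array element, which is why each example object carries its own input and output fields.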

Testing in the playground

While developing a prompt, it can be useful to test it out on real-world data in the Playground. You can open a prompt in the playground, tweak it, and save a new version once you're ready.

Playground

Structured outputs

When using prompts in the playground, you can also define the JSON schema for the structured output of the prompt. Like tool calls, the returned value is parsed as a JSON object automatically.

Structured outputs

For example:

type: object
properties:
  steps:
    type: array
    items:
      type: object
      properties:
        explanation:
          type: string
        output:
          type: string
      required:
        - explanation
        - output
      additionalProperties: false
  final_answer:
    type: string
required:
  - steps
  - final_answer
additionalProperties: false

This schema corresponds to the response_format.json_schema argument in the OpenAI API, so this feature currently works only with OpenAI models.
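If you consume this structured output in TypeScript, it can help to mirror the schema as types and check the parsed value at runtime. The following is a minimal sketch under that assumption; the Step and StructuredAnswer interfaces and the guard functions are hypothetical names, not part of the Braintrust SDK.

```typescript
// TypeScript shape matching the schema above.
interface Step {
  explanation: string;
  output: string;
}

interface StructuredAnswer {
  steps: Step[];
  final_answer: string;
}

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// True only when every key is in `allowed` (mirrors additionalProperties: false).
function hasOnlyKeys(value: Record<string, unknown>, allowed: string[]): boolean {
  return Object.keys(value).every((key) => allowed.includes(key));
}

function isStep(value: unknown): value is Step {
  return (
    isRecord(value) &&
    hasOnlyKeys(value, ["explanation", "output"]) &&
    typeof value.explanation === "string" &&
    typeof value.output === "string"
  );
}

function isStructuredAnswer(value: unknown): value is StructuredAnswer {
  if (!isRecord(value) || !hasOnlyKeys(value, ["steps", "final_answer"])) {
    return false;
  }
  const steps = value.steps;
  return (
    Array.isArray(steps) &&
    steps.every(isStep) &&
    typeof value.final_answer === "string"
  );
}

const sampleAnswer: unknown = JSON.parse(
  '{"steps":[{"explanation":"Add the numbers","output":"1+1=2"}],"final_answer":"2"}',
);
console.log(isStructuredAnswer(sampleAnswer)); // true
console.log(isStructuredAnswer({ final_answer: "2" })); // false (steps is missing)
```

A schema library such as zod can express the same checks more compactly; the hand-written guards here just keep the sketch free of dependencies.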

Using tools

You can use any custom tools you've created during prompt execution. To reference a tool when creating a prompt via the SDK, add the names of the tools you want to use to the tools parameter:

import * as braintrust from "braintrust";
 
const project = braintrust.projects.create({
  name: "RAG app",
});
 
export const docSearch = project.prompts.create({
  name: "Doc Search",
  slug: "document-search",
  description:
    "Search through the Braintrust documentation to answer the user's question",
  model: "gpt-4o-mini",
  messages: [
    {
      role: "system",
      content:
        "You are a helpful assistant that can " +
        "answer questions about the Braintrust documentation.",
    },
    {
      role: "user",
      content: "{{{question}}}",
    },
  ],
  // toolRAG is a tool defined elsewhere in your project (see the Tools guide).
  tools: [toolRAG],
});

In Python, the prompt and the tool need to be defined in the same file and pushed to Braintrust together. In TypeScript, they can be defined and pushed separately.

To add a tool to a prompt via the UI, select the Tools dropdown in the prompt creation window and select a tool from your library, then save the prompt.

Invoke github tool

For more information about creating and using tools, check out the Tools guide.

Using prompts in your code

Executing directly

In Braintrust, a prompt is a simple function that can be invoked directly through the SDK and REST API. When invoked, prompt functions leverage the proxy to access a wide range of providers and models with managed secrets, and are automatically traced and logged to your Braintrust project. All functions are fully managed and versioned via the UI and API.

Functions are a broad concept that encompasses prompts, code snippets, HTTP endpoints, and more. When using the functions API, you can use a prompt's slug or ID as the function's slug or ID, respectively. To learn more about functions, see the functions reference.

import { invoke } from "braintrust";
 
async function main() {
  const result = await invoke({
    projectName: "your project name",
    slug: "your prompt slug",
    input: {
      // These variables map to the template parameters in your prompt.
      question: "1+1",
    },
  });
  console.log(result);
}
 
main();

The return value, result, is a string unless the prompt makes tool calls, in which case it is the arguments of the first tool call. In TypeScript, you can enforce the expected shape with the schema argument, which validates the result against a zod schema:

import { invoke } from "braintrust";
import { z } from "zod";
 
async function main() {
  const result = await invoke({
    projectName: "your project name",
    slug: "your prompt slug",
    input: {
      question: "1+1",
    },
    schema: z.string(),
  });
  console.log(result);
}
 
main();

Adding extra messages

If you're building a chat app, it's often useful to send back additional messages of context as you gather them. You can provide OpenAI-style messages to the invoke function by adding messages, which are appended to the end of the built-in messages:

import { invoke } from "braintrust";
import { z } from "zod";
 
async function reflection(question: string) {
  const result = await invoke({
    projectName: "your project name",
    slug: "your prompt slug",
    input: {
      question,
    },
    schema: z.string(),
  });
  console.log(result);
 
  const reflectionResult = await invoke({
    projectName: "your project name",
    slug: "your prompt slug",
    input: {
      question,
    },
    messages: [
      { role: "assistant", content: result },
      { role: "user", content: "Are you sure about that?" },
    ],
  });
  console.log(reflectionResult);
}
 
reflection("Which is larger, the Moon or the Earth?");
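The ordering described above can be pictured with a small sketch. It only illustrates the resulting message order; Braintrust performs the actual merge when you call invoke, and the ChatMessage type and appendMessages helper are hypothetical names.

```typescript
// Hypothetical message type for illustration; invoke accepts OpenAI-style messages.
type ChatMessage = {
  role: "system" | "user" | "assistant";
  content: string;
};

// Illustrates ordering only: the prompt's own messages come first, and
// the extra messages are appended after them.
function appendMessages(
  builtIn: ChatMessage[],
  extra: ChatMessage[],
): ChatMessage[] {
  return [...builtIn, ...extra];
}

const conversation = appendMessages(
  [
    { role: "system", content: "You are a helpful assistant." },
    { role: "user", content: "Which is larger, the Moon or the Earth?" },
  ],
  [
    { role: "assistant", content: "The Earth is larger." },
    { role: "user", content: "Are you sure about that?" },
  ],
);
console.log(conversation.map((message) => message.role).join(" -> "));
// system -> user -> assistant -> user
```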

Streaming

You can also stream results in an easy-to-parse format by passing stream: true:

import { invoke } from "braintrust";
 
async function main() {
  const result = await invoke({
    projectName: "your project name",
    slug: "your prompt slug",
    input: {
      question: "1+1",
    },
    stream: true,
  });
 
  for await (const chunk of result) {
    console.log(chunk);
    // { type: "text_delta", data: "The answer "}
    // { type: "text_delta", data: "is 2"}
  }
}
 
main();
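If you want the full text rather than individual chunks, you can accumulate the text_delta chunks as they arrive. The sketch below assumes the chunk shape shown in the comments above ({ type, data }); the collectText helper and the fakeStream generator are hypothetical, with fakeStream standing in for the stream that invoke returns so the example runs without network access.

```typescript
// Chunk shape taken from the example output above; other chunk types may exist.
type StreamChunk = { type: string; data: unknown };

// Concatenate the data of every text_delta chunk and ignore everything else.
async function collectText(stream: AsyncIterable<StreamChunk>): Promise<string> {
  let text = "";
  for await (const chunk of stream) {
    if (chunk.type === "text_delta" && typeof chunk.data === "string") {
      text += chunk.data;
    }
  }
  return text;
}

// Stand-in for the stream returned by invoke({ ..., stream: true }).
async function* fakeStream(): AsyncGenerator<StreamChunk> {
  yield { type: "text_delta", data: "The answer " };
  yield { type: "text_delta", data: "is 2" };
}

collectText(fakeStream()).then((text) => console.log(text));
// The answer is 2
```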

Vercel AI SDK

If you're using Next.js and the Vercel AI SDK, you can use the Braintrust adapter by installing the @braintrust/vercel-ai-sdk package and converting the stream to Vercel's format:

import { invoke } from "braintrust";
import { BraintrustAdapter } from "@braintrust/vercel-ai-sdk";
 
export async function POST(req: Request) {
  const stream = await invoke({
    projectName: "your project name",
    slug: "your prompt slug",
    input: await req.json(),
    stream: true,
  });
 
  return BraintrustAdapter.toAIStreamResponse(stream);
}

Logging

invoke uses the active logging state of your application, just like any function decorated with @traced or wrapTraced. This means that if you initialize a logger before calling invoke, it will automatically log spans to Braintrust. By default, invoke requests will log to a root span, but you can customize the name of a span using the name argument. For example:

import { invoke, initLogger, traced } from "braintrust";
 
initLogger({
  projectName: "My project",
});
 
async function main() {
  const result = await traced(
    async (span) => {
      span.log({
        tags: ["foo", "bar"],
      });
      const res = await invoke({
        projectName: "Joker",
        slug: "joker-3c10",
        input: {
          theme: "silicon valley",
        },
      });
      return res;
    },
    {
      name: "My name",
      type: "function",
    },
  );
  console.log(result);
}
 
main().catch(console.error);

This code generates a log like this:

Logs with invoke

You can also pass in the parent argument, which is a string that you can derive from span.export() while doing distributed tracing.

Fetching in code

If you'd like to run prompts directly, you can fetch them using the Braintrust SDK. The loadPrompt()/load_prompt() function loads a prompt into a simple format that you can pass along to the OpenAI client. loadPrompt also caches prompts with a two-layered cache, and attempts to use this cache if the prompt cannot be fetched from the Braintrust server:

  1. A memory cache, which stores up to BRAINTRUST_PROMPT_CACHE_MEMORY_MAX prompts in memory. This defaults to 1024.
  2. A disk cache, which stores up to BRAINTRUST_PROMPT_CACHE_DISK_MAX prompts on disk. This defaults to 1048576.

You can also configure the directory used by the disk cache by setting the BRAINTRUST_PROMPT_CACHE_DIR environment variable.

import { OpenAI } from "openai";
import { initLogger, loadPrompt, wrapOpenAI } from "braintrust";
 
const logger = initLogger({ projectName: "your project name" });
 
// wrapOpenAI will make sure the client tracks usage of the prompt.
const client = wrapOpenAI(
  new OpenAI({
    apiKey: process.env.OPENAI_API_KEY,
  }),
);
 
async function runPrompt() {
  // Replace with your project name and slug
  const prompt = await loadPrompt({
    projectName: "your project name",
    slug: "your prompt slug",
    defaults: {
      // Parameters to use if not specified
      model: "gpt-3.5-turbo",
      temperature: 0.5,
    },
  });
 
  // Render with parameters
  return client.chat.completions.create(
    prompt.build({
      question: "1+1",
    }),
  );
}

If you need to use another model provider, then you can use the Braintrust proxy to access a wide range of models using the OpenAI format. You can also grab the messages and other parameters directly from the returned object to use a model library of your choice.
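The cache fallback described earlier in this section can be sketched as follows. This is an illustration of the idea, not the SDK's implementation: the TwoLayerPromptCache class, the loadWithFallback function, and the fetchFromServer callback are hypothetical, plain Maps stand in for the memory and disk layers, size limits are omitted, and the sketch is synchronous for brevity. Checking memory before disk, and refreshing the cache after a successful fetch, are assumptions.

```typescript
// Illustrative only: not the SDK's implementation. Maps stand in for the
// memory and disk layers, and fetchFromServer is a hypothetical callback.
type CachedPrompt = { slug: string; version: string };

class TwoLayerPromptCache {
  private memory = new Map<string, CachedPrompt>();
  private disk = new Map<string, CachedPrompt>(); // stand-in for files on disk

  set(key: string, prompt: CachedPrompt): void {
    this.memory.set(key, prompt);
    this.disk.set(key, prompt);
  }

  // Assumption: the memory layer is consulted before the disk layer.
  get(key: string): CachedPrompt | undefined {
    return this.memory.get(key) ?? this.disk.get(key);
  }
}

function loadWithFallback(
  key: string,
  fetchFromServer: () => CachedPrompt,
  cache: TwoLayerPromptCache,
): CachedPrompt {
  try {
    // Prefer the server; remember the result for later failures.
    const fresh = fetchFromServer();
    cache.set(key, fresh);
    return fresh;
  } catch (error) {
    // The server is unreachable: fall back to the cache if it has the prompt.
    const cached = cache.get(key);
    if (cached === undefined) {
      throw error;
    }
    return cached;
  }
}

const promptCache = new TwoLayerPromptCache();
const online = loadWithFallback(
  "summarizer",
  () => ({ slug: "summarizer", version: "5878bd218351fb8e" }),
  promptCache,
);
const offline = loadWithFallback(
  "summarizer",
  () => {
    throw new Error("network down");
  },
  promptCache,
);
console.log(online.version === offline.version); // true
```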

Pinning a specific version

To pin a specific version of a prompt, use the loadPrompt()/load_prompt() function with the version identifier.

const prompt = await loadPrompt({
  projectName: "your project name",
  slug: "your prompt slug",
  version: "5878bd218351fb8e",
});

Pulling prompts locally

You can also download prompts to your local filesystem and ensure a specific version is used via version control. Use the pull command to:

  • Download prompts to public projects so others can use them
  • Pin your production environment to a specific version without running them through Braintrust on the request path
  • Review changes to prompts in pull requests

$ npx braintrust pull --help
usage: cli.js pull [-h] [--output-dir OUTPUT_DIR] [--project-name PROJECT_NAME] [--project-id PROJECT_ID] [--id ID] [--slug SLUG] [--version VERSION] [--force]
 
optional arguments:
  -h, --help            show this help message and exit
  --output-dir OUTPUT_DIR
                        The directory to output the pulled resources to. If not specified, the current directory is used.
  --project-name PROJECT_NAME
                        The name of the project to pull from. If not specified, all projects are pulled.
  --project-id PROJECT_ID
                        The id of the project to pull from. If not specified, all projects are pulled.
  --id ID               The id of a specific function to pull.
  --slug SLUG           The slug of a specific function to pull.
  --version VERSION     The version to pull. Will pull the latest version of each prompt that is at or before this version.
  --force               Overwrite local files if they have uncommitted changes.

Currently, braintrust pull only supports TypeScript.

When you run braintrust pull, you can specify a project name, prompt slug, or version to pull. If you don't specify any of these, all prompts across projects will be pulled into a separate file per project. For example, if you have a project named Summary, running

$ npx braintrust pull --project-name "Summary"

will generate the following file:

summary.ts
// This file was automatically generated by braintrust pull. You can
// generate it again by running:
//  $ braintrust pull --project-name "Summary"
// Feel free to edit this file manually, but once you do, you should make sure to
// sync your changes with Braintrust by running:
//  $ braintrust push "braintrust/summary.ts"
 
import braintrust from "braintrust";
 
const project = braintrust.projects.create({
  name: "Summary",
});
 
export const summaryBot = project.prompts.create({
  name: "Summary bot",
  slug: "summary-bot",
  model: "gpt-4o",
  messages: [
    { content: "Summarize the following passage.", role: "system" },
    { content: "{{content}}", role: "user" },
  ],
});

To pin your production environment to a specific version, you can run braintrust pull with the --version flag.

Using a pulled prompt

The prompts.create function generates the same Prompt object as the loadPrompt function. This means you can use a pulled prompt in the same way you would use a normal prompt, e.g. by running prompt.build() and passing the result to a client.chat.completions.create() call.

Pushing prompts

Just like with tools, you can push prompts to Braintrust using the push command. Simply change the prompt definition, and then run braintrust push from the command line. Braintrust automatically generates a new version for each pushed prompt.

$ npx braintrust push --help
usage: cli.js push [-h] [--api-key API_KEY] [--org-name ORG_NAME]
                   [--app-url APP_URL] [--env-file ENV_FILE]
                   [--terminate-on-failure] [--tsconfig TSCONFIG]
                   [--if-exists {error,replace,ignore}]
                   [files ...]
 
positional arguments:
  files                 A list of files or directories containing functions to
                        bundle. If no files are specified, the current
                        directory is used.
 
optional arguments:
  -h, --help            show this help message and exit
  --api-key API_KEY     Specify a braintrust api key. If the parameter is not
                        specified, the BRAINTRUST_API_KEY environment variable
                        will be used.
  --org-name ORG_NAME   The name of a specific organization to connect to.
                        This is useful if you belong to multiple.
  --app-url APP_URL     Specify a custom braintrust app url. Defaults to
                        https://www.braintrust.dev. This is only necessary if
                        you are using an experimental version of Braintrust
  --env-file ENV_FILE   A path to a .env file containing environment variables
                        to load (via dotenv).
  --terminate-on-failure
                        If provided, terminates on a failing eval, instead of
                        the default (moving onto the next one).
  --tsconfig TSCONFIG   Specify a custom tsconfig.json file to use.
  --if-exists {error,replace,ignore}
                        What to do if a function with the same slug already
                        exists. 'error' will cause an error and abort.
                        'replace' will overwrite the existing function.
                        'ignore' will ignore the push for this function and
                        continue.

When you run npx braintrust push, you can specify one or more files or directories to push. If you specify a directory, all .ts files under that directory are pushed.

See the example in the guide to tools for more details.

Deployment strategies

It is often useful to use different versions of a prompt in different environments. For example, you might want to use the latest version locally and in staging, but pin a specific version in production. This is simple to set up by conditionally passing a version to loadPrompt()/load_prompt() based on the environment.

const prompt = await loadPrompt({
  projectName: "your project name",
  slug: "your prompt slug",
  version:
    process.env.NODE_ENV === "production" ? "5878bd218351fb8e" : undefined,
});
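If you have more than two environments, one option is to keep the pinned versions in a small lookup table. The sketch below assumes that approach; the PINNED_VERSIONS table and the resolvePromptVersion helper are hypothetical names, and the environment names are examples.

```typescript
// Hypothetical mapping from environment name to pinned prompt version.
// undefined means "load the latest version".
const PINNED_VERSIONS: Record<string, string | undefined> = {
  production: "5878bd218351fb8e",
  staging: undefined,
  development: undefined,
};

// Unknown or unset environments fall back to the latest version.
function resolvePromptVersion(env: string | undefined): string | undefined {
  return env === undefined ? undefined : PINNED_VERSIONS[env];
}

console.log(resolvePromptVersion("production")); // 5878bd218351fb8e
console.log(resolvePromptVersion("staging")); // undefined
```

You could then pass resolvePromptVersion(process.env.NODE_ENV) as the version argument to loadPrompt.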

Chat vs. completion format

In Python, prompt.build() returns a dictionary with chat or completion parameters, depending on the prompt type. In TypeScript, however, prompt.build() accepts an additional parameter (flavor) to specify the format. This allows prompt.build to be used in a more type-safe manner. When you specify a flavor, the SDK also validates that the parameters are correct for that format.

const chatParams = prompt.build(
  {
    question: "1+1",
  },
  {
    // This is the default
    flavor: "chat",
  },
);
 
const completionParams = prompt.build(
  {
    question: "1+1",
  },
  {
    // Pass "completion" to get completion-shaped parameters
    flavor: "completion",
  },
);

Opening from traces

When you use a prompt in your code, Braintrust automatically links spans to the prompt used to generate them. This allows you to click to open a span in the playground, and see the prompt that generated it alongside the input variables. You can even test and save a new version of the prompt directly from the playground.

Open from traces

This workflow lets you debug, iterate on, and publish changes to your prompts directly within Braintrust. And because Braintrust allows you to load the latest prompt, a specific version, or even a version-controlled artifact, you have a lot of control over how these updates propagate into your production systems.

Using the API

The full lifecycle of prompts - creating, retrieving, modifying, etc. - can be managed through the REST API. See the API docs for more details.