braintrust
An isomorphic JS library for working with Braintrust. This library contains functionality for running evaluations, logging completions, loading and invoking functions, and more.
braintrust is distributed as a library on NPM. It is also open source and available on GitHub.
Quickstart
Install the library with npm (or yarn).
Then, create a file like hello.eval.ts with the following content:
Finally, run the script with npx braintrust eval hello.eval.ts.
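The listing for hello.eval.ts is not reproduced in this reference. As a rough sketch of what such a file usually contains, the block below defines the three pieces an evaluator is built from: a data function, a task, and a scorer. The names data, task, and exactMatch, the sample rows, and the "Say Hi Bot" eval name are illustrative assumptions. The Eval call itself is left as a comment so that the block runs without the library installed.

```typescript
// Illustrative building blocks for an eval file. All names and rows here are
// assumptions made for the sketch, not taken from the library.
type Case = { input: string; expected: string };

// Dataset: each row pairs an input with the output we hope to see.
const data = (): Case[] => [
  { input: "Foo", expected: "Hi Foo" },
  { input: "Bar", expected: "Hello Bar" },
];

// Task: the function under evaluation (a stand-in for an LLM call).
const task = (input: string): string => "Hi " + input;

// Scorer: receives the case fields plus the task output, returns a named score.
const exactMatch = (args: { output: string; expected?: string }) => ({
  name: "exact_match",
  score: args.output === args.expected ? 1 : 0,
});

// With the library installed, the file would end with something like:
//   import { Eval } from "braintrust";
//   Eval("Say Hi Bot", { data, task, scores: [exactMatch] });

// Run the same loop by hand to see what the scorer produces.
const quickstartScores = data().map(
  (c) => exactMatch({ output: task(c.input), expected: c.expected }).score,
);
console.log(quickstartScores);
```

The first row scores 1 because the task output matches the expected string exactly; the second scores 0 because "Hi Bar" differs from "Hello Bar".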
Classes
- Attachment
- BraintrustState
- BraintrustStream
- CodeFunction
- CodePrompt
- Dataset
- Experiment
- FailedHTTPResponse
- LazyValue
- Logger
- NoopSpan
- Project
- Prompt
- PromptBuilder
- ReadonlyExperiment
- SpanImpl
- ToolBuilder
Interfaces
- AttachmentParams
- BackgroundLoggerOpts
- DataSummary
- DatasetSummary
- Evaluator
- ExperimentSummary
- Exportable
- InvokeFunctionArgs
- LogOptions
- LoginOptions
- MetricSummary
- ObjectMetadata
- ParentExperimentIds
- ParentProjectLogIds
- ReporterBody
- ScoreSummary
- Span
Namespaces
Functions
BaseExperiment
▸ BaseExperiment<Input, Expected, Metadata>(options?): BaseExperiment<Input, Expected, Metadata>
Use this to specify that the dataset should actually be the data from a previous (base) experiment. If you do not specify a name, Braintrust will automatically figure out the best base experiment to use based on your git history (or fall back to timestamps).
Type parameters
Name | Type |
---|---|
Input | unknown |
Expected | unknown |
Metadata | extends BaseMetadata = void |
Parameters
Name | Type | Description |
---|---|---|
options | Object | |
options.name? | string | The name of the base experiment to use. If unspecified, Braintrust will automatically figure out the best base using your git history (or fall back to timestamps). |
Returns
BaseExperiment<Input, Expected, Metadata>
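Judging by the BaseExperiment type alias later in this reference, the return value is a small tagged object. The sketch below mimics that shape with a local stand-in; makeBaseExperiment and the sample experiment name are hypothetical, and the real function may do more than this.

```typescript
// Local stand-in mirroring the documented BaseExperiment type alias
// (fields _type and name?). This is not the library's implementation.
type BaseExperimentRef = {
  _type: "BaseExperiment";
  name?: string;
};

// Hypothetical helper: builds the marker object from optional options.
function makeBaseExperiment(options: { name?: string } = {}): BaseExperimentRef {
  return { _type: "BaseExperiment", ...options };
}

// Explicit base experiment (the name is made up for illustration).
const pinnedBase = makeBaseExperiment({ name: "baseline-run" });
// No name given: per the description above, Braintrust would pick a base itself.
const automaticBase = makeBaseExperiment();

console.log(pinnedBase, automaticBase);
```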
Eval
▸ Eval<Input, Output, Expected, Metadata, EvalReport>(name, evaluator, reporterOrOpts?): Promise<EvalResultWithSummary<Input, Output, Expected, Metadata>>
Type parameters
Name | Type |
---|---|
Input | Input |
Output | Output |
Expected | void |
Metadata | extends BaseMetadata = void |
EvalReport | boolean |
Parameters
Name | Type |
---|---|
name | string |
evaluator | Evaluator <Input , Output , Expected , Metadata > |
reporterOrOpts? | string | ReporterDef <EvalReport > | EvalOptions <EvalReport > |
Returns
Promise<EvalResultWithSummary<Input, Output, Expected, Metadata>>
Reporter
▸ Reporter<EvalReport>(name, reporter): ReporterDef<EvalReport>
Type parameters
Name |
---|
EvalReport |
Parameters
Name | Type |
---|---|
name | string |
reporter | ReporterBody <EvalReport > |
Returns
ReporterDef<EvalReport>
buildLocalSummary
▸ buildLocalSummary(evaluator, results): ExperimentSummary
Parameters
Name | Type |
---|---|
evaluator | EvaluatorDef <any , any , any , any > |
results | EvalResult <any , any , any , any >[] |
Returns
ExperimentSummary
createFinalValuePassThroughStream
▸ createFinalValuePassThroughStream<T>(onFinal, onError): TransformStream<T, BraintrustStreamChunk>
Create a stream that passes through the final value of the stream. This is used to implement BraintrustStream.finalValue().
Type parameters
Name | Type |
---|---|
T | extends string | Uint8Array | { data : string ; type : "text_delta" } | { data : string ; type : "json_delta" } | { data : string ; type : "error" } | { data : { message : string ; stream : "stderr" | "stdout" } = sseConsoleEventDataSchema; type : "console" } | { data : { data : string ; event : "error" | "text_delta" | "json_delta" | "console" | "start" | "done" ; format : "code" | "global" | "llm" ; id : string ; name : string ; object_type : "prompt" | "tool" | "scorer" | "task" ; output_type : "completion" | "score" | "any" } = sseProgressEventDataSchema; type : "progress" } | { data : string ; type : "start" } | { data : string ; type : "done" } |
Parameters
Name | Type | Description |
---|---|---|
onFinal | (result : unknown ) => void | A function to call with the final value of the stream. |
onError | (error : unknown ) => void | - |
Returns
TransformStream<T, BraintrustStreamChunk>
A new stream that passes through the final value of the stream.
currentExperiment
▸ currentExperiment(options?): Experiment | undefined
Returns the currently-active experiment (set by init). Returns undefined if no current experiment has been set.
Parameters
Name | Type |
---|---|
options? | OptionalStateArg |
Returns
Experiment | undefined
currentLogger
▸ currentLogger<IsAsyncFlush>(options?): Logger<IsAsyncFlush> | undefined
Returns the currently-active logger (set by initLogger). Returns undefined if no current logger has been set.
Type parameters
Name | Type |
---|---|
IsAsyncFlush | extends boolean |
Parameters
Name | Type |
---|---|
options? | AsyncFlushArg <IsAsyncFlush > & OptionalStateArg |
Returns
Logger<IsAsyncFlush> | undefined
currentSpan
▸ currentSpan(options?): Span
Return the currently-active span for logging (set by one of the traced methods). If there is no active span, returns a no-op span object, which supports the same interface as spans but does no logging.
See Span for full details.
Parameters
Name | Type |
---|---|
options? | OptionalStateArg |
Returns
Span
devNullWritableStream
▸ devNullWritableStream(): WritableStream
Returns
WritableStream
flush
▸ flush(options?): Promise<void>
Flush any pending rows to the server.
Parameters
Name | Type |
---|---|
options? | OptionalStateArg |
Returns
Promise<void>
getSpanParentObject
▸ getSpanParentObject<IsAsyncFlush>(options?): Span | Experiment | Logger<IsAsyncFlush>
Mainly for internal use. Return the parent object for starting a span in a global context.
Type parameters
Name | Type |
---|---|
IsAsyncFlush | extends boolean |
Parameters
Name | Type |
---|---|
options? | AsyncFlushArg <IsAsyncFlush > & OptionalStateArg |
Returns
Span | Experiment | Logger<IsAsyncFlush>
init
▸ init<IsOpen>(options): InitializedExperiment<IsOpen>
Log in, and then initialize a new experiment in a specified project. If the project does not exist, it will be created.
Type parameters
Name | Type |
---|---|
IsOpen | extends boolean = false |
Parameters
Name | Type | Description |
---|---|---|
options | Readonly <FullInitOptions <IsOpen >> | Options for configuring init(). |
Returns
InitializedExperiment<IsOpen>
The newly created Experiment.
▸ init<IsOpen>(project, options?): InitializedExperiment<IsOpen>
Legacy form of init which accepts the project name as the first parameter, separately from the remaining options. See init(options) for full details.
Type parameters
Name | Type |
---|---|
IsOpen | extends boolean = false |
Parameters
Name | Type |
---|---|
project | string |
options? | Readonly <InitOptions <IsOpen >> |
Returns
InitializedExperiment<IsOpen>
initDataset
▸ initDataset<IsLegacyDataset>(options): Dataset<IsLegacyDataset>
Create a new dataset in a specified project. If the project does not exist, it will be created.
Type parameters
Name | Type |
---|---|
IsLegacyDataset | extends boolean = false |
Parameters
Name | Type | Description |
---|---|---|
options | Readonly <FullInitDatasetOptions <IsLegacyDataset >> | Options for configuring initDataset(). |
Returns
Dataset<IsLegacyDataset>
The newly created Dataset.
▸ initDataset<IsLegacyDataset>(project, options?): Dataset<IsLegacyDataset>
Legacy form of initDataset which accepts the project name as the first parameter, separately from the remaining options. See initDataset(options) for full details.
Type parameters
Name | Type |
---|---|
IsLegacyDataset | extends boolean = false |
Parameters
Name | Type |
---|---|
project | string |
options? | Readonly <InitDatasetOptions <IsLegacyDataset >> |
Returns
Dataset<IsLegacyDataset>
initExperiment
▸ initExperiment<IsOpen>(options): InitializedExperiment<IsOpen>
Alias for init(options).
Type parameters
Name | Type |
---|---|
IsOpen | extends boolean = false |
Parameters
Name | Type |
---|---|
options | Readonly <InitOptions <IsOpen >> |
Returns
InitializedExperiment<IsOpen>
▸ initExperiment<IsOpen>(project, options?): InitializedExperiment<IsOpen>
Alias for init(project, options).
Type parameters
Name | Type |
---|---|
IsOpen | extends boolean = false |
Parameters
Name | Type |
---|---|
project | string |
options? | Readonly <InitOptions <IsOpen >> |
Returns
InitializedExperiment<IsOpen>
initLogger
▸ initLogger<IsAsyncFlush>(options?): Logger<IsAsyncFlush>
Create a new logger in a specified project. If the project does not exist, it will be created.
Type parameters
Name | Type |
---|---|
IsAsyncFlush | extends boolean = false |
Parameters
Name | Type | Description |
---|---|---|
options? | Readonly<InitLoggerOptions<IsAsyncFlush>> | Additional options for configuring initLogger(). |
Returns
Logger<IsAsyncFlush>
The newly created Logger.
invoke
▸ invoke<Input, Output, Stream>(args): Promise<InvokeReturn<Stream, Output>>
Invoke a Braintrust function, returning a BraintrustStream or the value as a plain JavaScript object.
Type parameters
Name | Type |
---|---|
Input | Input |
Output | Output |
Stream | extends boolean = false |
Parameters
Name | Type | Description |
---|---|---|
args | InvokeFunctionArgs <Input , Output , Stream > & LoginOptions & { forceLogin? : boolean } | The arguments for the function (see InvokeFunctionArgs for more details). |
Returns
Promise<InvokeReturn<Stream, Output>>
The output of the function.
loadPrompt
▸ loadPrompt(options): Promise<Prompt<true, true>>
Load a prompt from the specified project.
Parameters
Name | Type | Description |
---|---|---|
options | LoadPromptOptions | Options for configuring loadPrompt(). |
Returns
Promise<Prompt<true, true>>
The prompt object.
Throws
If the prompt is not found.
Throws
If multiple prompts are found with the same slug in the same project (this should never happen).
Example
log
▸ log(event): string
Log a single event to the current experiment. The event will be batched and uploaded behind the scenes.
Parameters
Name | Type | Description |
---|---|---|
event | ExperimentLogFullArgs | The event to log. See Experiment.log for full details. |
Returns
string
The id of the logged event.
logError
▸ logError(span, error): void
Parameters
Name | Type |
---|---|
span | Span |
error | unknown |
Returns
void
login
▸ login(options?): Promise<BraintrustState>
Log in to Braintrust. This will prompt you for your API token, which you can find at https://www.braintrust.dev/app/token. This method is called automatically by init().
Parameters
Name | Type | Description |
---|---|---|
options | LoginOptions & { forceLogin? : boolean } | Options for configuring login(). |
Returns
Promise<BraintrustState>
loginToState
▸ loginToState(options?): Promise<BraintrustState>
Parameters
Name | Type |
---|---|
options | LoginOptions |
Returns
Promise<BraintrustState>
newId
▸ newId(): string
Returns
string
parseCachedHeader
▸ parseCachedHeader(value): number | undefined
Parameters
Name | Type |
---|---|
value | undefined | null | string |
Returns
number | undefined
permalink
▸ permalink(slug, opts?): Promise<string>
Format a permalink to the Braintrust application for viewing the span represented by the provided slug.
Links can be generated at any time, but they will only become viewable after the span and its root have been flushed to the server and ingested.
If you have a Span object, use Span.permalink instead.
Parameters
Name | Type | Description |
---|---|---|
slug | string | The identifier generated from Span.export. |
opts? | Object | Optional arguments. |
opts.appUrl? | string | The app URL to use. If not provided, the app URL will be inferred from the state. |
opts.orgName? | string | The org name to use. If not provided, the org name will be inferred from the state. |
opts.state? | BraintrustState | The login state to use. If not provided, the global state will be used. |
Returns
Promise<string>
A permalink to the exported span.
renderMessage
▸ renderMessage<T>(render, message): T
Type parameters
Name | Type |
---|---|
T | extends { content : string ; name? : string ; role : "system" } | { content : {} ; name? : string ; role : "user" } | { content? : null | string ; function_call? : { arguments : string ; name : string } ; name? : string ; role : "assistant" ; tool_calls? : { function : { arguments : string ; name : string } ; id : string ; type : "function" }[] } | { content : string ; role : "tool" ; tool_call_id : string } | { content : string ; name : string ; role : "function" } | { content? : null | string ; role : "model" } |
Parameters
Name | Type |
---|---|
render | (template : string ) => string |
message | T |
Returns
T
reportFailures
▸ reportFailures<Input, Output, Expected, Metadata>(evaluator, failingResults, «destructured»): void
Type parameters
Name | Type |
---|---|
Input | Input |
Output | Output |
Expected | Expected |
Metadata | extends BaseMetadata |
Parameters
Name | Type |
---|---|
evaluator | EvaluatorDef <Input , Output , Expected , Metadata > |
failingResults | EvalResult <Input , Output , Expected , Metadata >[] |
«destructured» | ReporterOpts |
Returns
void
setFetch
▸ setFetch(fetch): void
Set the fetch implementation to use for requests. You can specify it here, or when you call login.
Parameters
Name | Type | Description |
---|---|---|
fetch | { (input: URL | RequestInfo, init?: RequestInit): Promise<Response>; (input: string | URL | Request, init?: RequestInit): Promise<Response> } | MDN Reference |
Returns
void
spanComponentsToObjectId
▸ spanComponentsToObjectId(«destructured»): Promise<string>
Parameters
Name | Type |
---|---|
«destructured» | Object |
› components | SpanComponentsV3 |
› state? | BraintrustState |
Returns
Promise<string>
startSpan
▸ startSpan<IsAsyncFlush>(args?): Span
Lower-level alternative to traced. This allows you to start a span yourself, and can be useful in situations where you cannot use callbacks. However, spans started with startSpan will not be marked as the "current span", so currentSpan() and traced() will be no-ops. If you want to mark a span as current, use traced instead.
See traced for full details.
Type parameters
Name | Type |
---|---|
IsAsyncFlush | extends boolean = false |
Parameters
Name | Type |
---|---|
args? | StartSpanArgs & AsyncFlushArg <IsAsyncFlush > & OptionalStateArg |
Returns
Span
summarize
▸ summarize(options?): Promise<ExperimentSummary>
Summarize the current experiment, including the scores (compared to the closest reference experiment) and metadata.
Parameters
Name | Type | Description |
---|---|---|
options | Object | Options for summarizing the experiment. |
options.comparisonExperimentId? | string | The experiment to compare against. If undefined, the most recent experiment on the origin's main branch will be used. |
options.summarizeScores? | boolean | Whether to summarize the scores. If false, only the metadata will be returned. |
Returns
Promise<ExperimentSummary>
A summary of the experiment, including the scores (compared to the closest reference experiment) and metadata.
traceable
▸ traceable<F, IsAsyncFlush>(fn, args?): IsAsyncFlush extends false ? (...args: Parameters<F>) => Promise<Awaited<ReturnType<F>>> : F
A synonym for wrapTraced. If you're porting from systems that use traceable, you can use this to make your codebase more consistent.
Type parameters
Name | Type |
---|---|
F | extends (...args : any []) => any |
IsAsyncFlush | extends boolean = false |
Parameters
Name | Type |
---|---|
fn | F |
args? | StartSpanArgs & SetCurrentArg & AsyncFlushArg <IsAsyncFlush > |
Returns
IsAsyncFlush extends false ? (...args: Parameters<F>) => Promise<Awaited<ReturnType<F>>> : F
traced
▸ traced<IsAsyncFlush, R>(callback, args?): PromiseUnless<IsAsyncFlush, R>
Top-level function for starting a span. It checks the following (in precedence order):
- Currently-active span
- Currently-active experiment
- Currently-active logger
and creates a span under the first one that is active. Alternatively, if parent is specified, it creates a span under the specified parent row. If none of these are active, it returns a no-op span object.
See Span.traced for full details.
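The precedence order can be pictured with a toy lookup. This is only a model of the rule stated above: the ActiveState shape and resolveSpanParent are hypothetical, the explicit parent argument is left out, and the library's internals may differ.

```typescript
// A toy model of the lookup order: span first, then experiment, then logger,
// and a no-op when nothing is active. Not the library's implementation.
type ParentKind = "span" | "experiment" | "logger" | "noop";

interface ActiveState {
  span?: string;
  experiment?: string;
  logger?: string;
}

function resolveSpanParent(active: ActiveState): { kind: ParentKind; id?: string } {
  if (active.span !== undefined) return { kind: "span", id: active.span };
  if (active.experiment !== undefined) return { kind: "experiment", id: active.experiment };
  if (active.logger !== undefined) return { kind: "logger", id: active.logger };
  return { kind: "noop" };
}

// All three active: the span wins.
const underSpan = resolveSpanParent({ span: "s1", experiment: "e1", logger: "l1" });
// Only a logger active: the new span goes under the logger.
const underLogger = resolveSpanParent({ logger: "l1" });
// Nothing active: the caller would receive a no-op span.
const nothingActive = resolveSpanParent({});

console.log(underSpan.kind, underLogger.kind, nothingActive.kind);
```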
Type parameters
Name | Type |
---|---|
IsAsyncFlush | extends boolean = false |
R | void |
Parameters
Name | Type |
---|---|
callback | (span : Span ) => R |
args? | StartSpanArgs & SetCurrentArg & AsyncFlushArg <IsAsyncFlush > & OptionalStateArg |
Returns
PromiseUnless<IsAsyncFlush, R>
updateSpan
▸ updateSpan(«destructured»): void
Update a span using the output of span.export(). It is important that you only resume updating a span once the original span has been fully written and flushed, since otherwise updates to the span may conflict with the original span.
Parameters
Name | Type |
---|---|
«destructured» | { exported : string } & Omit <Partial <ExperimentEvent >, "id" > & OptionalStateArg |
Returns
void
withCurrent
▸ withCurrent<R>(span, callback, state?): R
Runs the provided callback with the span as the current span.
Type parameters
Name |
---|
R |
Parameters
Name | Type | Default value |
---|---|---|
span | Span | undefined |
callback | (span : Span ) => R | undefined |
state | BraintrustState | _globalState |
Returns
R
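To make "runs the callback with the span as the current span" concrete, here is a minimal synchronous model. SketchSpan, NOOP_SKETCH, currentSpanSketch, and withCurrentSketch are hypothetical stand-ins written for this illustration; the library keeps its own state and presumably also handles asynchronous callbacks, which this sketch does not.

```typescript
// Toy model of "current span" scoping. Synchronous only; it does not track
// async continuations and is not the library's implementation.
interface SketchSpan {
  label: string;
}

const NOOP_SKETCH: SketchSpan = { label: "noop" };
let activeSpan: SketchSpan = NOOP_SKETCH;

// Mirrors the idea of currentSpan(): the active span, or a no-op placeholder.
function currentSpanSketch(): SketchSpan {
  return activeSpan;
}

function withCurrentSketch<R>(span: SketchSpan, callback: (span: SketchSpan) => R): R {
  const previous = activeSpan;
  activeSpan = span;
  try {
    return callback(span);
  } finally {
    // Restore the previous span even if the callback throws.
    activeSpan = previous;
  }
}

const seenInside = withCurrentSketch({ label: "request" }, () => currentSpanSketch().label);
const seenAfter = currentSpanSketch().label;
console.log(seenInside, seenAfter);
```

Inside the callback the lookup returns the span that was passed in; once the callback returns, the previous value is restored.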
withDataset
▸ withDataset<R, IsLegacyDataset>(project, callback, options?): R
Type parameters
Name | Type |
---|---|
R | R |
IsLegacyDataset | extends boolean = false |
Parameters
Name | Type |
---|---|
project | string |
callback | (dataset : Dataset <IsLegacyDataset >) => R |
options | Readonly <InitDatasetOptions <IsLegacyDataset >> |
Returns
R
Deprecated
Use initDataset instead.
withExperiment
▸ withExperiment<R>(project, callback, options?): R
Type parameters
Name |
---|
R |
Parameters
Name | Type |
---|---|
project | string |
callback | (experiment : Experiment ) => R |
options | Readonly <LoginOptions & { forceLogin? : boolean } & { baseExperiment? : string ; baseExperimentId? : string ; dataset? : AnyDataset ; description? : string ; experiment? : string ; gitMetadataSettings? : { collect : "some" | "none" | "all" ; fields? : ("dirty" | "tag" | "commit" | "branch" | "author_name" | "author_email" | "commit_message" | "commit_time" | "git_diff" )[] } ; isPublic? : boolean ; metadata? : Record <string , unknown > ; projectId? : string ; repoInfo? : { author_email? : null | string ; author_name? : null | string ; branch? : null | string ; commit? : null | string ; commit_message? : null | string ; commit_time? : null | string ; dirty? : null | boolean ; git_diff? : null | string ; tag? : null | string } ; setCurrent? : boolean ; state? : BraintrustState ; update? : boolean } & InitOpenOption <false > & SetCurrentArg > |
Returns
R
Deprecated
Use init instead.
withLogger
▸ withLogger<IsAsyncFlush, R>(callback, options?): R
Type parameters
Name | Type |
---|---|
IsAsyncFlush | extends boolean = false |
R | void |
Parameters
Name | Type |
---|---|
callback | (logger : Logger <IsAsyncFlush >) => R |
options | Readonly <LoginOptions & { forceLogin? : boolean } & { projectId? : string ; projectName? : string ; setCurrent? : boolean ; state? : BraintrustState } & AsyncFlushArg <IsAsyncFlush > & SetCurrentArg > |
Returns
R
Deprecated
Use initLogger instead.
wrapAISDKModel
▸ wrapAISDKModel<T>(model): T
Wrap an ai-sdk model (created with .chat(), .completion(), etc.) to add tracing. If Braintrust is not configured, this is a no-op.
Type parameters
Name | Type |
---|---|
T | extends object |
Parameters
Name | Type |
---|---|
model | T |
Returns
T
The wrapped object.
wrapOpenAI
▸ wrapOpenAI<T>(openai): T
Wrap an OpenAI object (created with new OpenAI(...)) to add tracing. If Braintrust is not configured, this is a no-op.
Currently, this only supports the v4 API.
Type parameters
Name | Type |
---|---|
T | extends object |
Parameters
Name | Type |
---|---|
openai | T |
Returns
T
The wrapped OpenAI object.
wrapOpenAIv4
▸ wrapOpenAIv4<T>(openai): T
Type parameters
Name | Type |
---|---|
T | extends OpenAILike |
Parameters
Name | Type |
---|---|
openai | T |
Returns
T
wrapTraced
▸ wrapTraced<F, IsAsyncFlush>(fn, args?): IsAsyncFlush extends false ? (...args: Parameters<F>) => Promise<Awaited<ReturnType<F>>> : F
Wrap a function with traced, using the arguments as input and return value as output.
Any functions wrapped this way will automatically be traced, similar to the @traced decorator in Python. If you want to correctly propagate the function's name and define it in one go, then you can do so like this:
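The original listing is not reproduced in this reference. The sketch below shows the naming pattern with a simplified local stand-in, wrapTracedSketch, which records calls in an array instead of sending spans to Braintrust. The stand-in, the TraceRecord shape, and the body of myFunc are assumptions, and unlike the documented return type the stand-in stays synchronous. Passing a named function expression lets the wrapper read fn.name while the result is assigned to a constant of the same name.

```typescript
// Simplified, synchronous stand-in for the wrapping pattern. The real
// wrapTraced logs spans to Braintrust; this sketch only records calls locally.
interface TraceRecord {
  spanName: string;
  input: unknown[];
  output: unknown;
}

const recorded: TraceRecord[] = [];

function wrapTracedSketch<A extends unknown[], R>(fn: (...args: A) => R): (...args: A) => R {
  return (...args: A): R => {
    const output = fn(...args);
    // fn.name is the name of the function expression that was passed in.
    recorded.push({ spanName: fn.name, input: args, output });
    return output;
  };
}

// Naming the function expression propagates "myFunc" to the wrapper.
const myFunc = wrapTracedSketch(function myFunc(who: string): string {
  return "Hi " + who;
});

const greeting = myFunc("Foo");
console.log(greeting, recorded.map((r) => r.spanName));
```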
Now, any calls to myFunc will be traced, and the input and output will be logged automatically.
If tracing is inactive, i.e. there is no active logger or experiment, it's just a no-op.
Type parameters
Name | Type |
---|---|
F | extends (...args : any []) => any |
IsAsyncFlush | extends boolean = false |
Parameters
Name | Type | Description |
---|---|---|
fn | F | The function to wrap. |
args? | StartSpanArgs & SetCurrentArg & AsyncFlushArg <IsAsyncFlush > | Span-level arguments (e.g. a custom name or type) to pass to traced . |
Returns
IsAsyncFlush extends false ? (...args: Parameters<F>) => Promise<Awaited<ReturnType<F>>> : F
The wrapped function.
Type Aliases
AnyDataset
Ƭ AnyDataset: Dataset<boolean>
BaseExperiment
Ƭ BaseExperiment<Input, Expected, Metadata>: Object
Type parameters
Name | Type |
---|---|
Input | Input |
Expected | Expected |
Metadata | extends BaseMetadata = DefaultMetadataType |
Type declaration
Name | Type |
---|---|
_phantom? | [Input , Expected , Metadata ] |
_type | "BaseExperiment" |
name? | string |
BaseMetadata
Ƭ BaseMetadata: Record<string, unknown> | void
BraintrustStreamChunk
Ƭ BraintrustStreamChunk: z.infer<typeof braintrustStreamChunkSchema>
A chunk of data from a Braintrust stream. Each chunk type matches an SSE event type.
ChatPrompt
Ƭ ChatPrompt: Object
Type declaration
Name | Type |
---|---|
messages | OpenAIMessage [] |
tools? | any [] |
CommentEvent
Ƭ CommentEvent: IdField & { _audit_metadata?: Record<string, unknown>; _audit_source: Source; comment: { text: string }; created: string; origin: { id: string } } & (ParentExperimentIds | ParentProjectLogIds)
CompiledPrompt
Ƭ CompiledPrompt<Flavor>: CompiledPromptParams & { span_info?: { metadata: { prompt: { id: string; project_id: string; variables: Record<string, unknown>; version: string } }; name?: string; spanAttributes?: Record<any, any> } } & (Flavor extends "chat" ? ChatPrompt : Flavor extends "completion" ? CompletionPrompt : {})
Type parameters
Name | Type |
---|---|
Flavor | extends "chat" | "completion" |
CompiledPromptParams
Ƭ CompiledPromptParams: Omit<NonNullable<PromptData["options"]>["params"], "use_cache"> & { model: NonNullable<NonNullable<PromptData["options"]>["model"]> }
CompletionPrompt
Ƭ CompletionPrompt: Object
Type declaration
Name | Type |
---|---|
prompt | string |
CreateProjectOpts
Ƭ CreateProjectOpts: NameOrId
DatasetRecord
Ƭ DatasetRecord<IsLegacyDataset>: IsLegacyDataset extends true ? LegacyDatasetRecord : NewDatasetRecord
Type parameters
Name | Type |
---|---|
IsLegacyDataset | extends boolean = typeof DEFAULT_IS_LEGACY_DATASET |
DefaultMetadataType
Ƭ DefaultMetadataType: void
DefaultPromptArgs
Ƭ DefaultPromptArgs: Partial<CompiledPromptParams & AnyModelParam & ChatPrompt & CompletionPrompt>
EndSpanArgs
Ƭ EndSpanArgs: Object
Type declaration
Name | Type |
---|---|
endTime? | number |
EvalCase
Ƭ EvalCase<Input, Expected, Metadata>: { _xact_id?: TransactionId; id?: string; input: Input; tags?: string[] } & (Expected extends void ? object : { expected: Expected }) & (Metadata extends void ? object : { metadata: Metadata })
Type parameters
Name |
---|
Input |
Expected |
Metadata |
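The conditional parts of EvalCase decide whether expected and metadata are required. The following sketch restates the alias locally to show the effect; EvalCaseSketch is a copy made for illustration, TransactionId is assumed to be a string here, and the parentheses around each conditional are an assumption about the intended grouping.

```typescript
// Local re-statement of the EvalCase alias for illustration only.
type TransactionId = string; // assumption: the real type is not shown in this reference

type EvalCaseSketch<Input, Expected, Metadata> = {
  input: Input;
  tags?: string[];
  id?: string;
  _xact_id?: TransactionId;
} & (Expected extends void ? object : { expected: Expected }) &
  (Metadata extends void ? object : { metadata: Metadata });

// Expected and Metadata are void: only `input` is required.
const inputOnly: EvalCaseSketch<string, void, void> = { input: "2 + 2" };

// Non-void Expected and Metadata make those fields mandatory.
const fullCase: EvalCaseSketch<string, string, { topic: string }> = {
  input: "2 + 2",
  expected: "4",
  metadata: { topic: "arithmetic" },
  tags: ["math"],
};

console.log(Object.keys(inputOnly), fullCase.expected, fullCase.metadata.topic);
```

Omitting expected or metadata from fullCase would be a compile-time error, while inputOnly type-checks with input alone.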
EvalResult
Ƭ EvalResult<Input, Output, Expected, Metadata>: EvalCase<Input, Expected, Metadata> & { error: unknown; output: Output; scores: Record<string, number | null> }
Type parameters
Name | Type |
---|---|
Input | Input |
Output | Output |
Expected | Expected |
Metadata | extends BaseMetadata = DefaultMetadataType |
EvalScorer
Ƭ EvalScorer<Input, Output, Expected, Metadata>: (args: EvalScorerArgs<Input, Output, Expected, Metadata>) => OneOrMoreScores | Promise<OneOrMoreScores>
Type parameters
Name | Type |
---|---|
Input | Input |
Output | Output |
Expected | Expected |
Metadata | extends BaseMetadata = DefaultMetadataType |
Type declaration
▸ (args): OneOrMoreScores | Promise<OneOrMoreScores>
Parameters
Name | Type |
---|---|
args | EvalScorerArgs <Input , Output , Expected , Metadata > |
Returns
OneOrMoreScores | Promise<OneOrMoreScores>
EvalScorerArgs
Ƭ EvalScorerArgs<Input, Output, Expected, Metadata>: EvalCase<Input, Expected, Metadata> & { output: Output }
Type parameters
Name | Type |
---|---|
Input | Input |
Output | Output |
Expected | Expected |
Metadata | extends BaseMetadata = DefaultMetadataType |
EvalTask
Ƭ EvalTask<Input, Output>: ((input: Input, hooks: EvalHooks) => Promise<Output>) | ((input: Input, hooks: EvalHooks) => Output)
Type parameters
Name |
---|
Input |
Output |
EvaluatorDef
Ƭ EvaluatorDef<Input, Output, Expected, Metadata>: { evalName: string; projectName: string } & Evaluator<Input, Output, Expected, Metadata>
Type parameters
Name | Type |
---|---|
Input | Input |
Output | Output |
Expected | Expected |
Metadata | extends BaseMetadata = DefaultMetadataType |
EvaluatorFile
Ƭ EvaluatorFile: Object
Type declaration
Name | Type |
---|---|
evaluators | { [evalName: string] : { evaluator : EvaluatorDef <unknown , unknown , unknown , BaseMetadata > ; reporter? : ReporterDef <unknown > | string }; } |
functions | CodeFunction <unknown , unknown , GenericFunction <unknown , unknown >>[] |
prompts | CodePrompt [] |
reporters | { [reporterName: string] : ReporterDef <unknown >; } |
ExperimentLogFullArgs
Ƭ ExperimentLogFullArgs: Partial<Omit<OtherExperimentLogFields, "output" | "scores">> & Required<Pick<OtherExperimentLogFields, "output" | "scores">> & Partial<InputField> & Partial<IdField>
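Read as a whole, this alias makes output and scores mandatory and everything else optional. The sketch below reproduces the same composition over reduced local copies of the types; LogFieldsSketch includes only a few of the fields of OtherExperimentLogFields, and all the *Sketch names and sample values are hypothetical.

```typescript
// Reduced local copies of the documented types, kept short for illustration.
type LogFieldsSketch = {
  output: unknown;
  scores: Record<string, number | null>;
  expected: unknown;
  metadata: Record<string, unknown>;
  tags: string[];
};
type InputFieldSketch = { input: unknown };
type IdFieldSketch = { id: string };

// Same composition as ExperimentLogFullArgs, applied to the reduced types.
type FullArgsSketch = Partial<Omit<LogFieldsSketch, "output" | "scores">> &
  Required<Pick<LogFieldsSketch, "output" | "scores">> &
  Partial<InputFieldSketch> &
  Partial<IdFieldSketch>;

// Smallest valid event: just output and scores.
const minimalEvent: FullArgsSketch = { output: "Hi Foo", scores: { exact_match: 1 } };

// Optional fields can be added freely; a score may also be null.
const richEvent: FullArgsSketch = {
  input: "Foo",
  output: "Hi Foo",
  expected: "Hi Foo",
  scores: { exact_match: 1, fluency: null },
  tags: ["greeting"],
};

console.log(Object.keys(minimalEvent), Object.keys(richEvent).length);
```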
ExperimentLogPartialArgs
Ƭ ExperimentLogPartialArgs: Partial<OtherExperimentLogFields> & Partial<InputField>
FullInitOptions
Ƭ FullInitOptions<IsOpen>: { project?: string } & InitOptions<IsOpen>
Type parameters
Name | Type |
---|---|
IsOpen | extends boolean |
FullLoginOptions
Ƭ FullLoginOptions: LoginOptions & { forceLogin?: boolean }
IdField
Ƭ IdField: Object
Type declaration
Name | Type |
---|---|
id | string |
InitOptions
Ƭ InitOptions<IsOpen>: FullLoginOptions & { baseExperiment?: string; baseExperimentId?: string; dataset?: AnyDataset; description?: string; experiment?: string; gitMetadataSettings?: GitMetadataSettings; isPublic?: boolean; metadata?: Record<string, unknown>; projectId?: string; repoInfo?: RepoInfo; setCurrent?: boolean; state?: BraintrustState; update?: boolean } & InitOpenOption<IsOpen>
Type parameters
Name | Type |
---|---|
IsOpen | extends boolean |
InputField
Ƭ InputField: Object
Type declaration
Name | Type |
---|---|
input | unknown |
InvokeReturn
Ƭ InvokeReturn<Stream, Output>: Stream extends true ? BraintrustStream : Output
The return type of the invoke function. Conditionally returns a BraintrustStream if stream is true, otherwise returns the output of the function using the Zod schema's type if present.
Type parameters
Name | Type |
---|---|
Stream | extends boolean |
Output | Output |
LogCommentFullArgs
Ƭ LogCommentFullArgs: IdField & { _audit_metadata?: Record<string, unknown>; _audit_source: Source; comment: { text: string }; created: string; origin: { id: string } } & (ParentExperimentIds | ParentProjectLogIds)
LogFeedbackFullArgs
Ƭ LogFeedbackFullArgs: IdField & Partial<Omit<OtherExperimentLogFields, "output" | "metrics" | "datasetRecordId"> & { comment: string; source: Source }>
OtherExperimentLogFields
Ƭ OtherExperimentLogFields: Object
Type declaration
Name | Type |
---|---|
_async_scoring_control | AsyncScoringControl |
_merge_paths | string [][] |
_skip_async_scoring | boolean |
datasetRecordId | string |
error | unknown |
expected | unknown |
metadata | Record <string , unknown > |
metrics | Record <string , unknown > |
origin | z.infer <typeof objectReferenceSchema > |
output | unknown |
scores | Record <string , number | null > |
tags | string [] |
PromiseUnless
Ƭ PromiseUnless<B, R>: B extends true ? R : Promise<Awaited<R>>
Type parameters
Name |
---|
B |
R |
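A small sketch may help in reading this alias: when B is true the value comes back directly, otherwise it is wrapped in a promise. PromiseUnlessSketch restates the alias locally, and resolveUnless is a hypothetical helper written only to exercise it.

```typescript
// Local copy of the alias so the example is self-contained.
type PromiseUnlessSketch<B, R> = B extends true ? R : Promise<Awaited<R>>;

// Hypothetical helper: returns the value as-is when `sync` is true,
// otherwise wraps it in a resolved promise.
function resolveUnless<B extends boolean, R>(sync: B, value: R): PromiseUnlessSketch<B, R> {
  const result: unknown = sync ? value : Promise.resolve(value);
  return result as PromiseUnlessSketch<B, R>;
}

// B = true: the result type is the plain value.
const direct: number = resolveUnless(true, 42);
// B = false: the result type is a promise of the value.
const deferred: Promise<number> = resolveUnless(false, 42);

console.log(direct, deferred instanceof Promise);
```

In the traced signature, B is IsAsyncFlush, so the default of false yields a promise-returning call.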
PromptOpts
Ƭ PromptOpts<HasId, HasVersion>: Partial<Omit<BaseFnOpts, "name">> & { name: string } & (HasId extends true ? PromptId : Partial<PromptId>) & (HasVersion extends true ? PromptVersion : Partial<PromptVersion>) & PromptContents & { model: string; noTrace?: boolean; params?: ModelParams; tools?: (GenericCodeFunction | SavedFunctionId | ToolFunctionDefinition)[] }
Type parameters
Name | Type |
---|---|
HasId | extends boolean |
HasVersion | extends boolean |
PromptRowWithId
Ƭ PromptRowWithId<HasId, HasVersion>: Omit<PromptRow, "log_id" | "org_id" | "project_id" | "id" | "_xact_id"> & Partial<Pick<PromptRow, "project_id">> & (HasId extends true ? Pick<PromptRow, "id"> : Partial<Pick<PromptRow, "id">>) & (HasVersion extends true ? Pick<PromptRow, "_xact_id"> : Partial<Pick<PromptRow, "_xact_id">>)
Type parameters
Name | Type |
---|---|
HasId | extends boolean = true |
HasVersion | extends boolean = true |
SerializedBraintrustState
Ƭ SerializedBraintrustState: z.infer<typeof loginSchema>
SetCurrentArg
Ƭ SetCurrentArg: Object
Type declaration
Name | Type |
---|---|
setCurrent? | boolean |
SpanContext
Ƭ SpanContext: Object
Type declaration
Name | Type |
---|---|
NOOP_SPAN | typeof NOOP_SPAN |
currentSpan | typeof currentSpan |
startSpan | typeof startSpan |
withCurrent | typeof withCurrent |
StartSpanArgs
Ƭ StartSpanArgs: Object
Type declaration
Name | Type |
---|---|
event? | StartSpanEventArgs |
name? | string |
parent? | string |
propagatedEvent? | StartSpanEventArgs |
spanAttributes? | Record <any , any > |
startTime? | number |
type? | SpanType |
ToolFunctionDefinition
Ƭ ToolFunctionDefinition: z.infer<typeof toolFunctionDefinitionSchema>
ToolOpts
Ƭ ToolOpts<Params, Returns, Fn>: Partial<BaseFnOpts> & { handler: Fn } & Schema<Params, Returns>
Type parameters
Name | Type |
---|---|
Params | Params |
Returns | Returns |
Fn | extends GenericFunction <Params , Returns > |
WithTransactionId
Ƭ WithTransactionId<R>: R & { _xact_id: TransactionId }
Type parameters
Name |
---|
R |
Variables
LEGACY_CACHED_HEADER
• Const LEGACY_CACHED_HEADER: "x-cached"
NOOP_SPAN
• Const NOOP_SPAN: NoopSpan
X_CACHED_HEADER
• Const X_CACHED_HEADER: "x-bt-cached"
braintrustStreamChunkSchema
• Const braintrustStreamChunkSchema: ZodUnion<[ZodObject<{ data: ZodString; type: ZodLiteral<"text_delta"> }, "strip", ZodTypeAny, { data: string; type: "text_delta" }, { data: string; type: "text_delta" }>, ZodObject<{ data: ZodString; type: ZodLiteral<"json_delta"> }, "strip", ZodTypeAny, { data: string; type: "json_delta" }, { data: string; type: "json_delta" }>, ZodObject<{ data: ZodString; type: ZodLiteral<"error"> }, "strip", ZodTypeAny, { data: string; type: "error" }, { data: string; type: "error" }>]>
projects
• Const projects: ProjectBuilder
toolFunctionDefinitionSchema
• Const toolFunctionDefinitionSchema: ZodObject<{ function: ZodObject<{ description: ZodOptional<ZodString>; name: ZodString; parameters: ZodOptional<ZodRecord<ZodString, ZodUnknown>>; strict: ZodOptional<ZodBoolean> }, "strip", ZodTypeAny, { description?: string; name: string; parameters?: Record<string, unknown>; strict?: boolean }, { description?: string; name: string; parameters?: Record<string, unknown>; strict?: boolean }>; type: ZodLiteral<"function"> }, "strip", ZodTypeAny, { function: { name: string; description?: string | undefined; parameters?: Record<string, unknown> | undefined; strict?: boolean | undefined }; type: "function" }, { function: { name: string; description?: string | undefined; parameters?: Record<string, unknown> | undefined; strict?: boolean | undefined }; type: "function" }>