Class: Experiment

An experiment is a collection of logged events, such as model inputs and outputs, which represent a snapshot of your application at a particular point in time. An experiment is meant to capture more than just the model you use: it also includes the data you use to test, pre- and post-processing code, comparison metrics (scores), and any other metadata you want to include.

Experiments are associated with a project, and two experiments are meant to be easily comparable via their inputs. You can change the attributes of the experiments in a project (e.g. scoring functions) over time, simply by changing what you log.

You should not create Experiment objects directly. Instead, use the braintrust.init() method.

Hierarchy

  • ObjectFetcher<ExperimentEvent>

    Experiment

Implements

  • Exportable

Accessors

id

get id(): Promise<string>

Returns

Promise<string>

Overrides

ObjectFetcher.id


name

get name(): Promise<string>

Returns

Promise<string>


project

get project(): Promise<ObjectMetadata>

Returns

Promise<ObjectMetadata>
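
The three accessors above are promise-valued getters, so each one has to be awaited before its value can be used. The following is a minimal sketch rather than SDK code: `ExperimentIdentity` and `describeExperiment` are hypothetical names introduced here, and the sketch assumes that ObjectMetadata exposes at least `id` and `name` fields.

```typescript
// Hypothetical structural stand-in for the three accessors documented above.
// Assumption: ObjectMetadata exposes at least `id` and `name`.
interface ExperimentIdentity {
  readonly id: Promise<string>;
  readonly name: Promise<string>;
  readonly project: Promise<{ id: string; name: string }>;
}

// Resolve all three accessors concurrently and format a one-line label.
async function describeExperiment(exp: ExperimentIdentity): Promise<string> {
  const [id, name, project] = await Promise.all([exp.id, exp.name, exp.project]);
  return `${project.name}/${name} (${id})`;
}
```

Awaiting the accessors together with Promise.all avoids three sequential round trips if the underlying metadata is resolved lazily.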

Constructors

constructor

new Experiment(state, lazyMetadata, dataset?): Experiment

Parameters

Name          Type
state         BraintrustState
lazyMetadata  LazyValue<ProjectExperimentMetadata>
dataset?      AnyDataset

Returns

Experiment

Overrides

ObjectFetcher<ExperimentEvent>.constructor

Methods

[asyncIterator]

[asyncIterator](): AsyncIterator<WithTransactionId<ExperimentEvent>, any, undefined>

Returns

AsyncIterator<WithTransactionId<ExperimentEvent>, any, undefined>

Inherited from

ObjectFetcher.[asyncIterator]


clearCache

clearCache(): void

Returns

void

Inherited from

ObjectFetcher.clearCache


close

close(): Promise<string>

Returns

Promise<string>

Deprecated

This function is deprecated. You can simply remove it from your code.


export

export(): Promise<string>

Return a serialized representation of the experiment that can be used to start subspans in other places.

See Span.startSpan for more details.

Returns

Promise<string>

Implementation of

Exportable.export


fetch

fetch(): AsyncGenerator<WithTransactionId<ExperimentEvent>, any, unknown>

Returns

AsyncGenerator<WithTransactionId<ExperimentEvent>, any, unknown>

Inherited from

ObjectFetcher.fetch
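
Since fetch() returns an async generator, events can be consumed one at a time with `for await` instead of being loaded all at once. The sketch below illustrates this under stated assumptions: `ScoredEvent`, `FetchableExperiment`, and `meanScore` are hypothetical names, and each fetched event is assumed to carry an optional `scores` map.

```typescript
// Hypothetical stand-in: only the fields this sketch reads from each event.
// Assumption: fetched events carry an optional `scores` map.
interface ScoredEvent {
  id: string;
  scores?: Record<string, number | null>;
}

// Hypothetical stand-in for the fetch() method documented above.
interface FetchableExperiment {
  fetch(): AsyncGenerator<ScoredEvent, any, unknown>;
}

// Stream every event and compute the mean of one named score.
// Returns null when no event has a numeric value for that score.
async function meanScore(
  exp: FetchableExperiment,
  scoreName: string,
): Promise<number | null> {
  let total = 0;
  let count = 0;
  for await (const event of exp.fetch()) {
    const value = event.scores?.[scoreName];
    if (typeof value === "number") {
      total += value;
      count += 1;
    }
  }
  return count === 0 ? null : total / count;
}
```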


fetchBaseExperiment

fetchBaseExperiment(): Promise<null | { id: any ; name: any }>

Returns

Promise<null | { id: any ; name: any }>


fetchedData

fetchedData(): Promise<WithTransactionId<ExperimentEvent>[]>

Returns

Promise<WithTransactionId<ExperimentEvent>[]>

Inherited from

ObjectFetcher.fetchedData


flush

flush(): Promise<void>

Flush any pending rows to the server.

Returns

Promise<void>


getState

getState(): Promise<BraintrustState>

Returns

Promise<BraintrustState>

Overrides

ObjectFetcher.getState


log

log(event, options?): string

Log a single event to the experiment. The event will be batched and uploaded behind the scenes.

Parameters

Name                               Type                             Description
event                              Readonly<ExperimentLogFullArgs>  The event to log.
options?                           Object                           Additional logging options.
options.allowConcurrentWithSpans?  boolean                          In rare cases where you need to log at the top level separately from spans on the experiment elsewhere, set this to true.

Returns

string

The id of the logged event.
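
As an illustration of a typical call, the sketch below scores one test case by exact match and logs it. `LoggableExperiment` and `logExactMatch` are hypothetical names, and the event shape is an assumption: it presumes ExperimentLogFullArgs accepts `input`, `output`, `expected`, and `scores` fields.

```typescript
// Hypothetical stand-in for the log() signature documented above.
// Assumption: the event accepts `input`, `output`, `expected`, and `scores`.
interface LoggableExperiment {
  log(
    event: Readonly<{
      input: unknown;
      output: unknown;
      expected?: unknown;
      scores: Record<string, number>;
    }>,
  ): string;
}

// Score one test case by exact match and log it; returns the event id.
function logExactMatch(
  exp: LoggableExperiment,
  input: string,
  output: string,
  expected: string,
): string {
  const exactMatch = output === expected ? 1 : 0;
  return exp.log({ input, output, expected, scores: { exact_match: exactMatch } });
}
```

The returned id is worth keeping if the event will later be referenced by logFeedback or updateSpan.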


logFeedback

logFeedback(event): void

Log feedback to an event in the experiment. Feedback is used to save feedback scores, set an expected value, or add a comment.

Parameters

Name   Type
event  LogFeedbackFullArgs

Returns

void
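
One way feedback might be attached to a previously logged event is sketched below. `FeedbackExperiment` and `recordReview` are hypothetical names, and the sketch assumes LogFeedbackFullArgs accepts an `id` together with optional `scores`, `expected`, and `comment` fields.

```typescript
// Hypothetical stand-in for the logFeedback() signature documented above.
// Assumption: the feedback event takes `id` plus optional `scores`,
// `expected`, and `comment`.
interface FeedbackExperiment {
  logFeedback(event: {
    id: string;
    scores?: Record<string, number>;
    expected?: unknown;
    comment?: string;
  }): void;
}

// Record a reviewer's verdict on an event that was logged earlier.
// The comment is only included when a note was actually provided.
function recordReview(
  exp: FeedbackExperiment,
  eventId: string,
  approved: boolean,
  note?: string,
): void {
  exp.logFeedback({
    id: eventId,
    scores: { human_review: approved ? 1 : 0 },
    ...(note !== undefined ? { comment: note } : {}),
  });
}
```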


startSpan

startSpan(args?): Span

Lower-level alternative to traced. This allows you to start a span yourself, and can be useful in situations where you cannot use callbacks. However, spans started with startSpan will not be marked as the "current span", so currentSpan() and traced() will be no-ops. If you want to mark a span as current, use traced instead.

See traced for full details.

Parameters

Name   Type
args?  StartSpanArgs

Returns

Span
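
Because a span started this way is not managed by a callback, the caller is responsible for ending it. A minimal sketch of that pattern follows; `ManualSpan`, `SpanStarter`, and `withManualSpan` are hypothetical names, and the sketch assumes StartSpanArgs accepts a `name` and that the returned Span offers `log` and `end` methods.

```typescript
// Hypothetical stand-ins for the parts of Span and Experiment used here.
// Assumptions: StartSpanArgs accepts `name`; Span has log() and end().
interface ManualSpan {
  log(event: { input?: unknown; output?: unknown }): void;
  end(): unknown;
}

interface SpanStarter {
  startSpan(args?: { name?: string }): ManualSpan;
}

// Run `work` inside a manually managed span. The span is ended in `finally`,
// so it is closed even when `work` throws.
async function withManualSpan<T>(
  exp: SpanStarter,
  name: string,
  input: unknown,
  work: () => Promise<T>,
): Promise<T> {
  const span = exp.startSpan({ name });
  try {
    const output = await work();
    span.log({ input, output });
    return output;
  } finally {
    span.end();
  }
}
```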


summarize

summarize(options?): Promise<ExperimentSummary>

Summarize the experiment, including the scores (compared to the closest reference experiment) and metadata.

Parameters

Name                             Type     Description
options?                         Object   Options for summarizing the experiment.
options.comparisonExperimentId?  string   The experiment to compare against. If undefined, the most recent experiment on the origin's main branch will be used.
options.summarizeScores?         boolean  Whether to summarize the scores. If false, only the metadata will be returned.

Returns

Promise<ExperimentSummary>

A summary of the experiment, including the scores (compared to the closest reference experiment) and metadata.
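
A summary can be used, for example, to flag scores that dropped relative to the comparison experiment. The sketch below is illustrative only: `SummarizableExperiment` and `regressedScores` are hypothetical names, and the shape of the result is an assumption, namely that ExperimentSummary has a `scores` map whose entries carry a `score` and an optional `diff` against the comparison experiment.

```typescript
// Hypothetical stand-in for the summarize() signature documented above.
// Assumption: the summary has a `scores` map whose entries carry `score`
// and an optional `diff` relative to the comparison experiment.
interface SummarizableExperiment {
  summarize(options?: {
    comparisonExperimentId?: string;
    summarizeScores?: boolean;
  }): Promise<{
    scores?: Record<string, { score: number; diff?: number }>;
  }>;
}

// Return the sorted names of scores that dropped relative to the comparison.
async function regressedScores(
  exp: SummarizableExperiment,
  comparisonExperimentId?: string,
): Promise<string[]> {
  // Leave comparisonExperimentId out entirely when it was not provided,
  // so the default comparison experiment is used.
  const options =
    comparisonExperimentId === undefined
      ? { summarizeScores: true }
      : { comparisonExperimentId, summarizeScores: true };
  const summary = await exp.summarize(options);
  return Object.entries(summary.scores ?? {})
    .filter(([, s]) => s.diff !== undefined && s.diff < 0)
    .map(([name]) => name)
    .sort();
}
```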


traced

traced<R>(callback, args?): R

Create a new top-level span underneath the experiment. The name defaults to "root".

See Span.traced for full details.

Type parameters

Name
R

Parameters

Name      Type
callback  (span: Span) => R
args?     StartSpanArgs & SetCurrentArg

Returns

R
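
The signature above shows that traced returns whatever the callback returns, so a wrapped function can keep its original return type. A minimal sketch, using hypothetical names (`TracedSpan`, `TracingExperiment`, `tracedUppercase`) and assuming that the span passed to the callback has a `log` method and that the args accept a `name`:

```typescript
// Hypothetical stand-ins for the parts of Span and Experiment used here.
// Assumptions: the callback's span has log(); args accept `name`.
interface TracedSpan {
  log(event: { input?: unknown; output?: unknown }): void;
}

interface TracingExperiment {
  traced<R>(callback: (span: TracedSpan) => R, args?: { name?: string }): R;
}

// Wrap a synchronous task in a top-level span. The value returned by the
// callback is passed straight through as the result of traced().
function tracedUppercase(exp: TracingExperiment, text: string): string {
  return exp.traced(
    (span) => {
      const output = text.toUpperCase();
      span.log({ input: text, output });
      return output;
    },
    { name: "uppercase" },
  );
}
```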


updateSpan

updateSpan(event): void

Update a span in the experiment using its id. It is important that you only update a span once the original span has been fully written and flushed, since otherwise updates to the span may conflict with the original span.

Parameters

Name   Type                                                                          Description
event  Omit<Partial<ExperimentEvent>, "id"> & Required<Pick<ExperimentEvent, "id">>  The event data to update the span with. Must include id. See Experiment.log for a full list of valid fields.

Returns

void
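
The ordering requirement described above (write, flush, then update) can be sketched as follows. This is one possible reading, not the SDK's prescribed recipe: `UpdatableExperiment` and `logThenRescore` are hypothetical names, the event fields are assumptions, and the sketch treats awaiting flush() as sufficient for the original span to be fully written.

```typescript
// Hypothetical stand-in for the three methods used in this sketch.
// Assumption: events accept `input`, `output`, and `scores`.
interface UpdatableExperiment {
  log(event: {
    input: unknown;
    output: unknown;
    scores: Record<string, number>;
  }): string;
  flush(): Promise<void>;
  updateSpan(event: { id: string; scores?: Record<string, number> }): void;
}

// Log an event, wait for pending rows to be flushed, then attach a
// late-arriving score to the same span by id.
async function logThenRescore(
  exp: UpdatableExperiment,
  input: unknown,
  output: unknown,
  lateScore: number,
): Promise<string> {
  const id = exp.log({ input, output, scores: {} });
  // Assumption: once flush() resolves, the original span has been written.
  await exp.flush();
  exp.updateSpan({ id, scores: { late_score: lateScore } });
  return id;
}
```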


version

version(): Promise<undefined | string>

Returns

Promise<undefined | string>

Inherited from

ObjectFetcher.version

Properties

dataset

Optional Readonly dataset: AnyDataset


kind

kind: "experiment"