Interface: Evaluator<Input, Output, Expected, Metadata>
Type parameters
Name | Type |
---|---|
Input | Input |
Output | Output |
Expected | Expected |
Metadata | extends BaseMetadata = DefaultMetadataType |
Properties
baseExperimentId
• Optional
baseExperimentId: string
An optional experiment ID to use as a base. If specified, the new experiment will be summarized and compared to this experiment. This takes precedence over baseExperimentName if specified.
baseExperimentName
• Optional
baseExperimentName: string
An optional experiment name to use as a base. If specified, the new experiment will be summarized and compared to this experiment.
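The precedence rule between the two base-experiment options can be sketched as follows. This is only an illustration of the documented behavior: `resolveBaseExperiment`, `BaseExperimentOptions`, and `BaseSelector` are hypothetical names that are not part of the SDK, and the example values are made up.

```typescript
// Hypothetical helper, not part of the SDK: shows which base-experiment
// option wins when both are set.
interface BaseExperimentOptions {
  baseExperimentId?: string;
  baseExperimentName?: string;
}

type BaseSelector =
  | { by: "id"; value: string }
  | { by: "name"; value: string };

function resolveBaseExperiment(
  opts: BaseExperimentOptions,
): BaseSelector | undefined {
  // The ID is checked first because it takes precedence over the name.
  if (opts.baseExperimentId !== undefined) {
    return { by: "id", value: opts.baseExperimentId };
  }
  if (opts.baseExperimentName !== undefined) {
    return { by: "name", value: opts.baseExperimentName };
  }
  // Neither option was given, so no base experiment is selected here.
  return undefined;
}
```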
data
• data: EvalData<Input, Expected, Metadata>
A function that returns a list of inputs, expected outputs, and metadata.
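A minimal sketch of such a data function, assuming each case is an object with `input`, `expected`, and `metadata` fields. `EvalCaseSketch` and `greetingData` are hypothetical stand-ins defined here so the snippet is self-contained; they are not the SDK's own `EvalData` type.

```typescript
// Simplified stand-in for one evaluation case; not the SDK's own type.
interface EvalCaseSketch<Input, Expected, Metadata> {
  input: Input;
  expected?: Expected;
  metadata?: Metadata;
}

// Hypothetical data function: returns the list of cases to evaluate.
function greetingData(): EvalCaseSketch<string, string, { source: string }>[] {
  return [
    { input: "Foo", expected: "Hi Foo", metadata: { source: "handwritten" } },
    { input: "Bar", expected: "Hi Bar", metadata: { source: "handwritten" } },
  ];
}
```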
experimentName
• Optional
experimentName: string
An optional name for the experiment.
gitMetadataSettings
• Optional
gitMetadataSettings: Object
Optional settings for collecting git metadata. By default, all git metadata fields allowed by the org-level settings are collected.
Type declaration
Name | Type |
---|---|
collect | "some" \| "none" \| "all" |
fields? | ("dirty" \| "tag" \| "commit" \| "branch" \| "author_name" \| "author_email" \| "commit_message" \| "commit_time" \| "git_diff")[] |
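
A sketch of a settings object matching the type declaration above. `GitMetadataSettingsSketch` and `GitField` are local mirrors written for this example, and pairing `collect: "some"` with an explicit `fields` list is an assumption about the intended usage rather than something the reference states.

```typescript
// Local mirror of the type declaration above, so the snippet stands alone.
type GitField =
  | "dirty"
  | "tag"
  | "commit"
  | "branch"
  | "author_name"
  | "author_email"
  | "commit_message"
  | "commit_time"
  | "git_diff";

interface GitMetadataSettingsSketch {
  collect: "some" | "none" | "all";
  fields?: GitField[];
}

// Assumption: "some" is combined with `fields` to pick a subset.
const gitMetadataSettings: GitMetadataSettingsSketch = {
  collect: "some",
  fields: ["commit", "branch", "dirty"],
};
```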
isPublic
• Optional
isPublic: boolean
Whether the experiment should be public. Defaults to false.
maxConcurrency
• Optional
maxConcurrency: number
The maximum number of tasks/scorers that will be run concurrently. Defaults to undefined, in which case there is no max concurrency.
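To illustrate what a concurrency cap means in practice, here is a generic worker-pool sketch. It is not the SDK's implementation; `runWithLimit` is a hypothetical helper, and it assumes `maxConcurrency`, when given, is a positive integer.

```typescript
// Hypothetical helper: runs async jobs with at most `maxConcurrency`
// in flight. With `maxConcurrency` undefined, all jobs start at once.
async function runWithLimit<T>(
  jobs: (() => Promise<T>)[],
  maxConcurrency?: number,
): Promise<T[]> {
  const results: T[] = new Array(jobs.length);
  let next = 0; // index of the next job that has not been claimed yet

  // Never start more workers than there are jobs.
  const workerCount = Math.min(maxConcurrency ?? jobs.length, jobs.length);

  // Each worker claims the next unclaimed job until none remain.
  async function worker(): Promise<void> {
    while (next < jobs.length) {
      const index = next++;
      results[index] = await jobs[index]();
    }
  }

  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results; // results keep the order of `jobs`
}
```

Calling `runWithLimit(jobs, 2)` keeps at most two jobs running at any moment, while `runWithLimit(jobs)` starts them all together.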
metadata
• Optional
metadata: Record<string, unknown>
Optional additional metadata for the experiment.
projectId
• Optional
projectId: string
If specified, uses the given project ID instead of the evaluator's name to identify the project.
repoInfo
• Optional
repoInfo: Object
Explicitly specifies the git metadata for this experiment. If specified, this takes precedence over gitMetadataSettings.
Type declaration
Name | Type |
---|---|
author_email? | null \| string |
author_name? | null \| string |
branch? | null \| string |
commit? | null \| string |
commit_message? | null \| string |
commit_time? | null \| string |
dirty? | null \| boolean |
git_diff? | null \| string |
tag? | null \| string |
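
A sketch of an explicit `repoInfo` value that follows the type declaration above. `RepoInfoSketch` is a local mirror written for this example, and the commit hash, branch name, and other values are made-up placeholders.

```typescript
// Local mirror of the type declaration above; every field is optional
// and may also be null.
interface RepoInfoSketch {
  author_email?: string | null;
  author_name?: string | null;
  branch?: string | null;
  commit?: string | null;
  commit_message?: string | null;
  commit_time?: string | null;
  dirty?: boolean | null;
  git_diff?: string | null;
  tag?: string | null;
}

// Placeholder values for illustration only.
const repoInfo: RepoInfoSketch = {
  commit: "0123abc",
  branch: "main",
  dirty: false,
  tag: null, // explicitly recorded as "no tag"
};
```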
scores
• scores: EvalScorer<Input, Output, Expected, Metadata>[]
A set of functions that take an input, output, and expected value and return a score.
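A minimal sketch of one scorer, assuming it receives a single object holding `input`, `output`, and `expected` and returns a named score between 0 and 1. `ScorerArgsSketch`, `ScoreSketch`, and `exactMatch` are hypothetical; the SDK's actual `EvalScorer` signature and return type may differ.

```typescript
// Simplified stand-ins for the scorer arguments and result; these are
// assumptions for illustration, not the SDK's own types.
interface ScorerArgsSketch<Input, Output, Expected> {
  input: Input;
  output: Output;
  expected?: Expected;
}

interface ScoreSketch {
  name: string;
  score: number;
}

// Hypothetical scorer: 1 when the output equals the expected value, else 0.
function exactMatch(
  args: ScorerArgsSketch<string, string, string>,
): ScoreSketch {
  return {
    name: "exact_match",
    score: args.output === args.expected ? 1 : 0,
  };
}
```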
state
• Optional
state: BraintrustState
If specified, uses the logger state to initialize Braintrust objects. If unspecified, falls back to the global state (initialized using your API key).
task
• task: EvalTask<Input, Output>
A function that takes an input and returns an output.
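As a small illustration, a task can be as simple as the function below. `greetTask` is a hypothetical example; a real task would usually call the application or model being evaluated.

```typescript
// Hypothetical task: maps an input name to a greeting string.
function greetTask(input: string): string {
  return `Hi ${input}`;
}
```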
timeout
• Optional
timeout: number
The duration, in milliseconds, after which to time out the evaluation. Defaults to undefined, in which case there is no timeout.
trialCount
• Optional
trialCount: number
The number of times to run the evaluator per input. This is useful for evaluating applications that have non-deterministic behavior and gives you both a stronger aggregate measure and a sense of the variance in the results.
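The aggregate measure and variance mentioned above can be sketched as follows for the scores one input receives across its trials. `summarizeTrials` is a hypothetical helper that uses the population variance; the statistics the SDK actually reports are not specified here.

```typescript
// Hypothetical helper: mean and population variance of the scores that
// one input received across its trials. Assumes a non-empty array.
function summarizeTrials(scores: number[]): { mean: number; variance: number } {
  const mean = scores.reduce((sum, x) => sum + x, 0) / scores.length;
  const variance =
    scores.reduce((sum, x) => sum + (x - mean) ** 2, 0) / scores.length;
  return { mean, variance };
}

// Four trials of the same input, e.g. with trialCount set to 4.
const summary = summarizeTrials([1, 0, 1, 1]);
// summary.mean is 0.75 and summary.variance is 0.1875
```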
update
• Optional
update: boolean
Whether to update an existing experiment with the name given in experimentName, if one exists. Defaults to false.