Each offline evaluation creates an experiment, a permanent record of how the evaluated task performed on a dataset.

View results

To view the results of an experiment, go to Experiments in your project and select the experiment from the list.
  • Traces vs. spans - By default, experiments display as a table of traces where each row represents a complete trace with its root span. To view the individual spans in traces instead, select Display > Row type > Spans. View individual spans when you want to:
    • Analyze specific operations within traces
    • Find particular function calls or API requests
    • Examine timing and token usage for individual operations
    Spans view is optimized for analyzing individual operations. Experiment comparisons and diff mode are only available when viewing traces.
  • Metrics - Along with the scores you track, Braintrust tracks a number of metrics about your LLM calls that help you assess and understand performance. For example, if you’re trying to figure out why the average duration increased substantially after you changed a model, look at both duration and token metrics to diagnose the underlying issue. To compute LLM metrics like token counts, make sure you wrap your LLM calls.
  • Experiment summary - Select Details to view:
    • Comparisons to other experiments
    • Scorers used in the evaluation
    • Datasets tested
    • Metadata like model and parameters
    Copy the experiment ID from the bottom of the summary pane for referencing in code or sharing with teammates.
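
To make the relationship between spans, trace rows, and metrics concrete, here is a minimal sketch that collapses a flat list of spans into one summary row per trace. The SpanRecord shape and its field names are assumptions for illustration only; they are not the schema Braintrust uses.

```typescript
// Hypothetical span shape for illustration; real span fields may differ.
interface SpanRecord {
  spanId: string;
  rootSpanId: string; // id of the root span of the trace this span belongs to
  startMs: number;
  endMs: number;
  totalTokens?: number; // assumed to be present only on wrapped LLM calls
}

interface TraceSummary {
  rootSpanId: string;
  durationMs: number;
  totalTokens: number;
  spanCount: number;
}

// Collapse spans into one row per trace, similar in spirit to the traces row type.
function summarizeTraces(spans: SpanRecord[]): TraceSummary[] {
  const byTrace = new Map<string, TraceSummary>();
  for (const s of spans) {
    let row = byTrace.get(s.rootSpanId);
    if (!row) {
      row = { rootSpanId: s.rootSpanId, durationMs: 0, totalTokens: 0, spanCount: 0 };
      byTrace.set(s.rootSpanId, row);
    }
    row.spanCount += 1;
    row.totalTokens += s.totalTokens ?? 0;
    // The root span covers the whole trace, so its duration is the trace duration.
    if (s.spanId === s.rootSpanId) {
      row.durationMs = s.endMs - s.startMs;
    }
  }
  return Array.from(byTrace.values());
}

const exampleSummaries = summarizeTraces([
  { spanId: "t1", rootSpanId: "t1", startMs: 0, endMs: 1200 },
  { spanId: "t1-llm", rootSpanId: "t1", startMs: 100, endMs: 900, totalTokens: 350 },
]);
console.log(exampleSummaries);
```

Comparing durationMs and totalTokens side by side for the same rows in two experiments is one way to check whether a slowdown tracks a change in token usage.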

Filter results

Each project provides default table views with common filters for experiments, including:
  • Default view: Shows all traces in the experiment
  • Non-errors: Shows only traces without errors
  • Errors: Shows only traces with errors
  • Scorer errors: Shows only traces with scorer errors
  • Unreviewed: Hides traces that have been human-reviewed
  • Assigned to me: Shows only traces assigned to the current user for human review
Use the view menu to switch the table view. You can also use the Filter menu to add custom filtering. Use the Basic tab for point-and-click filtering, or switch to SQL to write precise SQL queries.
Default table views cannot be modified, but you can create custom table views based on custom filters and display settings.
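
One way to reason about the default views is as predicates over trace rows. The sketch below models each view listed above as a filter function. The TraceRow fields (error, scorerError, reviewed, assignee) are hypothetical names chosen for illustration and do not reflect Braintrust's internal data model.

```typescript
// Hypothetical row shape; field names are assumptions for illustration.
interface TraceRow {
  id: string;
  error?: string; // set when the task failed
  scorerError?: string; // set when a scorer failed
  reviewed: boolean; // true once a human has reviewed the trace
  assignee?: string; // user assigned for human review
}

type ViewFilter = (row: TraceRow, currentUser: string) => boolean;

// Each default view expressed as a predicate.
const defaultViews: Record<string, ViewFilter> = {
  "Default view": () => true,
  "Non-errors": (row) => row.error === undefined,
  "Errors": (row) => row.error !== undefined,
  "Scorer errors": (row) => row.scorerError !== undefined,
  "Unreviewed": (row) => !row.reviewed,
  "Assigned to me": (row, user) => row.assignee === user,
};

function applyView(rows: TraceRow[], view: string, currentUser: string): TraceRow[] {
  const filter = defaultViews[view];
  if (!filter) {
    throw new Error(`Unknown view: ${view}`);
  }
  return rows.filter((row) => filter(row, currentUser));
}
```

A custom filter is then simply another predicate, which is why a custom table view can be saved alongside the built-in ones without changing them.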

Group results

Select Display > Group by to group the table by metadata fields to see patterns. By default, group rows show one experiment’s summary data. To view summary data for all experiments, select Include comparisons in group.
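
As a rough illustration of what grouping computes, the sketch below buckets rows by a metadata field and averages one score per bucket. The ScoredRow shape and the "(none)" bucket label are assumptions, and the summary statistic Braintrust shows per group may differ.

```typescript
// Hypothetical row shape for illustration.
interface ScoredRow {
  metadata: Record<string, string>;
  scores: Record<string, number>;
}

// Group rows by a metadata field and compute the mean of one score per group.
function groupMeanScore(rows: ScoredRow[], field: string, score: string): Map<string, number> {
  const sums = new Map<string, { total: number; count: number }>();
  for (const row of rows) {
    const value = row.scores[score];
    if (value === undefined) continue; // row was not scored by this scorer
    const key = row.metadata[field] ?? "(none)";
    const acc = sums.get(key) ?? { total: 0, count: 0 };
    acc.total += value;
    acc.count += 1;
    sums.set(key, acc);
  }
  const means = new Map<string, number>();
  sums.forEach((acc, key) => means.set(key, acc.total / acc.count));
  return means;
}
```

For example, grouping by a metadata field such as a question category can show that one category scores consistently lower than the rest.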

Order by regressions

Score and metric columns show summary statistics in their headers. To order columns by regressions, select Display > Columns > Order by regressions. Within grouped tables, this sorts rows by regressions of a specific score relative to a comparison experiment.
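
The idea behind ordering grouped rows by regressions can be sketched as follows. This assumes a regression is a drop in a group's mean score relative to the comparison experiment; how Braintrust computes regressions is not specified here, and the GroupScore type is hypothetical.

```typescript
// Hypothetical per-group summary for one score.
interface GroupScore {
  group: string;
  current: number; // mean score in the experiment being viewed
  baseline: number; // mean score in the comparison experiment
}

// Sort groups so the largest drop relative to the baseline comes first.
function orderByRegression(groups: GroupScore[]): GroupScore[] {
  return groups
    .slice() // copy so the caller's array is not reordered
    .sort((a, b) => (a.current - a.baseline) - (b.current - b.baseline));
}
```

Sorting this way puts the groups that got worse at the top of the table, where they are easiest to inspect.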

Examine individual traces

Select any row to open the trace view and see complete details:
  • Input, output, and expected values
  • Metadata and parameters
  • All spans in the trace hierarchy
  • Scores and their explanations
  • Timing and token usage
Ask yourself: Do good scores correspond to good outputs? If not, update your scorers or test cases. Use the buttons in the trace view to expand the trace to fullscreen or to open it in a separate page.

View as a timeline

While viewing a trace, select Timeline to visualize the trace as a Gantt chart. This view shows spans as horizontal bars where the width represents duration. Bars are color-coded by span type, making it easy to identify performance bottlenecks and understand the execution flow.
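
To illustrate how a timeline maps spans to bars, the sketch below converts span start and end times into a horizontal offset and width, both as percentages of the total trace duration. The TimedSpan and TimelineBar types are hypothetical, and this is not how Braintrust renders the view internally.

```typescript
// Hypothetical span timing and bar layout types.
interface TimedSpan {
  name: string;
  startMs: number;
  endMs: number;
}

interface TimelineBar {
  name: string;
  offsetPct: number; // distance from the left edge of the chart
  widthPct: number; // bar width, proportional to span duration
}

function layoutTimeline(spans: TimedSpan[]): TimelineBar[] {
  if (spans.length === 0) return [];
  const traceStart = Math.min(...spans.map((s) => s.startMs));
  const traceEnd = Math.max(...spans.map((s) => s.endMs));
  const total = Math.max(traceEnd - traceStart, 1); // guard against zero-length traces
  return spans.map((s) => ({
    name: s.name,
    offsetPct: ((s.startMs - traceStart) * 100) / total,
    widthPct: ((s.endMs - s.startMs) * 100) / total,
  }));
}
```

The widest bars are the slowest spans, which is why bottlenecks stand out visually in this layout.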

View as a conversation

While viewing a trace, select Thread to view the trace as a conversation thread. This view displays messages, tool calls, and scores in chronological order, ideal for debugging LLM conversations and multi-turn interactions. Use Find or press Cmd/Ctrl+F to search within the thread view and quickly locate specific content such as message text and score rationale. Matches are highlighted in-place using your browser’s native highlighting.
Thread view searches only within the currently open trace, not across all traces in your project.

Create custom trace views

While viewing a trace, select Views to create custom visualizations using natural language. Describe how you want to view your trace data and Loop will generate the code. For example:
  • “Create a view that renders a list of all tools available in this trace and their outputs”
  • “Render the video url from the trace’s metadata field and show simple thumbs up/down buttons”
By default, a custom trace view is only visible and editable by the user who created it. To share your view with all users in the project, select Save > Save as new view version > Update. See Create custom trace views for detailed examples, API reference, and how to embed views in your own applications.
Self-hosted deployments: If you restrict outbound access, allowlist https://www.braintrustsandbox.dev to enable custom views. This domain hosts the sandboxed iframe that securely renders custom view code.

Quick API reference

Your React component receives these props: trace (object), span (object), update (function), selectSpan (function).
Key trace properties: rootSpanId, selectedSpanId, spanOrder, spans, fetchSpanFields.
Accessing span data: By default, only the selected span has full data. To access data from other spans, use fetchSpanFields:
// Fetch specific fields for multiple spans
const data = await trace.fetchSpanFields(trace.spanOrder, ['input', 'output']);
See Edit trace view React code for complete documentation and examples.
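
If you only need data for some spans, you can narrow the list of span IDs before calling fetchSpanFields. The sketch below fetches input and output for every span except the root. It assumes spanOrder is an array of span ID strings, leaves the resolved value generic because its shape is not described here, and uses a hypothetical helper name (fetchChildIO) and structural type (TraceLike).

```typescript
// Minimal structural type covering only the trace properties used below.
// Assumption: spanOrder holds span ID strings. The resolved type T is left
// generic because the return shape of fetchSpanFields is not documented here.
interface TraceLike<T> {
  rootSpanId: string;
  spanOrder: string[];
  fetchSpanFields(spanIds: string[], fields: string[]): Promise<T>;
}

// Hypothetical helper: fetch input and output for all non-root spans.
async function fetchChildIO<T>(trace: TraceLike<T>): Promise<T> {
  const childIds = trace.spanOrder.filter((id) => id !== trace.rootSpanId);
  return trace.fetchSpanFields(childIds, ["input", "output"]);
}
```

Requesting only the fields and spans a view needs keeps the amount of data fetched small for large traces.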

Change span data format

When viewing a trace, each span field (input, output, metadata, etc.) displays data in a specific format. Change how a field displays by selecting the view mode dropdown in the field’s header. Available views:
  • Pretty - Parses objects deeply and renders values as Markdown (optimized for readability)
  • JSON - JSON highlighting and folding
  • YAML - YAML highlighting and folding
  • Tree - Hierarchical tree view for nested data structures
Additional format-specific views appear automatically for certain data types:
  • LLM - Formatted AI messages and tool calls with Markdown
  • LLM Raw - Unformatted AI messages and tool calls
  • HTML - Rendered HTML content
Your view mode selection is remembered per field type. To set a default view mode for all fields, go to Settings > Personal > Profile and select your preferred data view. See Personal settings for more details.

View raw trace data

When viewing a trace, select a span and then select the raw data button in the span’s header to view the complete JSON representation. The raw data view shows all fields including metadata, inputs, outputs, and internal properties that may not be visible in other views. The raw data view has two tabs:
  • This span - Shows the complete JSON for the selected span only
  • Full trace - Shows the complete JSON for the entire trace
Use the search bar at the top of the dialog to find specific content within the data. Raw span data is useful when you need to:
  • Inspect the complete span structure for debugging
  • Find specific fields in large or deeply nested spans
  • Verify exact values and data types
  • Export or copy the full span for reproduction

Score retrospectively

Apply scorers to existing experiments:
  • Multiple cases: Select rows and use Score to apply chosen scorers
  • Single case: Open a trace and use Score in the trace view
Scores appear as additional spans within the trace.

Use aggregate scores

Aggregate scores are formulas that combine multiple scores into a single metric. They are useful when you track many scores but need a single metric to represent overall experiment quality. See Create aggregate scores for more details.
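
As an illustration of combining several scores into one metric, the sketch below computes a weighted average. This is one possible formula, not necessarily one of the aggregate types Braintrust offers, and the score names and weights in the usage are made up.

```typescript
// Combine several named scores into one number using per-score weights.
// Scores missing from a row are skipped rather than counted as zero.
function weightedAverage(
  scores: Record<string, number>,
  weights: Record<string, number>,
): number | null {
  let total = 0;
  let weightSum = 0;
  for (const name of Object.keys(weights)) {
    const score = scores[name];
    const weight = weights[name];
    if (score === undefined || weight === undefined) continue;
    total += score * weight;
    weightSum += weight;
  }
  // Return null when no weighted score is available for this row.
  return weightSum === 0 ? null : total / weightSum;
}

// Hypothetical score names and weights.
const overall = weightedAverage(
  { Factuality: 1, Relevance: 0.5 },
  { Factuality: 3, Relevance: 1 },
);
console.log(overall);
```

Weighting lets a score you care about most dominate the aggregate while still accounting for the others.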

Download results

To download an experiment’s results, open the experiment menu and then select Download as CSV or Download as JSON.

Change the display

Show and hide columns

Select Display > Columns and then:
  • Show or hide columns to focus on relevant data
  • Reorder columns by dragging them
  • Pin important columns to the left
All column settings are automatically saved when you save a view.

Create custom columns

Extract specific values from traces using custom columns:
  1. Select Display > Columns > + Add custom column.
  2. Name your column.
  3. Choose from inferred fields or write a SQL expression.
Once created, filter and sort using your custom columns.

Create custom table views

Custom table views save your table configurations including filters, column order, column visibility, and display settings. This lets you quickly switch between different ways of viewing your experiment results. To create a custom table view:
  1. Apply the filters and display settings you want.
  2. Select Save as in the toolbar.
  3. Enter a view name.
Custom table views are accessible and configurable by any member of the organization. Table views update dynamically with new rows matching saved criteria.

Adjust table layout

To change the table density to see more or less detail per row, select Display > Row height > Compact or Tall. To switch between different layouts, select Display > Layout and one of the following:
  • List: Default table view.
  • Grid: Compare outputs side-by-side.
  • Summary: Large-type summary of scores and metrics across all experiments.
Layouts respect view filters and are automatically saved when you save a view.

Next steps