Launch an eval
Launch an evaluation. This is the API equivalent of the Eval function built into the Braintrust SDK. In the Eval API, you provide pointers to a dataset, a task function, and scoring functions. The API then runs the evaluation, creates an experiment, and returns the results along with a link to the experiment. To learn more about evals, see the Evals guide.
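As a quick orientation, here is a minimal sketch of launching an eval over HTTP with only the required fields. It assumes the endpoint is POST https://api.braintrust.dev/v1/eval and that a BRAINTRUST_API_KEY environment variable holds your API key; all IDs are hypothetical placeholders, and the field names come from the parameters documented below.

```ts
// Minimal sketch: launch an eval with required fields only.
// Assumptions: the endpoint is POST https://api.braintrust.dev/v1/eval,
// and BRAINTRUST_API_KEY holds your API key. All IDs are placeholders.
const response = await fetch("https://api.braintrust.dev/v1/eval", {
  method: "POST",
  headers: {
    Authorization: `Bearer ${process.env.BRAINTRUST_API_KEY}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    project_id: "<project id>",                        // project to run the eval in
    data: { dataset_id: "<dataset id>" },              // the dataset to use
    task: { function_id: "<task function id>" },       // the function to evaluate
    scores: [{ function_id: "<scorer function id>" }], // scoring functions
  }),
});

// With stream unset or false, the body is the evaluation's summary.
console.log(await response.json());
```

Per the description above, the returned results include a link to the created experiment.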
Authorization
Required Bearer <token>
Most Braintrust endpoints are authenticated by providing your API key in an Authorization: Bearer [api_key] header
on your HTTP request. You can create an API key in the Braintrust organization settings page.
In: header
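As a small sketch, attaching that header in TypeScript might look like the following; the BRAINTRUST_API_KEY environment variable name is an assumption, not part of the API.

```ts
// Sketch: build the Authorization header from an (assumed) environment variable.
const headers: Record<string, string> = {
  Authorization: `Bearer ${process.env.BRAINTRUST_API_KEY ?? ""}`,
};
```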
Request Body
Eval launch parameters
project_id
Required string
Unique identifier for the project to run the eval in
data
Required Any properties in dataset_id, project_dataset_name
The dataset to use
task
Required Any properties in function_id, project_slug, global_function, prompt_session_id, inline_code, inline_prompt
The function to evaluate
scores
Required array<Any properties in function_id, project_slug, global_function, prompt_session_id, inline_code, inline_prompt & unknown>
The functions to score the eval on
experiment_name
string
An optional name for the experiment created by this eval. If it conflicts with an existing experiment, it will be suffixed with a unique identifier.
metadata
object
Optional experiment-level metadata to store about the evaluation. You can later use this to slice & dice across experiments.
stream
boolean
Whether to stream the results of the eval. If true, the request will return two events: one to indicate the experiment has started, and another upon completion. If false, the request will return the evaluation's summary upon completion.
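Putting the optional parameters together, here is a hedged sketch of a streaming launch. The endpoint URL and environment variable are the same assumptions as in the sketch above; the experiment_name and metadata values are illustrative. The exact wire encoding of the two streamed events is not specified here, so the sketch simply prints the raw chunks.

```ts
// Hedged sketch: launch an eval with stream: true and read the raw response
// stream. The docs promise two events (experiment started, experiment
// completed); their wire encoding is not specified here, so this just
// dumps each decoded chunk. URL and env var are assumptions.
const res = await fetch("https://api.braintrust.dev/v1/eval", {
  method: "POST",
  headers: {
    Authorization: `Bearer ${process.env.BRAINTRUST_API_KEY}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    project_id: "<project id>",
    data: { dataset_id: "<dataset id>" },
    task: { function_id: "<task function id>" },
    scores: [{ function_id: "<scorer function id>" }],
    experiment_name: "nightly-regression", // illustrative; suffixed if it conflicts
    metadata: { model: "gpt-4o" },         // illustrative experiment-level metadata
    stream: true,                          // two events instead of a final summary
  }),
});

// Read the streamed body chunk by chunk.
const reader = res.body!.getReader();
const decoder = new TextDecoder();
while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  console.log(decoder.decode(value, { stream: true }));
}
```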
Response

| Status code | Description |
| --- | --- |
| 200 | Eval launch response (a summary of the experiment) |