Type Alias RunEvalConfig<T, U>

RunEvalConfig<T, U>: {
    customEvaluators?: U[];
    evaluators?: RunEvalType<T, U>[];
    formatEvaluatorInputs?: EvaluatorInputFormatter;
}

Configuration for running evaluations on datasets.

Type Parameters

  • T - The type of evaluators.

  • U - The type of custom evaluators.

Type declaration

  • Optional customEvaluators?: U[]

    Custom evaluators to apply to a dataset run. Each evaluator is provided with a run trace containing the model outputs, as well as an "example" object representing a record in the dataset.

    Deprecated: use evaluators instead.

  • Optional evaluators?: RunEvalType<T, U>[]

    Evaluators to apply to a dataset run. You can optionally specify these by name, or by configuring them with an EvalConfig object.

  • Optional formatEvaluatorInputs?: EvaluatorInputFormatter

    Converts the evaluation data into formats the evaluator can consume, most commonly strings. The parameters are the raw input to the run, the raw output, the raw reference output, and the raw run itself.

    // Chain input: { input: "some string" }
    // Chain output: { output: "some output" }
    // Reference example output format: { output: "some reference output" }
    const formatEvaluatorInputs = ({
      rawInput,
      rawPrediction,
      rawReferenceOutput,
    }) => {
      return {
        input: rawInput.input,
        prediction: rawPrediction.output,
        reference: rawReferenceOutput.output,
      };
    };

    Returns the prepared data.
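To make the formatter's behavior concrete, here is a self-contained sketch of calling such a formatter on hypothetical raw run data. The sample values and the explicit parameter types are illustrative assumptions, not taken from LangSmith itself:

```typescript
// Hypothetical raw run data, matching the shapes in the comments above.
const rawRun = {
  rawInput: { input: "What is 2 + 2?" }, // chain input
  rawPrediction: { output: "4" }, // chain output
  rawReferenceOutput: { output: "4" }, // reference example output
};

// The same formatter shape as above, with explicit parameter types added
// for clarity; the real EvaluatorInputFormatter signature may differ.
const formatEvaluatorInputs = ({
  rawInput,
  rawPrediction,
  rawReferenceOutput,
}: {
  rawInput: { input: string };
  rawPrediction: { output: string };
  rawReferenceOutput: { output: string };
}) => ({
  input: rawInput.input,
  prediction: rawPrediction.output,
  reference: rawReferenceOutput.output,
});

// Produce the prepared data the evaluator would receive.
const prepared = formatEvaluatorInputs(rawRun);
console.log(prepared);
```

The formatter's only job is to pick the relevant string out of each raw object, so the evaluator sees a flat `{ input, prediction, reference }` record regardless of how the chain's inputs and outputs are keyed.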

RunEvalConfig in LangSmith configures how evaluations run over a dataset: it defines the evaluators (built-in or custom) to apply and how run data is formatted for them. The type parameter T is the type of the evaluators, and U is the type of the custom evaluators.
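Putting the pieces together, the sketch below assembles such a configuration. It uses minimal local stand-in types rather than the real LangSmith imports, and the evaluator name "qa" and the chosen type arguments are illustrative assumptions only:

```typescript
// Local stand-ins for the library types, for illustration only
// (the real RunEvalType and EvaluatorInputFormatter come from LangSmith).
type EvaluatorInputFormatter = (args: {
  rawInput: Record<string, any>;
  rawPrediction: Record<string, any>;
  rawReferenceOutput?: Record<string, any>;
}) => { input: string; prediction: string; reference?: string };

type RunEvalConfig<T, U> = {
  customEvaluators?: U[];
  evaluators?: T[];
  formatEvaluatorInputs?: EvaluatorInputFormatter;
};

// Evaluators specified by name, per the description above; "qa" is a
// hypothetical evaluator name used here purely as an example.
const config: RunEvalConfig<string, never> = {
  evaluators: ["qa"],
  formatEvaluatorInputs: ({ rawInput, rawPrediction, rawReferenceOutput }) => ({
    input: rawInput.input,
    prediction: rawPrediction.output,
    reference: rawReferenceOutput?.output,
  }),
};

console.log(config.evaluators);
```

Keeping the formatter on the config (rather than inside each evaluator) means every evaluator in the run sees the same normalized `{ input, prediction, reference }` shape.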