Conversation

MatsErdkamp

@MatsErdkamp MatsErdkamp commented Oct 2, 2025

This draft PR aims to solve this issue. It allows subscores to be defined in metrics, and declared subscores can be seen by optimizers. An upstream issue that requires this change can be found in the GEPA repo.

Example code

    def compute_overall_score(gold, pred, trace, pred_name=None, pred_trace=None):
        # compute_metrics is a user-defined helper returning per-aspect values
        metrics = compute_metrics(gold, pred, trace, pred_name, pred_trace)

        # subscore() (added in this PR) declares a named score component
        # that optimizers can see, and returns the value unchanged
        quality = subscore("quality", metrics.quality)
        leakage = subscore("leakage", 1.0 - metrics.leakage)

        # the overall score is the mean of the two subscores
        return (quality + leakage) / 2.0
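To make the example above concrete, here is a minimal sketch of how a `subscore` helper could work. The names `subscore` and `collect_subscores`, and the thread-local registry, are illustrative assumptions, not the PR's actual implementation:

```python
import threading

# Hypothetical sketch: record named score components so an optimizer
# can inspect them alongside the overall scalar score.
_registry = threading.local()


def subscore(name: str, value: float) -> float:
    """Register `value` under `name` on this thread and return it unchanged."""
    store = getattr(_registry, "subscores", None)
    if store is None:
        store = _registry.subscores = {}
    store[name] = float(value)
    return float(value)


def collect_subscores() -> dict:
    """Return and clear the subscores recorded on this thread."""
    store = getattr(_registry, "subscores", {})
    _registry.subscores = {}
    return dict(store)
```

With this sketch, a metric like `compute_overall_score` would still return a plain float, while an evaluator or optimizer could call `collect_subscores()` after each metric invocation to retrieve the declared components.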

In the future it would be nice to also adapt MLflow.dspy to autolog the subscores as evaluations.

I would love to have discussions on this draft PR about the implementation. There are some fairly big changes that warrant discussion over syntax, naming, and implementation.

…e-metrics-support

Add multi-objective metric support with subscores

Refactor metric handling: Replace Scores with Score class for improved clarity and functionality

This update modifies the metric handling throughout the codebase, transitioning from the Scores class to the new Score class. The Score class encapsulates scalar values and subscores, enhancing the metric evaluation process. Adjustments were made in various modules, including evaluation, metrics, and teleprompt utilities, to ensure compatibility with the new structure. Documentation and tests were updated to reflect these changes.
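The commit above describes a `Score` class that encapsulates a scalar value plus subscores. A rough sketch of such a type (the field names and float interop are assumptions, not the PR's exact definition):

```python
from dataclasses import dataclass, field


@dataclass
class Score:
    """A scalar metric value plus optional named subscores (sketch)."""
    value: float
    subscores: dict = field(default_factory=dict)

    def __float__(self):
        # Allow the object to be used wherever a plain score float is expected
        return float(self.value)

    def __add__(self, other):
        return float(self) + float(other)

    def __radd__(self, other):
        return float(other) + float(self)
```

Because `Score` coerces to `float`, existing code that sums or averages metric results keeps working, while optimizers that know about the new class can read `.subscores` directly.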