judgment-cookbook

This repo contains cookbooks that demonstrate how to evaluate AI agents with the judgeval package from Judgment Labs.

Prerequisites

Before running these examples, make sure you have:

Installed the latest version of the judgeval package:
```
pip install judgeval
```

Set up your Judgeval API key and organization ID as environment variables:

```
export JUDGMENT_API_KEY="your_api_key"
export JUDGMENT_ORG_ID="your_org_id"
```

To get your API key and organization ID, create an account on the Judgment Labs platform.
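
Before opening a notebook, it can help to confirm that both variables are visible to Python. The following is a minimal sketch using only the standard library; the helper name `missing_judgment_env` is hypothetical and is not part of judgeval.

```python
import os

# Variable names taken from the export commands above.
REQUIRED_VARS = ("JUDGMENT_API_KEY", "JUDGMENT_ORG_ID")


def missing_judgment_env(env=None):
    """Return the required variable names that are unset or empty.

    Hypothetical helper for illustration; `env` defaults to os.environ.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_judgment_env()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("Judgment credentials found.")
```

Passing a plain dictionary instead of relying on `os.environ` keeps the check easy to test without touching the real environment.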

| Try Out | Notebook | Description |
| --- | --- | --- |
| RL | Wikipedia Racer | Train agents with reinforcement learning |
| Online Monitoring | Research Agent | Monitor agent behavior in production |
| Custom Scorers | HumanEval | Build custom evaluators for your agents |
| Offline Testing | [Get Started For Free] | Compare how different prompts, models, or agent configs affect performance across ANY metric |
