gpt-4-32k, is there a dashboard showing what evals are currently passing? #1138

unicomp21 · 2023-06-11T00:10:00Z

And which evals are currently failing? Maybe we could get a dashboard or something which gets updated once a week?

EMPERO-CYBER-HUB · 2023-06-22T20:53:45Z

Yes, I agree.

unicomp21 (Author) · 2023-06-26T12:39:04Z

@usama-openai could we get an eval run, along with a log containing the accuracy for each eval, at each model release? And could the log files be put in a repo? I would prefer not to burn my own cash to see where eval performance currently stands for each of the models/versions.

unicomp21 (Author) · 2023-06-26T12:39:19Z

@andrew-openai ^
