openai/evals

Evals is a framework for evaluating LLMs and LLM systems, and an open-source registry of benchmarks.

last commit: Apr 14, 2026
stars: 18,208 (7d: +51, 30d: -, 90d: -)