openai/evals

Evals is a framework for evaluating LLMs and LLM systems, and an open-source registry of benchmarks.

last commit: Apr 14, 2026
stars: 18,524 (7d: +33, 30d: +249, 90d: +658)