openai/evals

Evals is a framework for evaluating LLMs and LLM systems, and an open-source registry of benchmarks.

last commit: Apr 14, 2026

stars: 18,725 (7d: +48, 30d: +230, 90d: +714)

## found in

Awesome Open Source AI/Evaluation Frameworks