THUDM/AgentBench

A Comprehensive Benchmark to Evaluate LLMs as Agents (ICLR'24)

last commit: Feb 8, 2026

stars: 3,505 (7d: +14, 30d: +68, 90d: +254)

## found in

Awesome Open Source AI/Benchmark Suites