UK AI Safety Institute
Visit site ↗You’re all caught up.
.png)
AISI's research highlights the importance of accounting for test-time compute in AI agent evaluations, as fixed budgets can underestimate capabilities, especially for newer models. Increasing compute can improve performance, and the benefits are more significant for more advanced models.