⚖️ Safety & Policy
AISI's research highlights the importance of accounting for test-time compute in AI agent evaluations: a fixed compute budget can underestimate an agent's capabilities, especially for newer models. Increasing test-time compute can improve performance, and the gains are larger for more advanced models.