Tag

#api-optimization

Every item tagged api-optimization, newest first.

2 items

Speeding up agentic workflows with WebSockets in the Responses API

OpenAI detailed how WebSockets and caching in their Responses API reduced latency and overhead in agentic workflows. The Codex agent loop was used as a testbed for these optimizations. Builders can apply similar techniques to improve performance in their own applications. This approach may help reduce costs and improve responsiveness.

Key takeaways

WebSockets reduced API overhead in agentic workflows.
Connection-scoped caching improved model latency.
The techniques tested in the Codex agent loop are applicable to other apps.
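To make "connection-scoped caching" concrete, here is a minimal toy sketch in Python. It does not use or describe the Responses API. The `Connection` class, the `prepare` function, and the `context_key` parameter are all hypothetical names invented for illustration. The only idea it shows is that state prepared once can be reused by later messages on the same long-lived connection and is dropped when the connection closes.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class Connection:
    """Hypothetical long-lived connection that owns its own cache."""

    expensive_setup: Callable[[str], str]
    _cache: Dict[str, str] = field(default_factory=dict)
    setup_calls: int = 0

    def send(self, context_key: str, message: str) -> str:
        # Prepare the context only the first time it is seen on this connection.
        if context_key not in self._cache:
            self._cache[context_key] = self.expensive_setup(context_key)
            self.setup_calls += 1
        return f"{self._cache[context_key]}:{message}"

    def close(self) -> None:
        # The cache is scoped to the connection, so closing discards it.
        self._cache.clear()


def prepare(context_key: str) -> str:
    # Stand-in (an assumption) for costly per-request work.
    return context_key.upper()


conn = Connection(expensive_setup=prepare)
first = conn.send("session-a", "step 1")   # runs prepare()
second = conn.send("session-a", "step 2")  # reuses the cached result
```

In this sketch the second call skips the setup step entirely, which is the kind of saving a multi-step agent loop could see when it keeps one connection open instead of starting fresh on every request.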

OpenAI · #api-optimization #agentic-workflows #websocket

models · Mar 17

Introducing GPT-5.4 mini and nano

OpenAI introduced GPT-5.4 mini and nano, smaller and faster versions of GPT-5.4 optimized for coding, tool use, and multimodal reasoning. These models target high-volume API and sub-agent workloads. You can use them for applications requiring low latency and high throughput. The new models are part of OpenAI's strategy to offer a range of models for different use cases.

Key takeaways

GPT-5.4 mini and nano are optimized for coding and tool use.
They target high-volume API and sub-agent workloads.
They suit applications that need low latency and high throughput.

OpenAI · #multimodal-models #api-optimization #model-variants