★ 8/10 · AI · 2026-04-23

A pelican for GPT-5.5 via the semi-official Codex backdoor API

Summary

GPT-5.5 has been released for ChatGPT subscribers and OpenAI Codex, though the official OpenAI API deployment is currently pending due to ongoing safety and security scaling requirements. Developers can access the model via the /backend-api/codex/responses endpoint using the llm-openai-via-codex plugin, which leverages existing Codex subscriptions.

Key Points

  • GPT-5.5 pricing is set at $5 per 1M input tokens and $30 per 1M output tokens, a 2x increase over GPT-5.4 (see the cost sketch after this list).
  • GPT-5.5 Pro is priced at $30 per 1M input tokens and $180 per 1M output tokens.
  • The llm-openai-via-codex plugin enables the llm library to interface with the Codex-specific API using existing authentication tokens.
  • The model supports a reasoning_effort parameter; setting it to xhigh substantially increases output complexity and token usage, with one SVG generation consuming 9,322 reasoning tokens at xhigh versus 39 in the default mode.
  • The official OpenAI API for GPT-5.5 and GPT-5.5 Pro is not yet available for deployment.
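
As a quick sanity check on those rates, here is a back-of-the-envelope cost sketch. The per-call token counts are illustrative placeholders (only the 9,322 reasoning-token figure comes from the article), and it assumes reasoning tokens are billed as output tokens, per OpenAI's usual accounting:

```python
# GPT-5.5 rates quoted above, in dollars per token.
INPUT_RATE = 5.00 / 1_000_000    # $5 per 1M input tokens
OUTPUT_RATE = 30.00 / 1_000_000  # $30 per 1M output tokens

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one GPT-5.5 call at the quoted rates."""
    return input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE

# Illustrative call: 500 input tokens, the 9,322 reasoning tokens the
# article cites for an xhigh SVG run (assumed billed as output), plus
# roughly 1,200 tokens of visible SVG output.
print(f"${estimate_cost(500, 9_322 + 1_200):.4f}")  # → $0.3182
```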

Technical Details

The llm-openai-via-codex plugin was built by reverse-engineering the openai/codex repository to work out how authentication tokens are stored and used. By targeting the /backend-api/codex/responses endpoint, the plugin lets the llm CLI route prompts through the Codex infrastructure, effectively turning a Codex subscription into an API. The implementation supports standard llm features, including image attachments via the -a flag, ongoing chat sessions via llm chat, and tool support.
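
For programmatic use, the same plumbing should be reachable through llm's Python API. A minimal sketch, assuming the plugin is installed (llm install llm-openai-via-codex) and registers the model under the id gpt-5.5 (both the install step and that model id are assumptions; get_model, Attachment, and conversation are standard llm APIs):

```python
import llm

# Assumes a logged-in Codex session whose stored auth tokens the
# plugin can reuse when calling /backend-api/codex/responses.
model = llm.get_model("gpt-5.5")  # model id is an assumption

# Single prompt, mirroring `llm -m gpt-5.5 "..."` on the CLI.
response = model.prompt("Generate an SVG of a pelican riding a bicycle")
print(response.text())

# Image attachment, the Python equivalent of the CLI's -a flag.
described = model.prompt(
    "Describe this image",
    attachments=[llm.Attachment(path="pelican.jpg")],
)
print(described.text())

# Multi-turn conversation, the equivalent of `llm chat`.
conversation = model.conversation()
first = conversation.prompt("Draw a pelican as SVG")
print(first.text())
followup = conversation.prompt("Now make it ride a bicycle")
print(followup.text())
```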

Performance varies significantly with the reasoning_effort setting. At xhigh the model spends a far larger volume of reasoning tokens and produces more complex, CSS-heavy output, at the cost of significantly higher latency, sometimes exceeding four minutes for a single generation.
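
A sketch of toggling that option, assuming the plugin exposes reasoning_effort as a regular llm model option passed as a keyword argument (the option name comes from the article; the keyword plumbing and the gpt-5.5 model id are assumptions):

```python
import llm

model = llm.get_model("gpt-5.5")  # model id is an assumption

# Default effort: returns quickly with few reasoning tokens
# (39 in the article's SVG test).
quick = model.prompt("Generate an SVG of a pelican riding a bicycle")
print(quick.text())

# xhigh effort: the article reports 9,322 reasoning tokens and
# latency sometimes exceeding four minutes for the same task.
slow = model.prompt(
    "Generate an SVG of a pelican riding a bicycle",
    reasoning_effort="xhigh",
)
print(slow.text())
```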

Impact / Why It Matters

This workaround gives developers programmatic access to GPT-5.5's capabilities for benchmarking and agentic workflows ahead of the official OpenAI API release, and lets the latest model be integrated into existing CLI-based pipelines and agent harnesses such as OpenClaw and Pi.

ai openai llm