Cost Optimization Guide

How to combine tools to maximize quality and minimize cost — savings of 60-70% on average.

Cost Architecture

Orchestrator (expensive)

Claude Code

Claude Code — Sonnet/Opus via subscription. Front-end, UI, creativity, architectural decisions.

Review (fixed cost)

Codex

Codex — GPT-5-Codex via subscription. Code review, adversarial review, background rescue.

Execution (cheap/free)

OpenRouter

OpenRouter (Qwen/GLM) or Ollama (local). Deterministic tasks, boilerplate, refactoring.

Principles

  • 1

    Claude Code is the orchestrator — use where it is irreplaceable: UI, UX, creativity, complex decisions

  • 2

    Codex is the reviewer — use for code review, adversarial review, and task delegation

  • 3

    OpenRouter is the cheap executor — use Qwen 3.6 or GLM 5.1 for deterministic tasks

  • 4

    OpenCode + Ollama is the free executor — use when you want absolute zero cost

Step-by-Step Setup

1. OpenRouter as cheap provider

Get API key at openrouter.ai and configure in .codex/config.toml with model_providers.openrouter.

[model_providers.openrouter]
name = "OpenRouter"
base_url = "https://openrouter.ai/api/v1"
env_key = "OPENROUTER_API_KEY"
wire_api = "responses"

2. Custom agent with cheap model

Create .codex/agents/qwen-worker.toml with model = "qwen/qwen-3.6-plus" and model_provider = "openrouter".

name = "qwen-worker"
description = "Executor de baixo custo para tarefas determinísticas."
model = "qwen/qwen-3.6-plus"
model_provider = "openrouter"
sandbox_mode = "workspace-write"