
Speed up responses with fast mode

Fast mode delivers faster Opus 4.6 responses at a higher cost per token. Toggle it on with /fast when you need speed for interactive work like rapid iteration or live debugging, and toggle it off when cost matters more than latency.

Fast mode is not a different model. It uses the same Opus 4.6 with a different API configuration that prioritizes speed over cost efficiency. You get identical quality and capabilities, just faster responses.

What to know:

This page covers how to toggle fast mode, its cost tradeoff, when to use it, requirements, and rate limit behavior.

Toggle fast mode

Toggle fast mode in either of these ways:

Fast mode persists across sessions. For the best cost efficiency, enable fast mode at the start of a session rather than switching mid-conversation. See "Understand the cost tradeoff" for details.

When you enable fast mode:

When you disable fast mode with /fast again, you remain on Opus 4.6. The model does not revert to your previous model. To switch to a different model, use /model.

Understand the cost tradeoff

Fast mode has higher per-token pricing than standard Opus 4.6:

Mode                             Input (per MTok)   Output (per MTok)
Fast mode on Opus 4.6 (<200K)    $30                $150
Fast mode on Opus 4.6 (>200K)    $60                $225

Fast mode is compatible with the 1M token extended context window.

When you switch into fast mode mid-conversation, you pay the full fast mode uncached input token price for the entire conversation context. This costs more than if you had enabled fast mode from the start.
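To make the pricing concrete, here is a minimal Python sketch that estimates the uncached cost of a single fast mode request from the prices in the table above. The function name `fast_mode_cost` and the constants are hypothetical, introduced only for illustration. The sketch also assumes that the pricing tier is selected by the request's input token count and that a request of exactly 200K tokens falls in the lower tier; this page does not state either detail.

```python
# Fast mode prices in USD per million tokens (MTok), copied from the table above.
FAST_MODE_PRICES = {
    "under_200k": {"input": 30.0, "output": 150.0},
    "over_200k": {"input": 60.0, "output": 225.0},
}
CONTEXT_TIER_BOUNDARY = 200_000  # tokens


def fast_mode_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one fast mode request at uncached prices."""
    # Assumption: the tier depends on the input token count, and exactly
    # 200K tokens is billed at the lower tier. The page does not specify this.
    if input_tokens <= CONTEXT_TIER_BOUNDARY:
        prices = FAST_MODE_PRICES["under_200k"]
    else:
        prices = FAST_MODE_PRICES["over_200k"]
    total = input_tokens * prices["input"] + output_tokens * prices["output"]
    return total / 1_000_000


if __name__ == "__main__":
    # Switching to fast mode with 100K tokens of conversation context already
    # accumulated: the whole context is billed as uncached fast mode input.
    print(f"${fast_mode_cost(100_000, 2_000):.2f}")
```

Under these assumptions, a request with 100,000 input tokens and 2,000 output tokens comes to 100,000 × $30 / 1M + 2,000 × $150 / 1M = $3.00 + $0.30 = $3.30. The $3.00 input portion is what a mid-conversation switch would charge for the existing context alone.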

Decide when to use fast mode

Fast mode is best for interactive work where response latency matters more than cost:

Standard mode is better for:

Fast mode vs effort level

Fast mode and effort level both affect response speed, but differently:

Setting              Effect
Fast mode            Same model quality, lower latency, higher cost
Lower effort level   Less thinking time, faster responses, potentially lower quality on complex tasks

You can combine both: use fast mode with a lower effort level for maximum speed on straightforward tasks.

Requirements

Fast mode requires all of the following:

Enable fast mode for your organization

Admins can enable fast mode in:

Handle rate limits

Fast mode has separate rate limits from standard Opus 4.6. When you hit the fast mode rate limit or run out of extra usage credits:

  1. Fast mode automatically falls back to standard Opus 4.6
  2. The icon turns gray to indicate cooldown
  3. You continue working at standard speed and pricing
  4. When the cooldown expires, fast mode automatically re-enables

To disable fast mode manually instead of waiting for cooldown, run /fast again.
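The fallback behavior described above can be pictured as a small state machine. The following Python sketch is only a conceptual model, not how the product is implemented: the class `FastModeState`, its methods, and the `cooldown_seconds` parameter are all hypothetical, and this page does not state how long the cooldown lasts.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FastModeState:
    """Hypothetical model of the fast mode fallback and cooldown cycle."""

    enabled: bool = True                    # user has fast mode toggled on
    cooldown_until: Optional[float] = None  # timestamp when cooldown ends

    def on_rate_limit(self, now: float, cooldown_seconds: float) -> None:
        # Steps 1-2: fall back to standard mode and start the cooldown.
        # The cooldown length is an assumption supplied by the caller.
        if self.enabled:
            self.cooldown_until = now + cooldown_seconds

    def toggle(self) -> None:
        # Models running /fast: flip the setting and clear any cooldown.
        self.enabled = not self.enabled
        self.cooldown_until = None

    def active_mode(self, now: float) -> str:
        if not self.enabled:
            return "standard"
        if self.cooldown_until is not None and now < self.cooldown_until:
            # Step 3: keep working at standard speed and pricing.
            return "standard"
        # Step 4: cooldown has expired, so fast mode re-enables itself.
        self.cooldown_until = None
        return "fast"


if __name__ == "__main__":
    state = FastModeState()
    state.on_rate_limit(now=10.0, cooldown_seconds=60.0)
    print(state.active_mode(now=30.0))  # during cooldown
    print(state.active_mode(now=70.0))  # after cooldown
```

Passing the current time in as an argument keeps the sketch deterministic, so each transition can be checked without waiting on a real clock.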

Research preview

Fast mode is a research preview feature. This means:

Report issues or feedback through your usual Anthropic support channels.
