
Qwen: Qwen3 235B A22B Instruct 2507

Qwen3-235B-A22B-Instruct-2507 is a multilingual, instruction-tuned mixture-of-experts language model based on the Qwen3-235B architecture, with 22B parameters active per forward pass. It is optimized for general-purpose text generation, including instruction following, logical reasoning, math, code, and tool usage. The model supports a native 262,144-token context length and does not implement a "thinking mode" (`<think>` blocks). Compared with its base variant, this release delivers significant gains in knowledge coverage, long-context reasoning, coding benchmarks, and alignment on open-ended tasks. It is particularly strong in multilingual understanding, math reasoning (e.g., AIME, HMMT), and alignment evaluations such as Arena-Hard and WritingBench.

Input Cost: $0.07 per 1M tokens
Output Cost: $0.10 per 1M tokens
Context Window: 262,144 tokens
Developer ID: qwen/qwen3-235b-a22b-2507
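
The listed per-token prices and context window can be turned into a per-request cost estimate. A minimal sketch, assuming the prices above ($0.07/1M input, $0.10/1M output) and a 262,144-token window; the function name and structure are illustrative, not part of any provider SDK:

```python
# Hypothetical per-request cost estimate using this model's listed pricing.
INPUT_PRICE_PER_M = 0.07   # USD per 1M input tokens
OUTPUT_PRICE_PER_M = 0.10  # USD per 1M output tokens
CONTEXT_WINDOW = 262_144   # maximum total tokens per request

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost for one request to this model."""
    if input_tokens + output_tokens > CONTEXT_WINDOW:
        raise ValueError("request exceeds the 262,144-token context window")
    return (input_tokens * INPUT_PRICE_PER_M
            + output_tokens * OUTPUT_PRICE_PER_M) / 1_000_000

# e.g. a 100K-token prompt with a 2K-token completion costs about $0.0072
```

Because input and output are billed at different rates, long prompts with short completions are dominated by the input price, which is where this model's low $0.07/1M input rate matters most.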

Related Models

qwen · $0.03/1M · 131,072 ctx

Qwen: Qwen-Turbo

Qwen-Turbo, based on Qwen2.5, is a 1M context model that provides fast speed and low cost,...
qwen · $0.09/1M · 262,144 ctx

Qwen: Qwen3 30B A3B Instruct 2507

Qwen3-30B-A3B-Instruct-2507 is a 30.5B-parameter mixture-of-experts language model from Qw...
qwen · $0.15/1M · 32,768 ctx

Qwen: QwQ 32B

QwQ is the reasoning model of the Qwen series. Compared with conventional instruction-tune...