
Qwen 3 Coder 30B A3B Instruct

alibaba/qwen3-coder-30b-a3b

Qwen 3 Coder 30B A3B Instruct is a compact mixture-of-experts coding model from Alibaba, activating only 3 billion parameters per inference while delivering strong agentic coding performance for cost-sensitive deployments.

Capabilities: Reasoning · Tool Use
index.ts

import { streamText } from 'ai'

const result = streamText({
  model: 'alibaba/qwen3-coder-30b-a3b',
  prompt: 'Why is the sky blue?',
})

// Print tokens as they stream back through the gateway.
for await (const chunk of result.textStream) {
  process.stdout.write(chunk)
}

What To Consider When Choosing a Provider

  • Zero Data Retention

    AI Gateway supports Zero Data Retention for this model via direct gateway requests (BYOK is not included). To configure this, check the documentation.

  • Authentication

    AI Gateway authenticates requests using an API key or OIDC token. You do not need to manage provider credentials directly.

The 30B-total / 3B-active parameter structure keeps serving costs tractable, which is worth factoring in when comparing tiers within the Qwen3-Coder family.

When to Use Qwen 3 Coder 30B A3B Instruct

Best For

  • Cost-sensitive agentic coding deployments:

    When you need a model that understands code at a meaningful depth and can handle multi-step workflows, but the per-token cost of the 480B-A35B variant isn't justified by your use case or volume, the 30B-A3B offers a practical alternative

  • Interactive coding tools with latency requirements:

    The 3B active parameter count yields faster token generation than larger dense or MoE models. For coding assistants embedded in editors or IDEs where response time affects user experience, this matters

  • High-frequency automated code tasks:

    CI/CD pipelines, automated PR description generation, code review summarization, and similar high-volume tasks are served well by a capable but economical model

Consider Alternatives When

  • The task requires the highest coding capability:

    For the most complex repository-level engineering problems, multi-file refactors with subtle dependencies, or tasks where getting it right the first time is critical, the larger Qwen3-Coder variant offers a higher performance ceiling

  • General knowledge and reasoning matter as much as code:

    This model is optimized for coding scenarios. Tasks that blend heavy general-domain reasoning with code may perform better on a general-purpose Qwen3 model of equivalent or larger size

  • Extremely long context is required:

    Verify the context window (262.1K tokens) against your specific use case, particularly for agentic tasks that accumulate long tool-call histories

Conclusion

Qwen 3 Coder 30B A3B Instruct carves out the practical middle ground in agentic coding: enough code intelligence and multi-step reasoning for real software engineering tasks, at inference costs that make high-volume deployment financially viable. Through AI Gateway, the operational complexity of managing multiple provider relationships collapses into a single endpoint with built-in reliability.

FAQ

How does Qwen 3 Coder 30B A3B Instruct differ from Qwen3-Coder 480B A35B?

Both belong to the Qwen3-Coder family and share the same coding-first orientation. The 30B-A3B activates 3B parameters per inference versus 35B for the 480B-A35B model. The tradeoff is lower peak capability in exchange for lower serving cost and latency.

"A3B" stands for 3 billion activated parameters. In the mixture-of-experts architecture, each inference step routes through a subset of the total parameter space. The model stores 30 billion parameters but computes with only 3 billion per forward pass.

How does this model differ from the general-purpose Qwen3-30B-A3B?

Qwen 3 Coder 30B A3B Instruct belongs to the coding-specialized line of the Qwen3-Coder family, while the general Qwen3-30B-A3B targets broader task coverage. The coder variant will generally outperform the general variant on coding-specific evaluations.

Which programming languages does the model support?

The model covers common programming languages and developer tooling. Specific language coverage details are in the Qwen3-Coder technical documentation at https://docs.aws.amazon.com/bedrock/latest/userguide/models-supported.html.

Does the model support agentic coding and tool calling?

Yes. Qwen 3 Coder 30B A3B Instruct inherits the agentic coding orientation of the Qwen3-Coder family, including tool-calling support and the ability to operate in plan-execute-debug loops. The context window (262.1K tokens) determines how much code and conversation history fits in a single session.

Why is a 30B MoE model with 3B active parameters faster to serve?

With 3B active parameters, the per-token compute cost is roughly equivalent to a 3B dense model, which is substantially faster than a dense 30B model serving the same traffic. For throughput-sensitive applications, this translates to more requests served per unit of compute.
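As a sanity check on that claim, the common rough estimate of ~2 × active parameters FLOPs per generated token puts the gap at about 10× (a back-of-envelope approximation that ignores attention and memory-bandwidth effects, not a benchmark):

```typescript
// Rough decode cost: FLOPs per generated token ≈ 2 × active parameters.
// An estimate for comparing serving tiers, not a measured result.
const flopsPerToken = (activeParams: number): number => 2 * activeParams

const moe30bA3b = flopsPerToken(3e9) // 3B active (Qwen3-Coder-30B-A3B)
const dense30b = flopsPerToken(30e9) // hypothetical 30B dense model

console.log(dense30b / moe30bA3b) // → 10: ~10× cheaper per token
```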

Is Qwen 3 Coder 30B A3B Instruct open source?

The Qwen3-Coder family is released as open models. Check https://docs.aws.amazon.com/bedrock/latest/userguide/models-supported.html for licensing terms and model cards.