Qwen3.6 Flash

qwen3.6-flash

Qwen3.6 Flash is the speed-first member of the Qwen3.6 family. Sources describe text, image, and video input support, prompt caching, and a 1M-token context window, which makes it unusual for a fast model. It should be positioned for high-volume multimodal and long-context workloads where response speed and cost efficiency are important.

Total Context

1Mtokens

Max Output

65.5Ktokens

Released

Apr 27, 2026

Modalities

Qwen3.6 Flash Price

Token TierInput PriceOutput PriceCache Create 5mCache Read 5m
<=256K$0.1714/M$1.0286/M$0.2143/M$0.0171/M
>256K$0.6857/M$4.1143/M$0.8571/M$0.0686/M

Qwen3.6 Flash API

POST /v1/chat/completions

Media and Discussions

Selected public videos and posts related to this model.

X (Twitter)

View post on X
View post on X
View post on X

Reddit

Qwen 3.6 Flash FAQ

Qwen 3.6 Flash: capabilities, use cases, limits, and TokenHub guidance.

How is Qwen 3.6 Flash positioned?+

Qwen 3.6 Flash is a Alibaba Qwen model for fast multimodal understanding and high-volume use.

Where does Qwen 3.6 Flash add value?+

Best for high-volume requests, image and video understanding and latency-sensitive applications, especially when throughput is the priority.

What is Qwen 3.6 Flash's practical edge?+

Key strength: fast multimodal responses with a broad feature set and hybrid thinking that can switch between deliberate and direct responses.

Which constraint matters most?+

It trades some peak quality for better speed or cost. For maximum answer quality, consider Qwen 3.7 Plus.

How do I integrate Qwen 3.6 Flash safely?+

Use the exact ID shown by TokenHub; follow your account docs and verify current features.