DeepSeek V4 Flash

deepseek-v4-flash

DeepSeek V4 Flash keeps the V4 family’s 1M-token context window but uses a lighter MoE configuration, commonly described as 284B total parameters with 13B activated parameters. The emphasis is throughput: fast inference, lower cost per call, and production workloads that still need long-context handling. It is the better fit when the task volume is high and the workload benefits from V4-style long-context architecture without always requiring the deepest reasoning tier.

Total Context

1Mtokens

Max Output

384Ktokens

Released

Apr 24, 2026

Modalities

DeepSeek V4 Flash Price

Input PriceOutput PriceCache Read
$0.15/M$0.3/M$0.003/M

DeepSeek V4 Flash API

openaiPOST /v1/chat/completions

DeepSeek V4 Flash Benchmark

40.3

/100

Artificial Analysis Intelligence Index

Artificial Analysis broad capability aggregate

Index score

38.7

/100

Artificial Analysis Coding Index

Artificial Analysis software task aggregate

Index score

Knowledge & Reasoning

GPQA

Advanced science problem solving

89.4%

HLE

Broad expert-level exam set

32.1%

Coding & Engineering

SciCode

Scientific coding challenges

44.9%

Terminal-Bench Hard

Hard terminal task execution

35.6%

Instruction Following & Agent Tasks

IFBench

Prompt constraint adherence

79.2%

AA-LCR

Long-context reasoning

63%

τ²-Bench

Agent workflow tasks

95.0%

Metrics sourced from Artificial Analysis

Model Comparison

DeepSeek V4 Flash Article

Media and Discussions

Selected public videos and posts related to this model.

X (Twitter)

View post on X
View post on X
View post on X

Reddit

YouTube

Watch on YouTube
Watch on YouTube
Watch on YouTube

DeepSeek V4 Flash FAQ

DeepSeek V4 Flash: capabilities, use cases, limits, and TokenHub guidance.

How is DeepSeek V4 Flash positioned?+

DeepSeek V4 Flash is a DeepSeek model for fast, efficient reasoning and agent work.

Where does DeepSeek V4 Flash add value?+

Best for latency-sensitive applications, agent workflows and high-volume requests, especially when speed and cost efficiency is the priority.

What is DeepSeek V4 Flash's practical edge?+

Key strength: a smaller design with faster, more economical inference and switchable thinking and non-thinking modes.

Which constraint matters most?+

It has less headroom on the hardest reasoning and engineering tasks. For maximum answer quality, consider DeepSeek V4 Pro.

How do I integrate DeepSeek V4 Flash safely?+

Use the exact ID shown by TokenHub; follow your account docs and verify current features.