POST /v1/chat/completionsGemini 3.5 Flash
gemini-3.5-flashGemini 3.5 Flash is described by Google as a fast, cost-efficient frontier model for real-world agentic tasks. Official materials emphasize stronger coding, multi-step execution, multimodal reasoning, and long-context ability, while keeping latency and price lower than larger flagship models. It should be positioned as a high-speed agent model, not just a cheap chat model.
Total Context
1Mtokens
Max Output
65.5Ktokens
Released
May 19, 2026
Modalities
Gemini 3.5 Flash Price
| Input Price | Output Price | Cache Read | Cache Create 5m |
|---|---|---|---|
| $1.5/M | $9/M | $0.15/M | $0.0833/M |
Gemini 3.5 Flash API
Gemini 3.5 Flash Benchmark
34.9
/100
Artificial Analysis Intelligence Index
Artificial Analysis broad capability aggregate
Index score
47.1
/100
Artificial Analysis Coding Index
Artificial Analysis software task aggregate
Index score
Knowledge & Reasoning
GPQA
Advanced science problem solving
82.8%
HLE
Broad expert-level exam set
23.1%
Coding & Engineering
SciCode
Scientific coding challenges
48.8%
Terminal-Bench Hard
Hard terminal task execution
46.2%
Instruction Following & Agent Tasks
IFBench
Prompt constraint adherence
47.3%
AA-LCR
Long-context reasoning
53.3%
τ²-Bench
Agent workflow tasks
58.8%
Metrics sourced from Artificial Analysis
Frequently asked questions about Gemini 3.5 Flash
Understand what Gemini 3.5 Flash is, its best uses, distinguishing strengths, practical tradeoffs, and safe TokenHub integration guidance.
How should developers understand the role of Gemini 3.5 Flash?+
Gemini 3.5 Flash is Google’s current Flash model for fast, scalable agentic and multimodal workloads. It is a current public model in its provider’s documentation, though availability can vary by platform.
When does Gemini 3.5 Flash deliver the most practical value?+
Best-fit scenarios include high-volume agent loops and sub-agent orchestration, difficult software-engineering tasks, and analysis of text and visual inputs. Test representative inputs and define measurable acceptance criteria before production.
What are the most useful characteristics of Gemini 3.5 Flash?+
Key strengths include a strong balance of quality, speed, and cost, fast response times, and reliable execution of multi-step agent workflows. This combination is especially useful for difficult software-engineering tasks.
What are the practical limits of Gemini 3.5 Flash?+
Consider another model when the task needs the strongest Pro-tier reasoning, the application needs this text model to return generated images directly, or the workflow cannot include human review for important decisions. Verify important factual, legal, financial, medical, or operational outputs with qualified human review.
How should developers call Gemini 3.5 Flash through TokenHub?+
In TokenHub, select the exact model identifier displayed for Gemini 3.5 Flash, use the endpoint documented for your account, and authenticate with your TokenHub credentials. Confirm the TokenHub-exposed input types, tools, grounding options, and model lifecycle rather than assuming full Gemini API parity.
Media and Discussions
Selected public videos and posts related to this model.
X (Twitter)
Reddit
YouTube