GPT-4.1

gpt-4.1

GPT-4.1 is an OpenAI model generation focused on improved coding, instruction following, and long-context performance. Official announcements present it as a stronger developer model than GPT-4o for many programming and instruction-heavy tasks. Its catalog description should highlight practical coding reliability and long-context understanding.

Total Context

1Mtokens

Max Output

32.8Ktokens

Released

Apr 14, 2025

Modalities

GPT-4.1 Price

Input PriceOutput PriceCache Read
$2/M$8/M$0.5/M

GPT-4.1 API

POST /v1/messages

GPT-4.1 Benchmark

GPT-4.1

19.4

/100

Artificial Analysis Intelligence Index

Artificial Analysis broad capability aggregate

Index score

21.8

/100

Artificial Analysis Coding Index

Artificial Analysis software task aggregate

Index score

34.7

/100

Artificial Analysis Math Index

Artificial Analysis math reasoning aggregate

Index score

Knowledge & Reasoning

MMLU-Pro

Advanced multi-task knowledge

80.6%

GPQA

Advanced science problem solving

66.6%

HLE

Broad expert-level exam set

4.6%

Coding & Engineering

LiveCodeBench

Live coding problems

45.7%

SciCode

Scientific coding challenges

38.1%

Terminal-Bench Hard

Hard terminal task execution

13.6%

Math

MATH-500

Advanced math problem solving

91.3%

AIME

Competition math problems

43.7%

AIME 2025

Competition math problems

34.7%

Instruction Following & Agent Tasks

IFBench

Prompt constraint adherence

43.0%

AA-LCR

Long-context reasoning

61%

τ²-Bench

Agent workflow tasks

47.1%

Metrics sourced from Artificial Analysis

Media and Discussions

Selected public videos and posts related to this model.

X (Twitter)

View post on X
View post on X
View post on X

Reddit

YouTube

Watch on YouTube
Watch on YouTube
Watch on YouTube

Frequently asked questions about GPT-4.1

Understand what GPT-4.1 is, its best uses, distinguishing strengths, practical tradeoffs, and safe TokenHub integration guidance.

What is GPT-4.1, and where does it fit in OpenAI’s model lineup?+

GPT-4.1 is a high-capability, non-reasoning GPT model focused on instruction following, tool use, and long-context work. It has been retired from ChatGPT, while API availability may remain; check TokenHub’s current listing.

Which workloads are the best fit for GPT-4.1?+

Best-fit scenarios include working across large codebases, strict instruction following, and tool-enabled application workflows. Test representative inputs and define measurable acceptance criteria before production.

Why might a team select GPT-4.1 over a smaller or older model?+

Key strengths include strong handling of long context, reliable adherence to detailed instructions, and effective use of tools and function calls. This combination is especially useful for strict instruction following.

What should be validated before relying on GPT-4.1?+

Consider another model when the task needs the deepest deliberate reasoning, very low latency is the main requirement, or the workflow cannot include human review for important decisions. Run generated code through tests, security checks, and human review before merging or deployment.

What is the practical TokenHub setup guidance for GPT-4.1?+

In TokenHub, select the exact model identifier displayed for GPT-4.1, use the endpoint documented for your account, and authenticate with your TokenHub credentials. Confirm whether the TokenHub entry exposes the input types, tool behavior, and output controls your application needs.