OpenAI has introduced a new generation of language models—GPT-4.1, GPT-4.1 mini, and GPT-4.1 nano—engineered for enhanced performance in software development and complex instruction tasks. These models are available through OpenAI’s API and support a 1-million-token context window, enabling them to process vast volumes of information—roughly equivalent to 750,000 words—in a single prompt.
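For developers, access is through the standard API rather than a new interface. The snippet below is a minimal sketch using OpenAI's Python SDK; the model identifier string and the example prompt are illustrative assumptions, so check the API documentation for the exact names.

```python
# Minimal sketch: calling GPT-4.1 through the OpenAI API (Python SDK).
# The model identifier "gpt-4.1" and the prompt below are assumptions
# for illustration; consult OpenAI's API reference for exact names.
from openai import OpenAI

client = OpenAI()  # reads the OPENAI_API_KEY environment variable

response = client.chat.completions.create(
    model="gpt-4.1",  # mini and nano variants are assumed to use similar identifiers
    messages=[
        {"role": "system", "content": "You are a careful software engineer."},
        {"role": "user", "content": "Review this function and suggest a fix: ..."},
    ],
)

print(response.choices[0].message.content)
```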
Aimed at Building AI Software Engineers
With GPT-4.1, OpenAI takes a decisive step toward its long-term vision of creating an “agentic software engineer”: a model capable of end-to-end software development, including frontend coding, QA, bug fixes, and documentation. According to the company, GPT-4.1 has been refined in response to developer feedback, with improvements in formatting, tool usage, and consistency.
OpenAI’s internal benchmarks show that GPT-4.1 outperforms previous models like GPT-4o in real-world software tasks. The mini and nano variants offer greater speed and lower cost, with trade-offs in output accuracy. Pricing starts at $0.10 per million input tokens for nano, making it OpenAI’s cheapest model to date.
Competitive Performance in Industry Benchmarks
On the SWE-bench Verified coding benchmark, GPT-4.1 scored between 52% and 54.6%, trailing Google’s Gemini 2.5 Pro (63.8%) and Anthropic’s Claude 3.7 Sonnet (62.3%). However, GPT-4.1 excelled in multimodal tasks, reaching 72% accuracy on the “long, no subtitles” video category of the Video-MME benchmark.
Despite these strengths, OpenAI notes limitations: model performance declines as input length increases. For example, accuracy dropped from 84% with an 8,000-token input to 50% at the 1-million-token limit, underscoring the challenge of maintaining consistency across very long contexts.
Lower Cost, More Power for Developers
GPT-4.1 is priced at $2 per million input tokens and $8 per million output tokens. The mini and nano versions scale down cost and size to support lightweight applications, while still offering solid performance for real-time use cases in development environments.
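At those rates, per-request costs are straightforward to estimate. The sketch below is a rough illustration only, assuming the published GPT-4.1 prices and hypothetical request sizes.

```python
# Rough per-request cost estimate at GPT-4.1's published rates:
# $2 per million input tokens, $8 per million output tokens.
# The request sizes in the example are hypothetical.
INPUT_RATE = 2.00 / 1_000_000    # USD per input token
OUTPUT_RATE = 8.00 / 1_000_000   # USD per output token

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD for a single request."""
    return input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE

# Example: a 100,000-token prompt (a sizeable codebase excerpt)
# with a 2,000-token reply costs roughly $0.22.
print(f"${estimate_cost(100_000, 2_000):.2f}")
```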
The new models are not yet available in ChatGPT but are expected to play a central role in OpenAI’s enterprise offerings, as companies continue integrating generative AI tools into software pipelines.