GPT Pricing
Language models
Multiple models, each with different capabilities and price points. Prices are per 1,000 tokens. You can think of tokens as pieces of words, where 1,000 tokens is about 750 words. This paragraph is 35 tokens.
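As a rough illustration of the rule of thumb above, the sketch below converts a word count into an approximate token count and prices it at a per-1K-token rate. The function names are hypothetical, and the 750-words-per-1,000-tokens ratio is only an approximation; an actual tokenizer will give different counts.

```python
from decimal import Decimal

# Rule of thumb from the text above: 1,000 tokens is about 750 words.
WORDS_PER_1K_TOKENS = 750


def estimate_tokens(word_count: int) -> int:
    """Approximate token count for a given word count (not an exact tokenizer)."""
    return round(word_count * 1000 / WORDS_PER_1K_TOKENS)


def token_cost(tokens: int, price_per_1k: Decimal) -> Decimal:
    """Cost in USD for `tokens` tokens at a per-1K-token price."""
    return Decimal(tokens) / 1000 * price_per_1k


# A 1,500-word document is roughly 2,000 tokens.
approx_tokens = estimate_tokens(1500)
approx_cost = token_cost(approx_tokens, Decimal("0.01"))
```

Decimal is used rather than float so that small per-token prices do not pick up binary rounding error.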
GPT-4 Turbo
With 128k context, fresher knowledge and the broadest set of capabilities, GPT-4 Turbo is more powerful than GPT-4 and offered at a lower price.
Model | Input | Output |
gpt-4-0125-preview | $0.01 / 1K tokens | $0.03 / 1K tokens |
gpt-4-1106-preview | $0.01 / 1K tokens | $0.03 / 1K tokens |
gpt-4-1106-vision-preview | $0.01 / 1K tokens | $0.03 / 1K tokens |
GPT-4
With broad general knowledge and domain expertise, GPT-4 can follow complex instructions in natural language and solve difficult problems with accuracy.
Model | Input | Output |
gpt-4 | $0.03 / 1K tokens | $0.06 / 1K tokens |
gpt-4-32k | $0.06 / 1K tokens | $0.12 / 1K tokens |
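Because input and output tokens are priced separately, the cost of a request depends on both counts. The sketch below, with a hypothetical `request_cost` helper, compares one request across the GPT-4 Turbo and GPT-4 rates listed in the tables above; the token counts are made-up example inputs.

```python
from decimal import Decimal

# (input, output) prices in USD per 1K tokens, copied from the tables above.
PRICES_PER_1K = {
    "gpt-4-0125-preview": (Decimal("0.01"), Decimal("0.03")),
    "gpt-4": (Decimal("0.03"), Decimal("0.06")),
    "gpt-4-32k": (Decimal("0.06"), Decimal("0.12")),
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Cost in USD of one request, given its input and output token counts."""
    input_price, output_price = PRICES_PER_1K[model]
    return (Decimal(input_tokens) * input_price
            + Decimal(output_tokens) * output_price) / 1000


# Example: 10,000 input tokens and 2,000 output tokens.
turbo = request_cost("gpt-4-0125-preview", 10_000, 2_000)
gpt4 = request_cost("gpt-4", 10_000, 2_000)
```

For this example the GPT-4 Turbo request comes to $0.16 and the GPT-4 request to $0.42.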
GPT-3.5 Turbo
GPT-3.5 Turbo models are capable and cost-effective.
gpt-3.5-turbo-1106 is the flagship model of this family, supports a 16K context window and is optimized for dialog.
gpt-3.5-turbo-instruct is an Instruct model and only supports a 4K context window.
Model | Input | Output |
gpt-3.5-turbo-1106 | $0.0010 / 1K tokens | $0.0020 / 1K tokens |
gpt-3.5-turbo-instruct | $0.0015 / 1K tokens | $0.0020 / 1K tokens |
Assistants API
The Assistants API and its tools (retrieval, code interpreter) make it easy for developers to build AI assistants within their own applications. Each assistant incurs its own retrieval file storage fee based on the files passed to that assistant. The retrieval tool chunks and indexes your files' content in our vector database.
The tokens used for the Assistants API are billed at the chosen language model's per-token input / output rates, and the assistant intelligently chooses which context from the thread to include when calling the model.
Tool | Price |
Code interpreter | $0.03 / session |
Retrieval | $0.20 / GB / assistant / day (free until 03/01/2024) |
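The tool fees above are charged on top of the model's token costs. As a minimal sketch, the hypothetical helper below adds up code interpreter sessions and retrieval storage. It assumes every assistant stores the same amount of data and ignores the free period noted in the table.

```python
from decimal import Decimal

# Rates copied from the table above.
CODE_INTERPRETER_PER_SESSION = Decimal("0.03")    # USD per session
RETRIEVAL_PER_GB_ASSISTANT_DAY = Decimal("0.20")  # USD per GB per assistant per day


def assistants_tool_fees(sessions: int, gb_per_assistant: float,
                         assistants: int, days: int) -> Decimal:
    """Tool fees in USD, excluding the language model's token charges.

    Assumption for this sketch: each assistant stores `gb_per_assistant` GB
    for the whole period.
    """
    code_fee = CODE_INTERPRETER_PER_SESSION * sessions
    retrieval_fee = (RETRIEVAL_PER_GB_ASSISTANT_DAY
                     * Decimal(str(gb_per_assistant)) * assistants * days)
    return code_fee + retrieval_fee


# Example: 100 sessions, 3 assistants each holding 2 GB for 30 days.
monthly_fees = assistants_tool_fees(100, 2, 3, 30)
```

In this example the sessions cost $3.00 and storage costs $36.00. Because retrieval is billed per assistant, attaching the same files to several assistants multiplies the storage fee.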
Fine-tuning models
Create your own custom models by fine-tuning our base models with your training data. Once you fine-tune a model, you’ll be billed only for the tokens you use in requests to that model.
Model | Training | Input usage | Output usage |
gpt-3.5-turbo | $0.0080 / 1K tokens | $0.0030 / 1K tokens | $0.0060 / 1K tokens |
davinci-002 | $0.0060 / 1K tokens | $0.0120 / 1K tokens | $0.0120 / 1K tokens |
babbage-002 | $0.0004 / 1K tokens | $0.0016 / 1K tokens | $0.0016 / 1K tokens |
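The table above has three separate rates per model. A sketch of how they might combine into one estimate follows; `fine_tune_cost` is a hypothetical helper, and `trained_tokens` is assumed to be the total number of tokens billed for training, however that total is counted on the actual invoice.

```python
from decimal import Decimal

# (training, input usage, output usage) in USD per 1K tokens, from the table above.
FINE_TUNE_RATES = {
    "gpt-3.5-turbo": (Decimal("0.0080"), Decimal("0.0030"), Decimal("0.0060")),
    "davinci-002": (Decimal("0.0060"), Decimal("0.0120"), Decimal("0.0120")),
    "babbage-002": (Decimal("0.0004"), Decimal("0.0016"), Decimal("0.0016")),
}


def fine_tune_cost(model: str, trained_tokens: int,
                   input_tokens: int = 0, output_tokens: int = 0) -> Decimal:
    """Training cost plus later usage cost for a fine-tuned model, in USD."""
    train_rate, input_rate, output_rate = FINE_TUNE_RATES[model]
    total = (Decimal(trained_tokens) * train_rate
             + Decimal(input_tokens) * input_rate
             + Decimal(output_tokens) * output_rate)
    return total / 1000


# Example: train on 100,000 tokens, then use 10,000 input and 2,000 output tokens.
example_total = fine_tune_cost("gpt-3.5-turbo", 100_000, 10_000, 2_000)
```

Here training accounts for $0.80 of the $0.842 total.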
Embedding models
Build advanced search, clustering, topic modeling, and classification functionality with our embeddings offering.
Model | Usage |
text-embedding-3-small | $0.00002 / 1K tokens |
text-embedding-3-large | $0.00013 / 1K tokens |
ada v2 | $0.00010 / 1K tokens |
Base models
GPT base models are not optimized for instruction-following and are less capable, but they can be effective when fine-tuned for narrow tasks.
Model | Usage |
davinci-002 | $0.0020 / 1K tokens |
babbage-002 | $0.0004 / 1K tokens |
Other models
Image models
Build DALL·E directly into your apps to generate and edit novel images and art. DALL·E 3 is the highest quality model and DALL·E 2 is optimized for lower cost.
Model | Quality | Resolution | Price |
DALL·E 3 | Standard | 1024×1024 | $0.040 / image |
DALL·E 3 | Standard | 1024×1792, 1792×1024 | $0.080 / image |
DALL·E 3 | HD | 1024×1024 | $0.080 / image |
DALL·E 3 | HD | 1024×1792, 1792×1024 | $0.120 / image |
DALL·E 2 | | 1024×1024 | $0.020 / image |
DALL·E 2 | | 512×512 | $0.018 / image |
DALL·E 2 | | 256×256 | $0.016 / image |
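Image pricing depends on the combination of model, quality, and resolution rather than on tokens, so a lookup table is a natural way to express it. In the sketch below, the key strings (`"dall-e-3"`, `"hd"`, `"1024x1024"`, and so on) are labels chosen for illustration, and DALL·E 2 rows use `None` for quality because the table lists no quality tier for them.

```python
from decimal import Decimal
from typing import Optional

# (model, quality, resolution) -> USD per image, from the table above.
IMAGE_PRICES = {
    ("dall-e-3", "standard", "1024x1024"): Decimal("0.040"),
    ("dall-e-3", "standard", "1024x1792"): Decimal("0.080"),
    ("dall-e-3", "standard", "1792x1024"): Decimal("0.080"),
    ("dall-e-3", "hd", "1024x1024"): Decimal("0.080"),
    ("dall-e-3", "hd", "1024x1792"): Decimal("0.120"),
    ("dall-e-3", "hd", "1792x1024"): Decimal("0.120"),
    ("dall-e-2", None, "1024x1024"): Decimal("0.020"),
    ("dall-e-2", None, "512x512"): Decimal("0.018"),
    ("dall-e-2", None, "256x256"): Decimal("0.016"),
}


def image_cost(model: str, resolution: str,
               quality: Optional[str] = None, count: int = 1) -> Decimal:
    """Cost in USD for `count` images; raises KeyError for an unlisted combination."""
    return IMAGE_PRICES[(model, quality, resolution)] * count


# Example: ten standard 1024x1024 DALL-E 3 images.
batch_cost = image_cost("dall-e-3", "1024x1024", quality="standard", count=10)
```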
Audio models
Whisper can transcribe speech into text and translate many languages into English.
Text-to-speech (TTS) can convert text into spoken audio.
Model | Usage |
Whisper | $0.006 / minute (rounded to the nearest second) |
TTS | $0.015 / 1K characters |
TTS HD | $0.030 / 1K characters |
Please note that our Usage Policies require you to provide a clear disclosure to end users that the TTS voice they are hearing is AI-generated and not a human voice.
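Audio pricing uses two different units: minutes for Whisper and characters for TTS. The sketch below shows one way to estimate both. The helper names are hypothetical, and rounding exact half-seconds up is an assumption, since the table only says durations are rounded to the nearest second.

```python
from decimal import Decimal, ROUND_HALF_UP

# Rates copied from the table above.
WHISPER_PER_MINUTE = Decimal("0.006")   # USD per minute of audio
TTS_PER_1K_CHARS = Decimal("0.015")     # USD per 1K characters
TTS_HD_PER_1K_CHARS = Decimal("0.030")  # USD per 1K characters


def whisper_cost(duration_seconds: float) -> Decimal:
    """Transcription cost in USD, billing the duration rounded to whole seconds."""
    # Assumption: ties (x.5 seconds) round up.
    billed_seconds = Decimal(str(duration_seconds)).quantize(
        Decimal("1"), rounding=ROUND_HALF_UP)
    return billed_seconds * WHISPER_PER_MINUTE / 60


def tts_cost(characters: int, hd: bool = False) -> Decimal:
    """Text-to-speech cost in USD for a given number of input characters."""
    rate = TTS_HD_PER_1K_CHARS if hd else TTS_PER_1K_CHARS
    return Decimal(characters) * rate / 1000


# Example: a 90.4-second clip is billed as 90 seconds.
clip_cost = whisper_cost(90.4)
speech_cost = tts_cost(2500)
```

At these rates one second of Whisper audio costs $0.0001, so the 90-second clip comes to $0.009, and 2,500 characters of standard TTS come to $0.0375.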