Complete Guide to GPT-4 API [2024]
Oct 10, 2024

What Is the GPT-4 API? 

The GPT-4 API, offered by OpenAI, allows developers to integrate the advanced capabilities of the GPT-4 family of large language models (LLMs) into their applications. It provides a programmatic way to leverage OpenAI’s language models, including their ability to understand and generate human-like text, carry out natural conversation, translate between languages, analyze data, and generate code. 

The GPT-4 API provides access to different OpenAI model versions, including GPT-4o, GPT-4 Turbo, and the previous-generation GPT-3.5, enabling tailored performance for various use cases, from simple queries to processing extensive documents. At the time of this writing, all current GPT-4-series models available through the API offer a context window of 128K tokens.

The API supports not just text but also multimodal inputs, including files, images, and audio, which enhances its versatility. This capability is crucial for applications that require understanding and generating content across different types of media.

This is part of a series of articles about Large Language Models.

What ChatGPT Models Are Available via the API?

As of the time of this writing, the OpenAI API provides access to the following LLMs, which also power the popular ChatGPT platform:

  • GPT-4o: This is the latest model. It supports a context length of up to 128k tokens, suitable for processing long documents such as novels. GPT-4o can handle text, image, and audio inputs, making it versatile for complex tasks requiring multimodal data.
  • GPT-4o mini: A lighter version of GPT-4o, this model maintains the same 128k token context length and multimodal capabilities but is optimized for faster performance with a smaller computational footprint.
  • GPT-4 Turbo: The previous flagship GPT-4 model, also supporting a 128k-token context length. It handles text and image inputs, making it suitable for a wide range of applications that require high intelligence.
  • GPT-3.5: Designed for simpler, routine tasks, this model supports a shorter context length of 16k tokens. It processes text inputs and outputs efficiently and is ideal for tasks that do not require the capabilities of GPT-4 models.
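The context window determines how much a single request can carry: prompt tokens plus the requested output budget must fit inside it. As a rough illustration (the four-characters-per-token ratio below is only a common approximation; use a real tokenizer such as tiktoken for exact counts, and note the window sizes are taken from the list above):

```python
# Rough sanity check that a prompt fits a model's context window.
# NOTE: 4 characters per token is only a rule-of-thumb approximation.
CONTEXT_WINDOWS = {
    "gpt-4o": 128_000,
    "gpt-4o-mini": 128_000,
    "gpt-4-turbo": 128_000,
    "gpt-3.5-turbo": 16_000,
}

def fits_context(model: str, prompt: str, max_output_tokens: int) -> bool:
    """Return True if the estimated prompt tokens plus the requested
    output budget fit within the model's context window."""
    estimated_prompt_tokens = len(prompt) // 4  # rough heuristic
    return estimated_prompt_tokens + max_output_tokens <= CONTEXT_WINDOWS[model]

# A ~100,000-character document overflows GPT-3.5's 16k window:
print(fits_context("gpt-3.5-turbo", "x" * 100_000, 1_000))  # False
```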

OpenAI API Endpoints 

Here are the API endpoints available for GPT-4:

  • Completions (POST /v1/completions): Generates text based on a given prompt. Useful for writing assistance, content creation, and general-purpose text generation.
    Example body: { "model": "gpt-4o", "prompt": "Write a short story about a courageous cat.", "max_tokens": 100 }
  • Chat (POST /v1/chat/completions): Generates conversational responses, enabling applications to engage in dialogue with users.
    Example body: { "model": "gpt-4o", "messages": [{"role": "user", "content": "How do you bake a cake?"}], "max_tokens": 50 }
  • Edits (POST /v1/edits): Edits or refines provided text. Useful for improving grammar, tone, or content clarity.
    Example body: { "model": "gpt-4o", "input": "She walk to the store.", "instruction": "Correct the sentence.", "temperature": 0.5 }
  • Images (POST /v1/images/generations): Generates images from textual descriptions, useful for visual content creation and artistic applications.
    Example body: { "model": "dall-e-2", "prompt": "A fantasy castle on a hill at sunset", "n": 1, "size": "1024x1024" }
  • Embeddings (POST /v1/embeddings): Generates vector representations of text for tasks like semantic search and clustering.
    Example body: { "model": "text-embedding-ada-002", "input": "Understanding machine learning" }
  • Files (POST /v1/files): Manages files for use with fine-tuning and other API services. Supports uploading, listing, and deleting files.
    Example body: form-data with a file upload, e.g., a .jsonl file for fine-tuning
  • Moderation (POST /v1/moderations): Analyzes text for inappropriate content, aiding in content moderation.
    Example body: { "input": "Text content to be checked", "model": "text-moderation-latest" }
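All of these endpoints are called the same way: an HTTP POST with a JSON body and a bearer token in the Authorization header. As a minimal sketch using only the standard library (the build_chat_request helper is our own name, and no network call is made here):

```python
import json
import urllib.request

API_BASE = "https://api.openai.com"

def build_chat_request(api_key: str, model: str, messages: list) -> urllib.request.Request:
    """Assemble (but do not send) a POST request for /v1/chat/completions."""
    body = json.dumps({"model": model, "messages": messages}).encode("utf-8")
    return urllib.request.Request(
        url=API_BASE + "/v1/chat/completions",
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )

req = build_chat_request("sk-...", "gpt-4o", [{"role": "user", "content": "Hi"}])
print(req.full_url)  # https://api.openai.com/v1/chat/completions
```

Sending the request is then a matter of passing it to urllib.request.urlopen (or using the official openai library, shown later in this guide, which handles all of this for you).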

Accessing GPT-4 APIs

Here are the general steps involved in accessing the various GPT models through the OpenAI API:

  1. Sign up and payment: Create an OpenAI API account and complete a payment of $5 or more to unlock usage tier 1. This payment grants access to the GPT-4, GPT-4 Turbo, GPT-4o, and GPT-4o Mini models. When you start using the APIs, your usage tier might be increased, which influences rate limits (see below).
  2. Choosing the API: Depending on your needs, you can choose from multiple API endpoints, each providing different LLM functionality (as described above).
  3. Using the playground: For testing and experimentation, you can start with the OpenAI Playground. It allows you to interact with the models in a user-friendly interface, making it easier to understand their capabilities and adjust parameters before integrating them into your application.
  4. Integration: Once familiar with the models, integrate the chosen API into your application. The OpenAI API Reference provides detailed instructions and examples to help you get started with API calls, handling responses, and optimizing performance for your specific use case.

GPT API Rate Limits

Rate limits are restrictions imposed on the number of requests or tokens a user can process within a specified period. These limits ensure fair usage and prevent abuse of the API services. 

OpenAI measures rate limits in several ways:

  1. RPM (requests per minute): Limits the number of API requests that can be made in a minute.
  2. RPD (requests per day): Limits the total number of API requests that can be made in a day.
  3. TPM (tokens per minute): Limits the number of tokens processed per minute.
  4. TPD (tokens per day): Limits the total number of tokens processed per day.
  5. IPM (images per minute): Limits the number of images processed per minute.

For example, if your RPM limit is 20, you can send at most 20 requests per minute, even if those requests use few tokens. Conversely, you might exhaust a token limit before the request limit, for example by processing 150,000 tokens in only 15 requests.
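When a request exceeds a rate limit, the API responds with an HTTP 429 error, and the usual mitigation is to retry with exponential backoff. The sketch below is generic: RateLimitError and flaky_call are stand-ins for illustration, not the OpenAI SDK's own types:

```python
import time

class RateLimitError(Exception):
    """Stand-in for the 429 error an API client would raise."""

def with_backoff(fn, max_retries=5, base_delay=1.0):
    """Call fn(), retrying on RateLimitError with exponentially growing delays."""
    for attempt in range(max_retries):
        try:
            return fn()
        except RateLimitError:
            if attempt == max_retries - 1:
                raise  # out of retries, surface the error
            time.sleep(base_delay * (2 ** attempt))  # 1s, 2s, 4s, ...

# Demo: a fake API call that is rate-limited twice, then succeeds.
calls = {"n": 0}
def flaky_call():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RateLimitError()
    return "ok"

print(with_backoff(flaky_call, base_delay=0.01))  # ok
```

Production code usually also adds random jitter to the delay so that many clients do not retry in lockstep.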

Rate and usage limits can be viewed in account settings under the Limits section. Usage tiers automatically adjust based on API usage and spending, generally increasing rate limits across models as you move up tiers:

| Tier   | Qualification         | Usage Limits    |
|--------|-----------------------|-----------------|
| Free   | Allowed geography     | $100 / month    |
| Tier 1 | $5 paid               | $100 / month    |
| Tier 2 | $50 paid + 7 days     | $500 / month    |
| Tier 3 | $100 paid + 7 days    | $1,000 / month  |
| Tier 4 | $250 paid + 14 days   | $5,000 / month  |
| Tier 5 | $1,000 paid + 30 days | $50,000 / month |

As of the time of this writing, these are the API rate limits for the primary OpenAI models:

|                             | gpt-3.5-turbo  | dall-e-3    | gpt-4o        | gpt-4o-mini    | gpt-4-turbo |
|-----------------------------|----------------|-------------|---------------|----------------|-------------|
| Free tier RPM               | 3              | 1 img/min   | –             | –              | –           |
| Free tier RPD               | 200            | –           | –             | –              | –           |
| Free tier TPM               | 40,000         | –           | –             | –              | –           |
| Free tier batch queue limit | 200,000        | –           | –             | –              | –           |
| Tier 1 RPM                  | 3,500          | 5 img/min   | 500           | 500            | 500         |
| Tier 1 RPD                  | 10,000         | –           | –             | 10,000         | –           |
| Tier 1 TPM                  | 200,000        | –           | 30,000        | 200,000        | 30,000      |
| Tier 1 batch queue limit    | 2,000,000      | –           | 90,000        | 2,000,000      | 90,000      |
| Tier 2 RPM                  | 3,500          | 7 img/min   | 5,000         | 5,000          | 5,000       |
| Tier 2 TPM                  | 2,000,000      | –           | 450,000       | 2,000,000      | 450,000     |
| Tier 2 batch queue limit    | 5,000,000      | –           | 1,350,000     | 20,000,000     | 1,350,000   |
| Tier 3 RPM                  | 3,500          | 7 img/min   | 5,000         | 5,000          | 5,000       |
| Tier 3 TPM                  | 4,000,000      | –           | 800,000       | 4,000,000      | 600,000     |
| Tier 3 batch queue limit    | 100,000,000    | –           | 50,000,000    | 40,000,000     | 40,000,000  |
| Tier 4 RPM                  | 10,000         | 15 img/min  | 10,000        | 10,000         | 10,000      |
| Tier 4 TPM                  | 10,000,000     | –           | 2,000,000     | 10,000,000     | 800,000     |
| Tier 4 batch queue limit    | 1,000,000,000  | –           | 200,000,000   | 1,000,000,000  | 80,000,000  |
| Tier 5 RPM                  | 10,000         | 200 img/min | 10,000        | 30,000         | 10,000      |
| Tier 5 TPM                  | 50,000,000     | –           | 30,000,000    | 150,000,000    | 2,000,000   |
| Tier 5 batch queue limit    | 10,000,000,000 | –           | 5,000,000,000 | 15,000,000,000 | 300,000,000 |

GPT-4 API Pricing 

GPT-4o Pricing

Standard pricing

  • Input tokens: $5.00 per 1M tokens
  • Output tokens: $15.00 per 1M tokens

Batch API pricing

  • Input tokens: $2.50 per 1M tokens
  • Output tokens: $7.50 per 1M tokens
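At these rates, estimating per-request cost is straightforward arithmetic: tokens divided by one million, multiplied by the per-million price. A small sketch with the GPT-4o prices above hard-coded (prices change over time, so treat this as illustrative):

```python
# GPT-4o per-1M-token prices from the table above (subject to change).
PRICES = {
    "standard": {"input": 5.00, "output": 15.00},
    "batch": {"input": 2.50, "output": 7.50},
}

def gpt4o_cost(input_tokens: int, output_tokens: int, tier: str = "standard") -> float:
    """Estimated USD cost of one GPT-4o request."""
    p = PRICES[tier]
    return input_tokens / 1_000_000 * p["input"] + output_tokens / 1_000_000 * p["output"]

# 10,000 input tokens and 2,000 output tokens at standard rates:
print(round(gpt4o_cost(10_000, 2_000), 4))  # 0.08
```

Note how the Batch API halves both sides of the bill, which adds up quickly for offline workloads.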

GPT-4o Mini Pricing

GPT-4o Mini offers a more cost-efficient alternative with similar capabilities in a lighter package. 

Standard pricing

  • Input tokens: $0.150 per 1M tokens
  • Output tokens: $0.600 per 1M tokens

Batch API pricing

  • Input tokens: $0.075 per 1M tokens
  • Output tokens: $0.300 per 1M tokens

Pricing for Embedding Models

The embedding models are priced to support search, clustering, and classification functions.

text-embedding-3-small

  • Standard pricing: $0.020 per 1M tokens
  • Batch API pricing: $0.010 per 1M tokens

text-embedding-3-large

  • Standard pricing: $0.130 per 1M tokens
  • Batch API pricing: $0.065 per 1M tokens

ada v2

  • Standard pricing: $0.100 per 1M tokens
  • Batch API pricing: $0.050 per 1M tokens

Pricing for Fine-tuning Models

Fine-tuning allows you to customize models to meet specific application needs, with pricing as follows. At the time of this writing, fine-tuning for GPT-4o and GPT-4 Turbo is available as early access. You can request access in the fine-tuning UI.

gpt-4o-mini-2024-07-18

Standard pricing

  • Input tokens: $0.30 per 1M tokens
  • Output tokens: $1.20 per 1M tokens
  • Training tokens: $3.00 per 1M tokens

Batch API pricing

  • Input tokens: $0.15 per 1M tokens
  • Output tokens: $0.60 per 1M tokens

gpt-3.5-turbo

Standard pricing

  • Input tokens: $3.00 per 1M tokens
  • Output tokens: $6.00 per 1M tokens
  • Training tokens: $8.00 per 1M tokens

Batch API pricing

  • Input tokens: $1.50 per 1M tokens
  • Output tokens: $3.00 per 1M tokens
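The training-token rate above also makes it easy to estimate what a fine-tuning job will cost: OpenAI's documentation describes training cost as roughly the number of tokens in the training file multiplied by the number of epochs, at the per-1M training rate. A hedged sketch (the function name is our own, and rates change over time):

```python
def training_cost(tokens_in_file: int, epochs: int, price_per_million: float = 8.00) -> float:
    """Estimated USD cost to fine-tune: file tokens x epochs x per-1M rate.
    Defaults to the gpt-3.5-turbo training rate listed above."""
    return tokens_in_file * epochs / 1_000_000 * price_per_million

# A 500,000-token training file run for 3 epochs at gpt-3.5-turbo rates:
print(training_cost(500_000, 3))  # 12.0
```

Swap in 3.00 for price_per_million to estimate the same job at the gpt-4o-mini training rate.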

Tutorial: Getting Started with OpenAI API 

Account Setup

To get started with the OpenAI API, first create an OpenAI account or sign in. Navigate to the API key page and select Create new secret key, optionally naming the key. Make sure to save this somewhere safe and do not share it with anyone.

Setting Up Python

OpenAI provides a custom Python library which makes working with the OpenAI API in Python simple and efficient.

To use the OpenAI Python library, ensure you have Python installed. To download Python, visit the official Python website and download the latest version. You need Python 3.7.1 or newer.

Installing the OpenAI Python Library

Once you have Python 3.7.1 or newer installed and (optionally) set up a virtual environment, the OpenAI Python library can be installed. From the terminal / command line, run:

pip install --upgrade openai

Running pip list will show you the installed Python libraries, confirming that the OpenAI Python library was successfully installed.

Sending Your First API Request

After configuring Python and setting up your API key, the final step is to send a request to the OpenAI API using the Python library. Create a file named openai-test.py using the terminal or an IDE.

Inside the file, copy and paste the example below:

from openai import OpenAI
client = OpenAI()

completion = client.chat.completions.create(
  model="gpt-4o-mini",
  messages=[
    {"role": "system", "content": "You are a positive assistant, skilled in explaining technical concepts while making users feel better."},
    {"role": "user", "content": "Explain the concept of horizontal scalability."}
  ]
)

print(completion.choices[0].message)

To run the code, enter python openai-test.py into the terminal / command line.

The response should look something like this:

{
  "choices": [
    {
      "message": {
        "role": "assistant",
        "content": "Horizontal scalability refers to the ability of a system to increase capacity by connecting multiple hardware or software entities so that they work as a single logical unit. Imagine you could scale your software as much as you need without any limitation, what a great feeling!"
      }
    }
  ]
}
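The assistant's reply is nested under choices[0].message.content. Treating the JSON above as a plain Python dictionary (the SDK actually returns objects with attribute access, i.e. completion.choices[0].message.content), extraction looks like this:

```python
# Extract the reply from a response shaped like the JSON above.
response = {
    "choices": [
        {
            "message": {
                "role": "assistant",
                "content": "Horizontal scalability refers to the ability of a system ...",
            }
        }
    ]
}

reply = response["choices"][0]["message"]["content"]
print(reply)
```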

AI Testing & Validation with Kolena

Kolena is an AI/ML testing & validation platform that solves one of AI’s biggest problems: the lack of trust in model effectiveness. The use cases for AI are enormous, but AI lacks trust from both builders and the public. It is our responsibility to build that trust with full transparency and explainability of ML model performance, not just from a high-level aggregate ‘accuracy’ number, but from rigorous testing and evaluation at scenario levels.

With Kolena, machine learning engineers and data scientists can uncover hidden machine learning model behaviors, easily identify gaps in test data coverage, and truly learn where and why a model is underperforming, all in minutes, not weeks. Kolena's AI/ML model testing and validation solution helps developers build safe, reliable, and fair systems by allowing companies to instantly stitch together razor-sharp test cases from their data sets, enabling them to scrutinize AI/ML models in the precise scenarios those models will be unleashed upon in the real world. The Kolena platform transforms AI development from an experimental practice into an engineering discipline that can be trusted and automated.

Learn more about Kolena