Chat Completions API

So far, you might have interacted with ChatGPT through the web interface provided by OpenAI, but similar results and outputs can be achieved using the OpenAI APIs. In the upcoming segments, you will learn how to work with OpenAI models in a Python environment using Jupyter Notebook or Google Colab.

Application programming interfaces (APIs) are essential tools that enable different software applications, systems and services to communicate and interact with one another programmatically. They serve as bridges between different software components, allowing them to share data, functionality and services seamlessly. In this program, you will learn how to use APIs such as OpenAI APIs to build LLM-based applications. If you are curious to learn more about APIs in general, you can refer to this article.

The notebook used in the video can be accessed here.

In the video, the SME walked you through the steps to set up a working environment in Google Colab for working with OpenAI APIs. Google Colab is a powerful cloud-based platform that allows data scientists, researchers and developers to run Python code and perform data analysis, with the added advantage of free GPUs and TPUs. One of the most significant advantages of using Google Colab is its seamless integration with Google Drive, allowing users to store, access and share datasets, notebooks and other files effortlessly.

As mentioned in the video, use the following code to read files stored in your Google Drive:

Example

Python
from google.colab import drive
drive.mount('/content/drive')

Output

Alternatively, you can also upload files to the Google Colab environment from your personal computer. The code for this method is:

Example

Python
from google.colab import files
uploaded = files.upload()

Output

However, please note that accessing files through your Google Drive ensures that your files are readily available and stored for later use. You can then reload these files even if your Colab notebook session or kernel terminates. Files that are uploaded manually are deleted when the runtime is terminated, and you will lose any data saved in that session.

Once you have set up your notebook, you can then install the official OpenAI Python library by running the following command:

Example

Python
!pip install openai

Output

To work with OpenAI APIs, you will need to use a secret key. For detailed steps on how to create and work with secret keys, refer to the documentation given in the previous segment, ‘OpenAI API Instructions and Best Practices’.
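As a minimal sketch (the helper name is our own, not part of the OpenAI library), you can keep the secret key out of your notebook by reading it from an environment variable:

```python
import os

# Hypothetical helper: fetch the secret key from an environment variable
# instead of hard-coding it in the notebook.
def get_openai_key():
    key = os.environ.get("OPENAI_API_KEY")
    if key is None:
        raise RuntimeError("Set the OPENAI_API_KEY environment variable first.")
    return key
```

The returned key can then be assigned to the library, for example `openai.api_key = get_openai_key()`, before making any API calls.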

OpenAI offers different API endpoints for performing various NLP tasks. The most common API you will come across is the Chat Completions API, which is used for performing language tasks. In the next video, your SME will provide more details about the Chat Completions API offered by OpenAI. For the complete list of APIs provided by OpenAI, refer to this link.

As mentioned in the video, the Chat Completions API follows the format shown below.

Example

Python
## Basic Chat Completions request to OpenAI
import openai

# `message` holds the conversation history passed to the model
message = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Explain indexing in Pandas."}
]

chat_response = openai.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=message,
    max_tokens=200,
    temperature=0.5,
    n=1,
    stop=None,
    frequency_penalty=0,
    presence_penalty=0)

Output

The three main roles in the messages list are:

  • System: The system role sets the overall behaviour of the assistant.
  • User: The user role represents the end user interacting with the chatbot.
  • Assistant: The assistant role represents the chatbot’s own replies.
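As an illustration (the topic and wording here are our own, not from the video), a messages list using all three roles might look like this; earlier assistant turns are included when you want the model to see its prior replies:

```python
# Illustrative messages list showing the three roles
message = [
    {"role": "system", "content": "You are a concise Python tutor."},
    {"role": "user", "content": "What does df.loc do in Pandas?"},
    {"role": "assistant", "content": "df.loc selects rows and columns by label."},
    {"role": "user", "content": "And what does df.iloc do?"},
]
```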

Note

You may find that while using the code, you receive a completely different output from what is shown in the video. This is because generative models, such as GPT-3.5 and GPT-4, are inherently stochastic and non-deterministic; that is, you may receive different outputs for the same input.

While working with the Chat Completions APIs, you need to pay attention to the following parameters, as mentioned in the documentation:

  • model: The GPT version and model you want to use.
  • max_tokens: Refers to the maximum number of tokens to be generated in the model’s response.
  • temperature: The sampling temperature is a number between 0 (most certain/deterministic) and 2 (most random) and defaults to 1; it controls the randomness in choosing the next tokens.
  • NOTE: In the video above, at 3:00 the SME mentions the temperature value ranges from 0 to 1. This is incorrect. The temperature value ranges from 0 to 2 and defaults to 1. The significance of temperature remains the same as explained by the SME.
  • n: The number of chat completion choices to generate for each input message.
  • stop: Up to 4 sequences where the API will stop generating further tokens; the returned text will not contain the stop sequence.
  • frequency_penalty and presence_penalty: These are used to reduce the likelihood of sampling repetitive sequences of tokens. The recommended values for the penalty coefficients are approximately 0.1 to 1 if the aim is to just reduce repetitive tokens in the output response. If the aim is to strongly suppress repetition, the coefficients can be increased up to 2, but this results in decreased sample quality.

For more information about the various parameters used in the completions API, refer to the official API reference documentation here. 

Additionally, the conversation history, encoded as a JSON object, is provided to the model to configure the most likely response among a selection of generated chat completions. The Chat Completions API supports both single-turn conversation and multi-turn conversation and also supports some of the more recent models, such as GPT-3.5-turbo and GPT-4, which makes the Chat Completions API a de-facto endpoint for most NLP requirements of an application.
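A sketch of how conversation history is maintained across turns is shown below (the reply text is a placeholder; in a real call, `assistant_reply` would come from `chat_response.choices[0].message.content`):

```python
# Sketch: maintaining conversation history for a multi-turn exchange
history = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is a Pandas DataFrame?"},
]

# Placeholder standing in for the model's actual reply
assistant_reply = "A DataFrame is a two-dimensional labelled data structure."
history.append({"role": "assistant", "content": assistant_reply})

# The next user turn is appended to the same list before the next API call,
# so the model sees the full conversation
history.append({"role": "user", "content": "How do I select a single column?"})
```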

In the first example, we ask the model to generate text on a particular topic. The model is able to generalise and provide a detailed output on the topic of indexing in Pandas. Adding additional content and details to a prompt steers the model towards the desired output. You may refer to the following page for a list of such prompts and roles.

We recommend using OpenAI Playground, a sandboxed environment, for testing the various models offered by OpenAI and viewing the effect of the model parameters on the output responses. The Playground provides an isolated environment to test your prompts and validate the model outputs, and it also supports exporting prompts to Python code.

Now that you have an understanding of the parameters, you can answer the questions below.

In the next video, we will dive a bit deeper into this by asking the model to generate text on multiple topics, customising our prompts for dynamic inputs using a prompt template.
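As a minimal sketch of the idea (the template wording and function name are illustrative, not from the video), a prompt template substitutes a dynamic input into a fixed prompt string:

```python
# A minimal prompt template for dynamic inputs (names are illustrative)
TEMPLATE = "Explain the concept of {topic} in Pandas with a short code example."

def build_messages(topic):
    return [
        {"role": "system", "content": "You are a helpful Python tutor."},
        {"role": "user", "content": TEMPLATE.format(topic=topic)},
    ]
```

The resulting list can be passed as the `messages` argument of a Chat Completions request for any topic.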

So far, you have used the Chat Completions API for a single-turn conversation. In the upcoming segment, you will work on multi-turn conversations.

Additional Readings:

  • This article explores how to use the temperature and top_p parameters for various tasks – Mastering Temperature and Top_p in ChatGPT API
  • With the advent of Generative AI models, prompt engineering has become a highly sought-after skill. You may refer to the following page to read about the prompts and roles that have been tried and tested by a community of experts. 
  • This article explains the nuances of the temperature and top_p parameters of OpenAI.