Sentiment Analysis using OpenAI GPT-3 API

In this article, we will cover how to perform sentiment analysis using the OpenAI GPT-3 API and Google Colab.

Darren Ong · 4 minute read

Introduction

Sentiment analysis identifies whether a piece of text is positive, negative or neutral. Some of its use cases include:

  • Monitoring social media to track how people feel about a certain brand and topic.
  • Discovering negative customer support tickets as they are submitted so they can be addressed quickly.

In this tutorial, we will cover how to use the OpenAI GPT-3 API to perform sentiment analysis with a minimal amount of code and setup.

Topics to be covered

In this tutorial, we will prepare the Sentiment140 dataset from Kaggle in a Jupyter Notebook on Colab, upload it to the OpenAI GPT-3 API as labelled examples, and perform a prediction.

Download Required Dependencies

!pip install kaggle pandas jsonlines openai

Download Kaggle Dataset

In order to download the Kaggle dataset directly from Jupyter Notebook, we can use the kaggle command line tool. Before being able to do so, we will need to get the Kaggle API Key.

  1. Go to the Kaggle User Profile page, click the profile picture and select "Your Profile".

Kaggle User Profile Page

  2. Click the "Account" tab, then click "Create New API Token". Your browser will automatically download a "kaggle.json" file containing your Kaggle credentials.

In the Google Colab notebook, upload the kaggle.json file by right-clicking in the "Files" tab and clicking the "Upload" button.

Colab Drag and Drop

Add the following code to the notebook. It creates the .kaggle folder in the home directory, copies the uploaded kaggle.json into it, and restricts the file's permissions so that only the current user can read it.

!mkdir -p ~/.kaggle
!cp kaggle.json ~/.kaggle/
!chmod 600 ~/.kaggle/kaggle.json

After placing the credentials in the notebook environment, we can run this code to download the dataset from Kaggle into Google Colab.

!kaggle datasets download -d kazanova/sentiment140
!unzip sentiment140.zip

Transforming the dataset

Before uploading the sentiment dataset to OpenAI, we will need to transform the dataset into a jsonl file with the format:

{"text": "Sentiment Text", "label": "Positive|Negative|Neutral"}
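
As a minimal sketch of what this format looks like on disk, assuming Python's standard json module: each record is a single JSON object on its own line, with double-quoted keys and string values. The two records below are made-up examples, not rows from the dataset.

```python
import json

# Two hypothetical example records (not taken from Sentiment140).
records = [
    {"text": "I love this phone", "label": "Positive"},
    {"text": "The battery died after a day", "label": "Negative"},
]

# JSONL is one JSON object per line, separated by newlines.
jsonl_text = "\n".join(json.dumps(record) for record in records)
print(jsonl_text)
```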

Read the file using pandas read_csv:

import pandas as pd
emotion_df = pd.read_csv('training.1600000.processed.noemoticon.csv', names=['label', 'id', 'date', 'flag', 'user', 'text'], encoding='ISO-8859-1')
# Visualize first 5 rows
emotion_df.head()

In this dataset, the sentiment label is 0, 2 or 4: 0 means Negative, 2 means Neutral and 4 means Positive.

Kaggle Labelled Dataset Sentiment
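
Before mapping the labels, it can help to check which label values actually occur in the file and how often. The sketch below uses only the standard library; `count_labels` is a hypothetical helper and the sample rows are made up in the same six-column layout as the CSV.

```python
import csv
import io
from collections import Counter

def count_labels(csv_file):
    """Tally the first column (the numeric sentiment label) of a CSV file object."""
    reader = csv.reader(csv_file)
    return Counter(int(row[0]) for row in reader if row)

# Made-up rows in the layout: label, id, date, flag, user, text.
sample = io.StringIO(
    '"0","1","Mon Apr 06","NO_QUERY","user_a","so tired today"\n'
    '"4","2","Mon Apr 06","NO_QUERY","user_b","great coffee this morning"\n'
    '"0","3","Mon Apr 06","NO_QUERY","user_c","missed my bus"\n'
)
label_counts = count_labels(sample)
print(label_counts)
```

With the DataFrame loaded above, `emotion_df['label'].value_counts()` gives the same kind of tally directly in pandas.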

Run this code to convert each numeric label to the corresponding text label (Negative, Neutral or Positive):

def convert_labels(label):
  if label == 0:
    return 'Negative'
  if label == 2:
    return 'Neutral'
  if label == 4:
    return 'Positive'
emotion_df['label'] = emotion_df['label'].apply(convert_labels)
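
An alternative sketch of the same mapping uses a dictionary lookup, so that an unexpected value raises an error instead of silently becoming None. The names `LABEL_NAMES` and `convert_label` are illustrative and not part of the original notebook.

```python
# Numeric Sentiment140 labels and their text equivalents.
LABEL_NAMES = {0: "Negative", 2: "Neutral", 4: "Positive"}

def convert_label(label):
    """Map a numeric label to its text name, failing loudly on unknown values."""
    try:
        return LABEL_NAMES[int(label)]
    except KeyError:
        raise ValueError(f"Unexpected sentiment label: {label!r}") from None

print([convert_label(value) for value in (0, 2, 4)])
```

It can be applied in the same way as before, for example with `emotion_df['label'].apply(convert_label)`.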

Next, we write the labelled text examples to a jsonl file using the code below:

import jsonlines
# Write one JSON object per row, keeping only the text and its label
with jsonlines.open('train.jsonl', mode='w') as writer:
  for row in emotion_df.itertuples(index=False):
    writer.write({
      'text': row.text,
      'label': row.label
    })
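
Before uploading, the file can be read back to confirm that every line parses and carries the expected keys and labels. This is a minimal sketch using the standard json module; `validate_jsonl` and `ALLOWED_LABELS` are hypothetical names, and the demonstration runs on a small throwaway file rather than on train.jsonl.

```python
import json
import os
import tempfile

ALLOWED_LABELS = {"Positive", "Negative", "Neutral"}

def validate_jsonl(path):
    """Return the number of valid records; raise ValueError on the first bad line."""
    count = 0
    with open(path, encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            record = json.loads(line)
            if set(record) != {"text", "label"}:
                raise ValueError(f"line {line_number}: unexpected keys {sorted(record)}")
            if record["label"] not in ALLOWED_LABELS:
                raise ValueError(f"line {line_number}: unexpected label {record['label']!r}")
            count += 1
    return count

# Demonstrate on a small throwaway file with made-up records.
with tempfile.TemporaryDirectory() as tmp_dir:
    sample_path = os.path.join(tmp_dir, "sample.jsonl")
    with open(sample_path, "w", encoding="utf-8") as handle:
        handle.write(json.dumps({"text": "good day", "label": "Positive"}) + "\n")
        handle.write(json.dumps({"text": "bad day", "label": "Negative"}) + "\n")
    valid_count = validate_jsonl(sample_path)

print(valid_count)
```

In the notebook, the same check would be `validate_jsonl('train.jsonl')`.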

Get the OpenAI API Key

After preparing the jsonl file that will be used as the prelabelled dataset, we need to upload it to OpenAI using the OpenAI API and then make another call to classify a piece of text.

Register for OpenAI account

Before using the OpenAI GPT-3 API, we need to get an API key. Register for an OpenAI account at https://beta.openai.com/signup.

Get the API Key

  1. Go to https://beta.openai.com/account/api-keys.
  2. Copy the "Secret Key" by clicking the "Copy" button.

Reveal API Key for OpenAI API

Upload file to OpenAI API

In the code below, we set the OpenAI API key and upload the file to OpenAI using the openai.File.create() method. Documentation for the file API can be found in the OpenAI API reference.

import openai
openai.api_key='<API_KEY>'
result = openai.File.create(file=open("train.jsonl"), purpose="classifications")
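
Hard-coding the key in a shared notebook makes it easy to leak. A minimal sketch of reading it from an environment variable instead is shown below; the helper `load_api_key` and the variable names `OPENAI_API_KEY` and `DEMO_API_KEY` are assumptions chosen for illustration.

```python
import os

def load_api_key(env_var="OPENAI_API_KEY"):
    """Read an API key from the environment, failing clearly if it is missing."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return key

# Stand-in value so the sketch runs on its own; not a real key.
os.environ["DEMO_API_KEY"] = "sk-demo"
print(load_api_key("DEMO_API_KEY"))
```

In the notebook, the assignment above would then become `openai.api_key = load_api_key()`.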

Performing classification

To perform the sentiment analysis, we call the openai.Classification.create method with the uploaded file's ID, the text to classify and the list of possible labels.

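# The upload response contains the ID that identifies the file on OpenAI
file_id = result['id']
# Store the response so the predicted label can be read from it
prediction = openai.Classification.create(
  search_model="ada",
  model="curie",
  file=file_id,
  query="Crypto is crashing hard",
  labels=["Positive", "Negative", "Neutral"],
)
predicted_label = prediction['label']
print('Predicted label is: {}'.format(predicted_label))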
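
To label several texts, the call above can be wrapped in a loop. The sketch below takes the classifier as a parameter so that it runs without network access; `classify_many` and `fake_classify` are hypothetical, and the stub only mimics a response with a 'label' key like the one read above.

```python
def classify_many(texts, classify):
    """Call classify(text) for each text and collect the returned 'label' values."""
    results = {}
    for text in texts:
        response = classify(text)
        results[text] = response["label"]
    return results

def fake_classify(text):
    # Stand-in for the API call: a simple keyword rule, not a real model.
    return {"label": "Negative" if "crash" in text.lower() else "Neutral"}

labels = classify_many(["Crypto is crashing hard", "Markets opened today"], fake_classify)
print(labels)
```

In the notebook, `classify` would be a small function that calls openai.Classification.create with `query=text` and the same remaining arguments as above.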

Conclusion

The OpenAI GPT-3 API allows developers to quickly build a sentiment tool that automatically identifies the sentiment label for a given text. Besides that, the GPT-3 API can also perform other tasks such as Question Answering and Search.

The full code on Google Colab can be found at https://colab.research.google.com/drive/1svU2R5aIqjKf_7tQZrAi2Vs7vso-yqvS?usp=sharing
