Using OpenAI APIs to Solve Sentiment Analysis Problems in 20 Lines of Code

The simplest way to handle sentiment analysis with a large language model is to use its Embedding API. This API turns any text you give it into a vector, that is, a fixed-length list of numbers that represents the text as the model sees it.

First, we compute the embeddings of the phrases “positive review” and “negative review.” Then, for the text we want to classify, we compute its embedding and use cosine similarity to measure how close it is to each phrase. Subtracting the similarity to “negative review” from the similarity to “positive review” gives a final score. If the score is greater than 0, the text is closer to “positive review” and is most likely positive; otherwise, it is more likely negative.
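To make the scoring rule concrete before calling any API, here is a minimal offline sketch. The three-dimensional vectors are invented for illustration (real embeddings have far more dimensions), and the helper name `sentiment_score` is my own, not part of any library.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def sentiment_score(text_vec, positive_vec, negative_vec):
    """Similarity to the positive anchor minus similarity to the negative anchor."""
    return cosine_similarity(text_vec, positive_vec) - cosine_similarity(text_vec, negative_vec)


# Made-up toy vectors standing in for real embeddings (assumption).
positive_anchor = [1.0, 0.0, 0.0]
negative_anchor = [0.0, 1.0, 0.0]
happy_text = [0.9, 0.1, 0.2]
unhappy_text = [0.1, 0.8, 0.3]

for name, vec in [("happy_text", happy_text), ("unhappy_text", unhappy_text)]:
    score = sentiment_score(vec, positive_anchor, negative_anchor)
    label = "positive" if score > 0 else "negative"
    print(f"{name}: score={score:+.3f} -> {label}")
```

Note that a higher cosine similarity means the two vectors point in more similar directions, so "closer" here means a larger similarity value.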

Below, I will use this method to analyze two Amazon Lego toy reviews.

The code used for sentiment analysis is only 20 lines. Let’s see if it can quickly perform sentiment analysis on these two reviews:


import os

import numpy as np
from openai import OpenAI

# The API key is read from the OPENAI_API_KEY environment variable.
client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))
EMBEDDING_MODEL = "text-embedding-ada-002"

def get_embedding(text, model):
    # Replace newlines with spaces before sending the text to the API.
    text = text.replace("\n", " ")
    return client.embeddings.create(input=[text], model=model).data[0].embedding

def cosine_similarity(a, b):
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# Get the embeddings of "Positive Review" and "Negative Review"
positive_review = get_embedding("Positive Review", EMBEDDING_MODEL)
negative_review = get_embedding("Negative Review", EMBEDDING_MODEL)

# Get the embeddings of the two Lego reviews
positive_example = get_embedding(
    "Overall, packaging was very nice. Purchased this as a gift for a kid's birthday party. Loved the detachable book on the front of the box. The kid should be extremely pleased with the item.", EMBEDDING_MODEL)
negative_example = get_embedding(
    "I purchased this product for a birthday gift. The Lego box was shipped in a bag and the box was all dented and damaged. No time to make a return as it was a gift.", EMBEDDING_MODEL)

def get_score(sample_embedding):
    # Similarity to "Positive Review" minus similarity to "Negative Review"
    return cosine_similarity(sample_embedding, positive_review) - cosine_similarity(sample_embedding, negative_review)

positive_score = get_score(positive_example)
negative_score = get_score(negative_example)
print("Positive rating : %f" % (positive_score))
print("Negative rating : %f" % (negative_score))

Result:


Positive rating : 0.080185
Negative rating : -0.054586

As we expected, the positive review of the product obtained a score greater than 0 through the Embedding similarity calculation, while the negative review scored less than 0.

Isn’t this a particularly simple approach? Let me take the earlier sentences evaluating coffee as an example to see whether it works equally well.


import os

import numpy as np
from openai import OpenAI

# The API key is read from the OPENAI_API_KEY environment variable.
client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))
EMBEDDING_MODEL = "text-embedding-ada-002"

def get_embedding(text, model=EMBEDDING_MODEL):
    # Replace newlines with spaces before sending the text to the API.
    text = text.replace("\n", " ")
    return client.embeddings.create(input=[text], model=model).data[0].embedding

def cosine_similarity(a, b):
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# Get the embeddings of "Positive Review" and "Negative Review"
positive_review = get_embedding("Positive Review")
negative_review = get_embedding("Negative Review")

# Get the embeddings of the two coffee sentences
positive_example = get_embedding(
    "The coffee at this cafe is exceptional, hardly disappointing.")
negative_example = get_embedding(
    "The coffee at this cafe is disappointing, hardly exceptional.")

def get_score(sample_embedding):
    # Similarity to "Positive Review" minus similarity to "Negative Review"
    return cosine_similarity(sample_embedding, positive_review) - cosine_similarity(sample_embedding, negative_review)

positive_score = get_score(positive_example)
negative_score = get_score(negative_example)
print("Positive rating : %f" % (positive_score))
print("Negative rating : %f" % (negative_score))

Result:


Positive rating : 0.040689
Negative rating : -0.116913

Similarly, we got the right results.

Real Examples on Larger Datasets

The above examples seem to work well. Could this be a coincidence? Let’s take a real dataset to verify and use the OpenAI Embedding API for sentiment analysis to see if we can get the expected results.

The following code is from a sample in the OpenAI Cookbook. It uses the same method to judge user reviews of food products on Amazon. The review data contains the review text and the star rating each user gave. The star ratings give us an indirect way to check whether our sentiment analysis is accurate: we treat reviews with 1–2 stars as negative and 4–5 stars as positive, and leave out 3-star reviews.

First, I load the dataset, a CSV file, into memory using Pandas. To avoid unnecessary API calls, the dataset already contains the embedding vector for each review, so nothing needs to be recomputed.


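import ast

import numpy as np
import pandas as pd
from sklearn.metrics import classification_report

datafile_path = "./data/fine_food_reviews_with_embeddings_1k.csv"
df = pd.read_csv(datafile_path)
# Embeddings are stored as strings in the CSV; parse them back into arrays.
# ast.literal_eval only accepts literals, so it is safer than eval here.
df["embedding"] = df.embedding.apply(ast.literal_eval).apply(np.array)

# Convert the 5-star rating to a binary sentiment; 3-star reviews are dropped.
df = df[df.Score != 3].copy()
df["sentiment"] = df.Score.replace({1: "negative", 2: "negative", 4: "positive", 5: "positive"})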
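The idea of saving embeddings so they are never computed twice can be sketched in a few lines. This is only an illustration under my own assumptions: `fake_embed` is a hypothetical stand-in for a real embedding call, and the cache is a JSON file rather than the CSV column used by the dataset above.

```python
import json
import os
import tempfile


def load_cache(path):
    """Return the cached {text: embedding} mapping, or an empty dict if no file exists."""
    if os.path.exists(path):
        with open(path, "r", encoding="utf-8") as f:
            return json.load(f)
    return {}


def save_cache(cache, path):
    """Write the {text: embedding} mapping to disk as JSON."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(cache, f)


def get_embedding_cached(text, cache, embed_fn):
    """Call embed_fn only for texts that are not in the cache yet."""
    if text not in cache:
        cache[text] = embed_fn(text)
    return cache[text]


# Hypothetical stand-in for a real embedding call; it counts how often it runs.
call_count = 0


def fake_embed(text):
    global call_count
    call_count += 1
    return [float(len(text)), float(text.count(" "))]


cache_path = os.path.join(tempfile.mkdtemp(), "embeddings_cache.json")
cache = load_cache(cache_path)
for review in ["great snack", "stale chips", "great snack"]:
    get_embedding_cached(review, cache, fake_embed)
save_cache(cache, cache_path)

# A later run reloads the file and needs no new calls for texts it already knows.
reloaded = load_cache(cache_path)
get_embedding_cached("great snack", reloaded, fake_embed)
print("embedding calls made:", call_count)
```

With a real API behind `embed_fn`, each distinct text would be billed once, no matter how often the analysis is rerun.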

For each review, I compare its embedding with a predefined “positive review” and “negative review” using the previous method and see which of the two it is closer to. The label texts I use here are slightly longer: “An Amazon review with a positive sentiment.” and “An Amazon review with a negative sentiment.”, respectively.
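Seen another way, the rule simply picks the label whose embedding is most similar to the review. Here is a small sketch of that view with invented two-dimensional vectors; `nearest_label` is a hypothetical helper of mine. With exactly two labels, it gives the same answer as checking whether the score difference is above 0.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest_label(review_vec, label_vecs):
    """Return the label whose vector has the highest cosine similarity to the review."""
    return max(label_vecs, key=lambda label: cosine_similarity(review_vec, label_vecs[label]))


# Made-up vectors standing in for label and review embeddings (assumption).
label_vecs = {"negative": [0.2, 0.9], "positive": [0.9, 0.2]}
reviews = {"review_a": [0.8, 0.3], "review_b": [0.1, 0.7]}

for name, vec in reviews.items():
    print(name, "->", nearest_label(vec, label_vecs))
```

Written this way, the same function would also work with more than two labels, although the code in this article only uses two.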

After calculating the results, I use the scikit-learn machine learning library to compare the predicted values with the actual user star rating data and then output the comparison results.


from sklearn.metrics import PrecisionRecallDisplay

def evaluate_embeddings_approach(
    labels=['negative', 'positive'],
    model=EMBEDDING_MODEL,
):
    # Reuses get_embedding, cosine_similarity and EMBEDDING_MODEL defined earlier.
    label_embeddings = [get_embedding(label, model=model) for label in labels]

    def label_score(review_embedding, label_embeddings):
        # Similarity to the positive label minus similarity to the negative label.
        return cosine_similarity(review_embedding, label_embeddings[1]) - cosine_similarity(review_embedding, label_embeddings[0])

    # A score above 0 means the review is closer to the positive label.
    probas = df["embedding"].apply(lambda x: label_score(x, label_embeddings))
    preds = probas.apply(lambda x: 'positive' if x > 0 else 'negative')

    # Compare the predictions with the sentiment derived from the star ratings.
    report = classification_report(df.sentiment, preds)
    print(report)

    display = PrecisionRecallDisplay.from_predictions(df.sentiment, probas, pos_label='positive')
    _ = display.ax_.set_title("2-class Precision-Recall curve")

evaluate_embeddings_approach(labels=['An Amazon review with a negative sentiment.', 'An Amazon review with a positive sentiment.'])

Result:


              precision    recall  f1-score   support
    negative       0.98      0.73      0.84       136
    positive       0.96      1.00      0.98       789
    accuracy                           0.96       925
   macro avg       0.97      0.86      0.91       925
weighted avg       0.96      0.96      0.96       925

From the results, we can see that this simple method reaches a precision of 0.98 on negative reviews and 0.96 on positive reviews, both above 95%.

In terms of recall (the recall column in the report above), the performance on negative reviews is noticeably worse, only 73%, which means a fair number of negative reviews are still judged as positive. The recall for positive reviews, however, rounds to 100%, meaning the method found nearly all of the positive reviews. The overall accuracy is 96%, which is high for a method that needs no model training. To achieve it, we only need to call the OpenAI Embedding API and add a few lines of code to calculate the similarity between vectors.
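If precision and recall are new to you, the following sketch computes both by hand. The ten labels are invented purely for illustration and are not taken from the Amazon dataset; they just mimic the pattern in the report, where a negative review slips through as positive.

```python
def precision_recall(y_true, y_pred, label):
    """Precision and recall for one class, computed from paired label lists."""
    true_positives = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
    predicted = sum(1 for p in y_pred if p == label)  # how often we predicted this class
    actual = sum(1 for t in y_true if t == label)     # how often the class really occurs
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / actual if actual else 0.0
    return precision, recall


# Invented example: 4 truly negative and 6 truly positive reviews (assumption).
y_true = ["negative"] * 4 + ["positive"] * 6
# One negative review is wrongly predicted as positive.
y_pred = ["negative", "negative", "negative", "positive"] + ["positive"] * 6

for label in ("negative", "positive"):
    precision, recall = precision_recall(y_true, y_pred, label)
    print(f"{label}: precision={precision:.2f} recall={recall:.2f}")
```

A single missed negative review lowers the recall for the negative class and, at the same time, the precision for the positive class, which is the same trade-off visible in the report.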

Colab Example

https://colab.research.google.com/drive/1joYlOiKG3hHEVzFl18i1oSvgzvgNdHHz?usp=sharing

Kaggle Example

https://www.kaggle.com/fenixping/ai-aesthetics-chap02

GitHub Example

ai_aesthetics/chap02/chap02.ipynb at main · yipingw/ai_aesthetics (github.com)

If you want to learn more, follow my book “ChatGPT’s Guide to AI Mastery.”