
Thursday, August 3, 2023

Using ChatGPT for real to interpret text

What's real and what isn't with ChatGPT?

There's a huge amount of hype surrounding ChatGPT and I've heard all kinds of "game changing" stories around it. But what's real and what's not?

In this blog post, I'm going to show you one of the real things ChatGPT can do: extract meaning from text. I'll show you how well it performs, discuss some of its shortcomings, and highlight important considerations for using it in business. I'm going to do it with real code and real data.

We're going to use ChatGPT to extract meaning from news articles, specifically, two articles on the Women's World Cup.

D J Shin, CC BY-SA 3.0, via Wikimedia Commons. I for one, welcome our new robot overlords...

The Women's World Cup

At the time of writing, the Women's World Cup is in full swing and England have just beaten China 6-1. There were plenty of news stories about it, so I took just two and tried to extract structured, factual data from the articles.

The two articles were a BBC match report and a Guardian match report on the England vs China game.

Here is the data I wanted to pull out of the text:
  • The sport being played
  • The competition
  • The names of the teams
  • Who won
  • The score
  • The attendance
I wanted it in a structured format, in this case, JSON.

Obviously, you could read the articles and just extract the information, but the value of ChatGPT is doing this at scale, to scan thousands or millions of articles to search for key pieces of information. Up until now, this has been done by paying people in the developing world to read articles and extract data. ChatGPT offers the prospect of slashing the cost of this kind of work and making it widely available.

Let's see it in action.

Getting started

This example is all in Python and I'm assuming you have a good grasp of the language.

Download the OpenAI library:

pip install openai

Register for OpenAI and get an API key. At the time of writing, you get $5 in free credits and this tutorial won't consume much of that $5.

You'll need to set your API key in your code. To get going, we'll just paste it into our Python file:

import openai
openai.api_key = "YOUR_KEY"

Note that OpenAI will rescind any keys it finds on the public internet. Pasting the key straight into your code, as I've done here, is sloppy from a security point of view; only do it to get started.
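
A safer pattern is to keep the key out of your source code entirely, for example by reading it from an environment variable. Here's a minimal sketch; it assumes you've set OPENAI_API_KEY in your shell before running the script:

import os
import openai

# Read the key from the environment rather than hard-coding it,
# e.g. after running: export OPENAI_API_KEY="sk-..."
openai.api_key = os.environ["OPENAI_API_KEY"]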

Some ChatGPT basics

We're going to focus on just one part of ChatGPT, the ChatCompletion API. Because there's some complexity here, I'm going to go through some of the background before diving into the code.

ChatGPT has a concept of "temperature" that controls how much randomness goes into its answers: the lower the number, the more deterministic and repeatable the output. That determinism comes at the price of creativity, so for some applications you might want a higher temperature (a chatbot, for example). The API accepts temperatures from 0 to 2, and we'll use 0 for this example because we want repeatable, reliable analysis.

There are several ChatGPT models, each with its own pricing structure. As you might expect, the larger and more recent models are more expensive, so for this tutorial I'm going to use an older and cheaper model, "gpt-3.5-turbo", which works well enough to show what ChatGPT can do.

ChatGPT works on a model of "roles" and "messages". Roles are the actors in a chat: for a chatbot there's a "user" role (the human typing), an "assistant" role (the model's replies), and a "system" role that briefs the assistant on how to behave. Messages are the pieces of text those roles exchange. A chatbot needs a whole conversation of messages, but to extract meaning from a piece of text we need just one message, and the only role we need for the World Cup articles is "user".
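
To make the difference concrete, here's a sketch of the two message shapes. The chatbot conversation below is invented purely for illustration:

# A chatbot needs the system role plus a back-and-forth of messages.
chatbot_messages = [
    {"role": "system", "content": "You are a helpful football pundit."},
    {"role": "user", "content": "Who did England play in the group stage?"},
    {"role": "assistant", "content": "England played Haiti, Denmark and China."},
    {"role": "user", "content": "How did the China game finish?"},
]

# Extracting meaning from text needs just a single user message that
# bundles the text to analyze together with our instructions.
extraction_messages = [
    {"role": "user", "content": "...the article text plus our questions..."},
]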

To get an answer, we need to pose a question or give ChatGPT an instruction on what to do. That's part of the "content" we set in the messages parameter. The content must contain the text we want to analyze and instructions on what we want returned. This is a bigger topic and I'm going to dive into it next.

Prompt engineering part 1

Setting the prompt correctly is the core of using ChatGPT, and it's a bit of an art, which is why it's been called prompt engineering. You have to write your prompt very carefully to get the results you expect.

Oddly, ChatGPT doesn't separate the text from the query; they're all bundled together in the same prompt. This means you have to clearly tell ChatGPT what you want to analyze and how you want it analyzed.

Let's start with a simple example: imagine you want to know how many times the letter "e" occurs in the text "The kind old elephant". Here's how you might write the prompt:

f"""In the following text, how often does the letter e occur:

"The kind old elephant"

"""

This gives us the correct answer (3). We'll come back to this prompt later because it shows some of the pitfalls of working with ChatGPT. In general, we need to be crystal clear about the text we want the system to analyze.

Let's say we wanted the result in JSON; here's how we might write the prompt:

f"""

In the following text, how often does the letter e occur, write your answer as JSON:

"The kind old elephant"

"""

Which gives us {"e": 3}

We can ask more complex questions about some text, but we need to lay out the query very carefully and distinguish between the text and the questions. Here's an example.

prompt = f"""

In the text indicated by three back ticks answer the \

following questions and output your answer as JSON \

using the key names indicated by the word "key_name" \

1) how often does the letter e occur key_name = "letter" \

2) what animal is referred to key_name = "animal" \

```The kind old elephant```

"""

Using ChatGPT

Let's put what we've learned together and build a ChatGPT query to ask questions about the Women's World Cup. Here's the code using the BBC article.

world = """

Lauren James produced a sensational individual 

performance as England entertained to sweep aside 

China and book their place in the last 16 of the 

Women's World Cup as group winners.


It was a display worthy of their status as European 

champions and James once again lit the stage alight 

in Adelaide with two sensational goals and three assists.


The 13,497 in attendance were treated to a masterclass 

from Chelsea's James, who announced her arrival at the 

World Cup with the match-winner against Denmark on Friday.


She helped England get off to the perfect start when 

she teed up Alessia Russo for the opener, and 

later slipped the ball through to Lauren Hemp to 

coolly place it into the bottom corner.


It was largely one-way traffic as England dominated 

and overwhelmed, James striking it first time into 

the corner from the edge of the box to make it 3-0 

before another stunning finish was ruled out by video 

assistant referee (VAR) for offside in the build-up.

China knew they were heading out of the tournament 

unless they responded, so they came out with more 

aggression in the second half, unnerving England 

slightly when Shuang Wang scored from the penalty 

spot after VAR picked up a handball by defender 

Lucy Bronze.


But James was not done yet - she volleyed Jess Carter's 

deep cross past helpless goalkeeper Yu Zhu for 

England's fourth before substitute Chloe Kelly and 

striker Rachel Daly joined the party.


England, who had quietly gone about their business 

in the group stages, will have raised eyebrows with 

this performance before their last-16 match against 

Nigeria on Monday, which will be shown live on 

BBC One at 08:30 BST.


China are out of the competition after Denmark beat 

Haiti to finish in second place in Group D.


England prove worth without Walsh


Manager Sarina Wiegman kept everyone guessing when 

she named her starting XI, with England fans 

anxiously waiting to see how they would set up 

without injured midfielder Keira Walsh.

Wiegman's response was to unleash England's attacking 

talent on a China side who struggled to match them 

in physicality, intensity and sharpness.


James oozed magic and unpredictability, Hemp used her 

pace to test China's defence and captain Millie Bright 

was ferocious in her tackling, winning the ball back 

on countless occasions.


After nudging past Haiti and Denmark with fairly 

underwhelming 1-0 wins, England were keen to impose 

themselves from the start. Although China had chances 

in the second half, they were always second best.


Goalkeeper Mary Earps will be disappointed not to keep 

a clean sheet, but she made two smart saves to deny 

Chen Qiaozhu.


While England are yet to meet a side ranked inside 

the world's top 10 at the tournament, this will help 

quieten doubts that they might struggle without the 

instrumental Walsh.


"We're really growing into the tournament now," said 

captain Bright. "We got a lot of criticism in the first 

two games but we were not concerned at all.


"It's unbelievable to be in the same team as 

[the youngsters]. It feels ridiculous and I'm quite 

proud. Players feeling like they can express themselves 

on the pitch is what we want."


James given standing ovation


The name on everyone's lips following England's win 

over Denmark was 'Lauren James', and those leaving 

Adelaide on Tuesday evening will struggle to forget 

her performance against China any time soon.


She punished China for the space they allowed her on 

the edge of the box in the first half and could have 

had a hat-trick were it not for the intervention of VAR.

Greeted on the touchline by a grinning Wiegman, 

James was substituted with time to spare in the second 

half and went off to a standing ovation from large 

sections of the stadium.


"She's special - a very special player for us and 

for women's football in general," said Kelly. "She's 

a special talent and the future is bright."


She became only the third player on record (since 2011) 

to be directly involved in five goals in a Women's 

World Cup game.


With competition for attacking places in England's 

starting XI extremely high, James has proven she is 

far too good to leave out of the side and is quickly 

becoming a star at this tournament at the age of 21.

"""

prompt = f"""

In the text indicated by three back ticks answer the \

following questions and output your answer as JSON \

using the key names indicated by the word key_name" \

1) What sport was being played? key_name="sport" \

2) What competition was it? key_name="competition" \

3) What teams were playing? key_name = "teams" \

4) Which team won? key_name = "winner" \

5) What was the final score? key_name = "score" \

6) How many people attended the match? key_name = "attendance" \

```{world}```

"""

model = "gpt-3.5-turbo"
messages = [{"role": "user", "content": prompt}]

response = (openai
            .ChatCompletion
            .create(model=model,
                    messages=messages,
                    temperature=0)
           )

print(response.choices[0].message["content"])


Here are the results this code produces:

{
  "sport": "Football",
  "competition": "Women's World Cup",
  "teams": "England and China",
  "winner": "England",
  "score": "England 5 - China 1",
  "attendance": 13497
}

This is mostly right, but not quite. The score was actually 6-1. Even worse, the results are very sensitive to the text layout; changing line breaks changes the score.

I ran the same query, but with the Guardian article instead and here's what I got:

{
  "sport": "football",
  "competition": "World Cup",
  "teams": "England and China",
  "winner": "England",
  "score": "6-1",
  "attendance": null
}

With a better prompt, it might be possible to improve consistency and clean up the formatting of the answers. Analyzing multiple articles on the same event may increase accuracy still further, as the sketch below illustrates.
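
Here's a rough sketch of that idea: run the same extraction prompt over several reports of the same match, parse each JSON reply, and take the most common answer for each field. The helper name is mine, and it assumes the model returns valid JSON (in practice you'd wrap json.loads in a try/except):

import json
from collections import Counter

def extract_facts(article):
    """Run the extraction prompt on one article and parse the JSON reply."""
    prompt = f"""
    In the text indicated by three back ticks answer the \
    following questions and output your answer as JSON \
    using the key names indicated by the word "key_name" \
    1) Which team won? key_name = "winner" \
    2) What was the final score? key_name = "score" \
    ```{article}```
    """
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
    )
    return json.loads(response.choices[0].message["content"])

# Several reports on the same match, e.g. the BBC and Guardian articles.
articles = [world]  # add more article texts here
answers = [extract_facts(a) for a in articles]

# Take the most common score across all the articles.
scores = Counter(a["score"] for a in answers)
print(scores.most_common(1))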

Hallucinations

Sometimes ChatGPT supplies wildly wrong answers. We've already seen a mild case in its analysis of the World Cup game: it reported a score of 5-1 when it should have been 6-1. But ChatGPT can get it wrong in much worse ways.

I ran the queries above with text from the BBC and The Guardian. What happens if I run the same query with no text at all to analyze? Here's what I got:

{
  "sport": "football",
  "competition": "World Cup",
  "teams": ["France", "Croatia"],
  "winner": "France",
  "score": "4-2",
  "attendance": "80,000"
}

The answer is completely made up, hence the term "hallucination".
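
There are some defenses, though none are watertight. Here's a minimal sketch along those lines (the function name and prompt wording are my own): refuse to call the API when there's no text, and tell the model to answer null when the information isn't in the text, which should reduce, though not eliminate, made-up answers.

def safe_extract(article):
    """Refuse to call the API with empty text and ask the model to admit ignorance."""
    if not article or not article.strip():
        raise ValueError("No article text supplied - nothing to analyze")

    prompt = f"""
    Answer the following questions using only the text indicated by \
    three back ticks. If the answer is not present in the text, use \
    the JSON value null. Output your answer as JSON. \
    1) What was the final score? key_name = "score" \
    2) How many people attended the match? key_name = "attendance" \
    ```{article}```
    """
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
    )
    return response.choices[0].message["content"]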

Prompt engineering part 2

Let's go back to my elephant example from earlier and write it this way:

prompt = f"""

In the following text, "The kind old elephant", 

how often does the letter e occur

"""


model="gpt-3.5-turbo"

messages = [{"role": "user", "content": prompt}]


response = (openai

            .ChatCompletion

            .create(model=model,

                    messages=messages,

                    temperature=0)

           )

print(response.choices[0].message["content"])

Here's what the code returns:

In the phrase "The kind old elephant," the letter "e" occurs 4 times.

Which is clearly wrong.

In this case, the problem is the placement of the text to be analyzed. Moving the text to the end of the prompt and being more explicit about what should be returned helps. Even simply adding the phrase "Give your answer as JSON" to the prompt fixes the issue.
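
Here's the elephant prompt rewritten along those lines, with the text moved back to the end and an explicit instruction about the output format (exact results may still vary between model versions):

prompt = f"""
How often does the letter e occur in the following text?
Give your answer as JSON.

"The kind old elephant"
"""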

This is why the precise form of the prompt you use is critical and why it may take several iterations to get it right.

What does all this mean?

The promise of ChatGPT

It is possible to analyze text and extract information from it. This is huge and transformative for business. Here are just a few of the things that are possible:

  • Press clippings automation.
  • Extraction of information from bills of lading.
  • Automated analysis of SEC filings.
  • Automated analysis of company formation documents.
  • Entity extraction.

We haven't even touched on some of the many other things ChatGPT can do, for example:

  • Language translation.
  • Summarization.
  • Report writing.

How to deliver on that promise

As I've shown in this blog post, the art is in prompt engineering. To get it right, you need to invest time refining your prompts and testing them against a wide range of inputs (see the sketch below). The good news is, this isn't rocket science.
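
A lightweight way to do that testing is a small regression harness: run your prompt over articles you've already checked by hand and flag any field that comes back wrong. This is just a sketch; run_extraction is a stand-in for whatever function wraps your prompt and API call, and the expected values are ones you've verified yourself:

import json

# Hand-checked articles paired with the answers we expect from them.
test_cases = [
    (world, {"winner": "England", "score": "6-1", "attendance": 13497}),
    # add more (article_text, expected_fields) pairs here
]

for article, expected in test_cases:
    result = json.loads(run_extraction(article))  # run_extraction wraps the prompt and API call
    for key, value in expected.items():
        if result.get(key) != value:
            print(f"Mismatch on {key!r}: expected {value!r}, got {result.get(key)!r}")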

The skills you need

The biggest change ChatGPT introduces is the skill level required. Previously, doing this kind of analysis meant a good grasp of the theory and the underlying libraries, and it took a lot of effort to build a text-analysis system; you practically needed a Ph.D. Not any more: the skill level has dropped precipitously. Now it's all about formulating a good prompt, and that's something a good analyst can do really well.

The bottom line

ChatGPT, and LLMs in general, are transformative. Any business that relies on information must know how to use them.

Tuesday, July 25, 2023

ChatGPT and code generation: be careful

I've heard bold pronouncements that Large Language Models (LLMs), and ChatGPT in particular, will greatly speed up software development with all kinds of consequences. Most of these pronouncements seem to come from 'armchair generals' who are a long way from writing code. I'm going to chime in with my real-world experiences and give you a more realistic view.

D J Shin, CC BY-SA 3.0, via Wikimedia Commons

I've used ChatGPT to generate Python code to solve some small-scale problems. These are things like using an API or doing some simple statistical analysis or chart plotting. Recently, I've branched out to more complex problems, which is where its limitations become more obvious.

In my experience, ChatGPT is excellent at generating code for small problems. It might not solve the problem completely, but it will automate most of the boring pieces and give you a good platform to get going. The code it generates is good, with some exceptions: it doesn't generate docstrings for functions, it's light on comments, and it doesn't always follow PEP 8 layout. But it does lay out its code clearly and it uses functions well. The supporting documentation it creates is great; in fact, it's much better than the documentation most humans produce.

For larger problems, it falls down, sometimes badly. I gave it a brief to create code to demonstrate the Central Limit Theorem (CLT) using Bokeh charts with several underlying distributions. Part of the brief it did well, and it clearly understood how to demonstrate the CLT, but there were problems I had to fix. It generated code for an out-of-date version of Bokeh, which required some digging and coding to fix; this could have been avoided had it simply commented on which library versions its code targeted. It also chose some wrong variable names (the reverse of what I would have chosen). More importantly, it did some weird and wrong things with the data at the end of the process; I spotted the mistake in a few minutes but spent 30 minutes rewriting code to correct it. I had similar problems with other long briefs I gave ChatGPT.

Obviously, the problems I encountered could have been due to incomplete or ambiguous briefs. A solution might have been to spend time refining my brief until it gave me the code I wanted, but that may have taken some time. Which would have been faster, writing new detailed briefs or fixing code that was only a bit wrong? 

More worryingly, I spotted what was wrong because I knew the output I expected. What if this had been a new problem where I didn't know what the result should look like?

After playing around with ChatGPT for a while, here are my takeaways:

  • ChatGPT code generation is about the level of a good junior programmer.
  • You should use it as a productivity boost to automate the boring bits of coding, a jump start.
  • Never trust the code and always check what it's doing. Don't use it when you don't know what the result should look like.

Obviously, this is ChatGPT today and the technology isn't standing still. I would expect future versions to improve on commenting and the like. What will be harder to improve is the brief. The problem here isn't the LLM; it's the person writing the brief. English is an imperfect language for detailed specifications, which means we're stuck with ambiguities. I might write what I think is the perfect brief, only to find out I've been imprecise or ambiguous. Technology change is unlikely to fix this problem in the short term.

Of course, other industries have gone through similar disruptive changes in the past. The advent of CAD/CAM didn't mean the end of factory work; it raised productivity at the expense of requiring a higher skill set. The workers with the higher skill set gained, and those with a lesser skill set lost out.

In my view, here's how things are likely to evolve. LLMs will become standard tools to aid data scientists and software developers. They'll be productivity boosters that require a high skill set to use. The people most negatively impacted will be junior staff and the less skilled; the people who gain the most will be those with experience and a high skill level.