Insights into GPT-3
July 30, 2020
GPT-3 is best thought of as an autocompleter. It replies with what it thinks the internet would reply with.
Nick Cammarata offers a way to think about GPT-3.
I think improve will work. Keep in mind gpt3 is an autocompleter. It's not trying to write a great essay, just the essay it thinks the internet would write. When you ask it to improve it, now it's trying to write a great essay.
— Nick Cammarata (@nicklovescode) July 19, 2020
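Concretely, the difference is just in the prompt. Here is a minimal sketch of the "ask it to improve" move, assuming the 2020-era OpenAI Python library and the base davinci engine (both assumptions on my part, not Nick's exact setup):

```python
# Sketch of the "ask it to improve" prompt pattern, assuming the
# 2020-era completions API. Engine name and prompt wording are
# illustrative assumptions.
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder

draft = "Essay: Why cities should plant more trees. Trees are good..."

# Plain autocompletion continues "the essay the internet would write".
# Prepending an instruction to improve shifts what it is completing.
prompt = draft + "\n\nHere is an improved version of the essay above:\n"

response = openai.Completion.create(
    engine="davinci",      # base GPT-3 model available in July 2020
    prompt=prompt,
    max_tokens=256,
    temperature=0.7,
)
print(response["choices"][0]["text"])
```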
GPT-3 does well answering simple, factual questions, but doesn't notice nonsensical questions
Kevin Lacker found that GPT-3 does well at answering simple questions with a factual answer, e.g.:
"Q: Who was the president of the United States in 1955?"
while being fooled by absurd questions, answering them without noticing that they are nonsensical, e.g.:
Q: How many eyes does the sun have?
A: The sun has one eye.
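This kind of probe is easy to reproduce. A minimal sketch, assuming the 2020-era completions API and a plain Q:/A: prompt format (the engine name is an assumption):

```python
# Sketch: probe GPT-3 with one factual and one nonsensical question
# using the bare Q&A format. 2020-era API; engine name is assumed.
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder

for question in [
    "Who was the president of the United States in 1955?",  # factual
    "How many eyes does the sun have?",                     # nonsensical
]:
    prompt = f"Q: {question}\nA:"
    response = openai.Completion.create(
        engine="davinci",
        prompt=prompt,
        max_tokens=32,
        temperature=0,   # keep it near-deterministic for probing
        stop=["\n"],     # stop at the end of the answer line
    )
    print(question, "->", response["choices"][0]["text"].strip())
```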
GPT-3 does well at expanding on and explaining topics
I didn't quite understand everything in your tweet, so I passed it through the @OpenAI #GPT3 API. It took multiple runs and some tweaks, and I'm still not sure I can trust it entirely, but here's what it came up with. I think I understand your point now. Or have I been misled? pic.twitter.com/X0noJX7D0v
— Jesse Szepieniec (@jessems) July 14, 2020
GPT-3 can be primed to learn logic encoded within a character sequence
This seems to conflict somewhat with Gwern's finding that GPT-3 cannot do parity checking. He offers an explanation, which I still need to spend more time with to understand.
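To make the claim concrete, here is an illustrative sketch of priming for a rule encoded in a character sequence, using parity as the rule. The "bits -> parity" format and example count are my own invention for illustration, not Gwern's exact setup:

```python
# Sketch: build a few-shot prompt that encodes the parity rule in a
# character sequence ("bits -> parity bit"). Format is illustrative.
import random

random.seed(0)

def parity(bits: str) -> str:
    # Parity bit: 1 if the count of '1's is odd, else 0.
    return str(bits.count("1") % 2)

examples = []
for _ in range(8):
    bits = "".join(random.choice("01") for _ in range(5))
    examples.append(f"{bits} -> {parity(bits)}")

query = "10110"
prompt = "\n".join(examples) + f"\n{query} ->"
print(prompt)
# Feeding this prompt to GPT-3 tests whether it has inferred the
# parity rule from the examples; per Gwern's results, it mostly fails.
```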
GPT-3 seems to be a highly effective therapist
@nicklovescode finds that GPT-3 can act as a remarkably effective therapist.
This is the prompt he used:
This is a conversation between Nick and a brilliant, warm therapist named John.
Here's the result (shared as a screenshot in the original thread):
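A dialogue like this can be driven by growing the prompt one turn at a time. A minimal sketch, assuming the 2020-era completions API (the engine name and stop sequence are my choices, not Nick's):

```python
# Sketch: drive a therapist-style dialogue by appending each turn to
# the prompt. 2020-era completions API; engine name is assumed.
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder

prompt = ("This is a conversation between Nick and a brilliant, "
          "warm therapist named John.\n\n")

while True:
    user_line = input("Nick: ")
    prompt += f"Nick: {user_line}\nJohn:"
    response = openai.Completion.create(
        engine="davinci",
        prompt=prompt,
        max_tokens=128,
        temperature=0.8,
        stop=["Nick:"],  # stop before the model writes Nick's next turn
    )
    reply = response["choices"][0]["text"].strip()
    print("John:", reply)
    prompt += f" {reply}\n"
```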
GPT-3 seems well suited to generate alternative phrasings
This could be useful for A/B testing, for instance, which Visual Website Optimizer has already put into production.
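A sketch of how one might sample alternative phrasings, assuming the 2020-era completions API; the prompt wording and parameters are illustrative:

```python
# Sketch: sample several alternative phrasings of a headline for an
# A/B test. 2020-era completions API; engine name is assumed.
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder

headline = "Sign up today and save 20%"
prompt = (f"Original headline: {headline}\n"
          "Alternative phrasing:")

response = openai.Completion.create(
    engine="davinci",
    prompt=prompt,
    max_tokens=24,
    temperature=0.9,  # higher temperature for more varied rewrites
    n=5,              # request five independent completions
    stop=["\n"],
)
for choice in response["choices"]:
    print(choice["text"].strip())
```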
Adding example weight questions can rectify otherwise incorrect answers
@nabeelqu finds that GPT-3 fails on a seemingly simple factual question:
"Q: Which is heavier, a toaster or a pencil?""A: A pencil is heavier than a toaster."
Sampling can prove the presence of knowledge but not the absence
Gwern shows that by priming GPT-3 with some example weight questions and answers, it correctly and consistently answers the same question (the ten answers after the final question below are separately sampled completions):
[Q&A mode; temp=1; BO=20]

Q: What is human life expectancy in the United States?
A: Human life expectancy in the United States is 78 years.

Q: Who was president of the United States in 1955?
A: Dwight D. Eisenhower was president of the United States in 1955.

Q: What party did he belong to?
A: He belonged to the Republican Party.

Q: Who was president of the United States before George W. Bush?
A: Bill Clinton was president of the United States before George W. Bush.

Q: Who won the World Series in 1995?
A: The Atlanta Braves won the World Series in 1995.

Q: Which is heavier, a house or a mouse?
A: A house.

Q: Which is heavier, a tea kettle or a cat?
A: A cat.

Q: Which is heavier, the ocean or a piece of dust?
A: The ocean.

Q: Which is heavier, a toaster or a pencil?
A: A toaster.
A: A toaster is heavier than a pencil.
A: A toaster is heavier than a pencil.
A: A toaster is heavier than a pencil.
A: A toaster is heavier than a pencil.
A: A toaster.
A: A toaster.
A: A toaster.
A: A toaster.
A: A toaster.
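Reproducing this kind of primed run takes only a few lines against the 2020-era completions API. The following is a minimal sketch, assuming the base davinci engine and mapping the transcript's temp=1 and BO=20 annotations onto the temperature and best_of parameters:

```python
# Sketch: few-shot "weight" priming plus repeated sampling, assuming
# the 2020-era completions API. Engine name is an assumption.
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder

priming = """Q: Which is heavier, a house or a mouse?
A: A house.
Q: Which is heavier, a tea kettle or a cat?
A: A cat.
Q: Which is heavier, the ocean or a piece of dust?
A: The ocean.
Q: Which is heavier, a toaster or a pencil?
A:"""

# Ten independent runs, to mirror the ten sampled answers above.
for _ in range(10):
    response = openai.Completion.create(
        engine="davinci",
        prompt=priming,
        max_tokens=16,
        temperature=1.0,  # matches temp=1 in the transcript
        best_of=20,       # matches BO=20: sample 20, return the likeliest
        n=1,
        stop=["\n"],      # stop at the end of the answer line
    )
    print("A:" + response["choices"][0]["text"])
```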