An AI Wizard of Words


A look at using OpenAI's Generative Pretrained Transformer 2 (GPT-2) to generate text.

It's probably fair to say that there's more than one person out there who is worried about some version of artificial intelligence, or AI, possibly in a robot body of some kind, taking people's jobs. Anything that is repetitive or easily described is considered fair game for a robot, so driving a car or working in a factory is fair game.

Until recently, we could tell ourselves that people like yours truly—the writers and those who create things using some form of creativity—were more or less immune to the march of the machines. Then came GPT-2, which stands for Generative Pretrained Transformer 2. I think you'll agree, that isn't the sexiest name imaginable for a civilization-ending text bot. And since it's version 2, I imagine that like Star Trek's M-5 computer, perhaps GPT-1 wasn't entirely successful. That would be the original series episode titled, "The Ultimate Computer", if you want to check it out.

So what does the name "GPT-2" stand for? Well, "generative" means pretty much what it sounds like. The program generates text based on a predictive model, much like your phone suggests the next word as you type. The "pretrained" part is also quite obvious in that the model released by OpenAI has been built and fine-tuned for a specific purpose. The last word, "Transformer", refers to the "transformer architecture", which is a neural network design architecture suited for understanding language. If you want to dig deeper into that last one, I've included a link from a Google AI blog that compares it to other machine learning architecture (see Resources).

On February 14, 2019, Valentine's Day, OpenAI released GPT-2 with a warning:

Our model, called GPT-2 (a successor to GPT), was trained simply to predict the next word in 40GB of Internet text. Due to our concerns about malicious applications of the technology, we are not releasing the trained model. As an experiment in responsible disclosure, we are instead releasing a much smaller model for researchers to experiment with, as well as a technical paper.

I've included a link to the blog in the Resources section at the end of this article. It's worth reading partly because it demonstrates a sample of what this software is capable of using the full model (see Figure 1 for a sample). We already have a problem with human-generated fake news; imagine a tireless machine capable of churning out vast quantities of news and posting it all over the internet, and you start to get a feel for the dangers. For that reason, OpenAI released a much smaller model to demonstrate its capabilities and to engage researchers and developers.