Tech Explainer: How does generative AI generate?

Generative AI systems such as ChatGPT are grabbing the headlines. Find out how this super-smart technology actually works. 

  • April 21, 2023 | Author: KJ Jacoby

Generative AI refers to a type of artificial intelligence that can create or generate new content, such as images, music, and text, based on patterns learned from large amounts of data. Generative AI models are designed to learn the underlying distribution of a dataset and then use this knowledge to generate new samples that are similar to those in the original dataset.

This emerging tech is well on its way to becoming a constant presence in everyday life. In fact, the preceding paragraph was generated by ChatGPT. Did you notice?

The rise of newly minted household names like ChatGPT may be headline-grabbing news today. But such tools should soon be so commonplace that they’ll hardly garner a sidebar in Wired magazine.

So, if the AI bots are here to stay, what makes them tick?

Generating intelligence

Generative AI—the AI stands for artificial intelligence, but you knew that already—lets a user generate content quickly by providing various types of inputs. These inputs can include text, sounds, images, animations and 3D models. Those are also the possible forms of outputs.

Researchers have been working on generative AI since the mid-1960s. That’s when Joseph Weizenbaum created the Eliza chatbot. A bot is a software application that runs automated tasks, usually in a way that simulates human activity.

Eliza, often considered one of the world’s first generative AI programs, was programmed to respond to human statements almost like a therapist. However, the program did not actually understand what was being said. It matched keywords in the user’s statement and filled in scripted replies.
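
To see how a program can sound attentive without understanding anything, here is a minimal Python sketch of the keyword-matching idea. The rules, the fallback line and the `respond` function are invented for illustration; they are not taken from Weizenbaum's original script.

```python
import re

# Hypothetical rules for illustration only (not Weizenbaum's original script).
# Each rule pairs a pattern with a reply template that reuses the matched words.
RULES = [
    (re.compile(r"\bi feel (.+)", re.IGNORECASE), "Why do you feel {0}?"),
    (re.compile(r"\bi am (.+)", re.IGNORECASE), "How long have you been {0}?"),
    (re.compile(r"\bmy (mother|father|family)\b", re.IGNORECASE),
     "Tell me more about your {0}."),
]
FALLBACK = "Please go on."


def respond(statement: str) -> str:
    """Return a scripted reply by matching the statement against the rules."""
    cleaned = statement.strip().rstrip(".!?")
    for pattern, template in RULES:
        match = pattern.search(cleaned)
        if match:
            # Echo part of the user's own words back inside a canned template.
            return template.format(match.group(1))
    # No rule matched: fall back to a generic prompt.
    return FALLBACK


print(respond("I feel tired."))
print(respond("The weather is nice"))
```

The program never models what "tired" means. It only rearranges the user's words, which is roughly why a bot of this kind can seem to listen while understanding nothing.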

Since then, we’ve come a long way. Today’s generative AI is built on large language models (LLMs) that bear only a glimmer of resemblance to the relatively simple early chatbots. These LLMs contain billions, even trillions, of parameters: numeric values adjusted during training that together encode the patterns the model has learned from its data.

AI image generators like the popular DALL-E or Fotor can produce images based on small amounts of text. Type “red tuba on a rowboat on Lake Michigan,” and voilà, an image appears in seconds.

Beneath the surface

The human interface of an AI bot such as ChatGPT may be simple, but the technical underpinnings are complex. The process of parsing, learning from and responding to our input is so resource-intensive that it requires powerful computers, often churning through enormous amounts of data around the clock.

These computers use graphics processing units (GPUs) to power neural networks tasked with identifying patterns and structures within existing data and using them to generate original content.

GPUs are particularly good at this task because they can contain thousands of cores. Each individual core can complete only one task at a time. But the core can work simultaneously with all the other cores in the GPU to collectively process huge data sets.
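
The divide-and-conquer idea can be sketched in a few lines of Python. This is only an analogy, run on CPU threads rather than GPU cores (Python threads do not truly execute in parallel), and the function names `scale_chunk` and `parallel_scale` are made up for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def scale_chunk(chunk, factor):
    """The one simple task each worker performs: scale every value in its slice."""
    return [value * factor for value in chunk]


def parallel_scale(data, factor, workers=4):
    """Split the data into slices, hand one slice to each worker, then reassemble."""
    size = max(1, -(-len(data) // workers))  # ceiling division, at least 1
    chunks = [data[i:i + size] for i in range(0, len(data), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Every worker runs the same operation on a different slice of the data.
        results = list(pool.map(scale_chunk, chunks, [factor] * len(chunks)))
    # pool.map preserves input order, so the slices can simply be concatenated.
    return [value for chunk in results for value in chunk]


print(parallel_scale([1, 2, 3, 4, 5], 2))
```

A GPU applies the same pattern at a far larger scale, with thousands of cores each running the same small operation on its own piece of the data.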

How generative AI generates...stuff

Today’s data scientists rely on multiple generative AI models. These models can be either deployed separately or combined to create new models that are greater, and more powerful, than the sum of their parts.

Here are three of the most common types of generative AI model in use today:

  • Diffusion models use a two-step process: forward diffusion and reverse diffusion. Forward diffusion adds noise to training data; reverse diffusion removes that noise to reconstruct data samples. This learning process allows the AI to generate new data that, while similar to the original data, also includes unique variations.
    • For instance, to create a realistic image, a diffusion model can take in a random set of pixels and gradually refine them. It’s similar to the way a photograph shot on film develops in the darkroom, becoming clearer and more defined over time.
  • Variational autoencoders (VAEs) use two neural networks, the encoder and the decoder. The encoder compresses the input data into a compact representation, keeping only the information necessary to perform the decoding process. The decoder then reconstructs data from that representation. Training the two together teaches the AI how to represent data simply and efficiently, and how to generate novel output.
    • If you want to create, say, novel images of human faces, you could show the AI an original set of faces; then the VAE would learn their underlying patterns and structures. The VAE would then use that information to create new faces that look like they belong with the originals.
  • Generative adversarial networks (GANs) were the most commonly used model until diffusion models came along. A GAN plays two neural networks against each other. The first network, called the generator, creates data and tries to trick the second network, called the discriminator, into believing that data came from the real world. As this feedback loop continues, both networks learn from their experiences and get better at their jobs.
    • Over time, the generator can become so good at fooling the discriminator that it is finally able to create novel texts, audio, images, etc., that can also trick humans into believing they were created by another human.
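
The forward-diffusion step from the first bullet can be sketched in plain Python. This is a minimal illustration under stated assumptions, not how any particular product works: the linear noise schedule, the 1,000 steps and the function names are all choices made for the example. The reverse step, which in practice is a trained neural network, is left out.

```python
import math
import random


def linear_betas(steps, start=1e-4, end=0.02):
    """Amount of noise added at each step. The linear schedule is an assumption."""
    if steps == 1:
        return [start]
    return [start + (end - start) * t / (steps - 1) for t in range(steps)]


def signal_fractions(betas):
    """Running product of (1 - beta): how much of the original survives by step t."""
    fractions, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        fractions.append(running)
    return fractions


def forward_diffuse(sample, t, fractions, rng):
    """Jump straight to step t by blending the clean sample with Gaussian noise."""
    keep = math.sqrt(fractions[t])
    noise_scale = math.sqrt(1.0 - fractions[t])
    return [keep * x + noise_scale * rng.gauss(0.0, 1.0) for x in sample]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
fractions = signal_fractions(linear_betas(1000))
clean = [1.0, -1.0, 0.5, 0.0]  # stand-in for a few pixel values
slightly_noisy = forward_diffuse(clean, 10, fractions, rng)
mostly_noise = forward_diffuse(clean, 999, fractions, rng)
print(slightly_noisy)
print(mostly_noise)
```

Early steps keep nearly all of the original signal, while by the last step almost none of it is left. Training teaches a network to undo this corruption one step at a time, which is what lets it start from random pixels and refine them into a new image.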

Words, words, words

It’s also important to understand how generative AI forms word relationships. In the case of a chatbot built on a large language model, such as ChatGPT, the AI is based on a transformer. This is a neural network architecture that provides a larger context for each individual element of input and output, such as words, graphics and formulas.

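In its original form, the transformer does this by using an encoder to determine the semantics and position of, say, a word in a sentence. It then employs a decoder to derive the context of each word and generate the output. (Some models, including the GPT family behind ChatGPT, use only the decoder half.)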

This method allows generative AI to connect words, concepts and other types of input, even if the connections must be made between elements that are separated by large groups of unrelated data. In this way, the AI interprets and produces the familiar structure of human speech.
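
The mechanism that makes these long-range connections is usually called attention. Here is a stripped-down Python sketch of it. Real transformers learn separate weight matrices for queries, keys and values; this version skips them and uses the raw vectors directly, and the two-number "embeddings" are made-up values chosen for the example.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product: larger when two vectors point in similar directions."""
    return sum(x * y for x, y in zip(a, b))


def self_attention(vectors):
    """Scaled dot-product self-attention without learned weights (a simplification)."""
    scale = math.sqrt(len(vectors[0]))
    outputs, all_weights = [], []
    for query in vectors:
        # Score this element against every element, however far apart they sit.
        weights = softmax([dot(query, key) / scale for key in vectors])
        # The new representation is a weighted mix of all the elements.
        mixed = [sum(w * v[i] for w, v in zip(weights, vectors))
                 for i in range(len(query))]
        outputs.append(mixed)
        all_weights.append(weights)
    return outputs, all_weights


# Made-up embeddings: the first and last tokens are related,
# while the two in the middle are unrelated filler.
tokens = [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0], [0.9, 0.1]]
outputs, weights = self_attention(tokens)
print([round(w, 3) for w in weights[0]])
```

In the printed weights, the first token pays more attention to the last token than to the unrelated ones sitting between them. That is the sense in which a transformer can link elements separated by large groups of unrelated data.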

The future of generative AI

When discussing the future of these AI models and how they’ll impact our society, two words continually get mentioned: learning and disruption.

It’s important to remember that these AI systems improve by learning from data rather than by being explicitly programmed, and each new round of training can make them more capable. That’s where the term machine learning (ML) comes into play.

This type of learning has the potential to upend entire industries, catalyze wild economic fluctuations, and take on many jobs now done by humans.

On the bright side, AI may also become smart enough to help us cure cancer and reverse climate change. And if AI has to take our jobs, perhaps it can also figure out a way to provide income for all.

