OpenAI Video Creator (3/6): OpenAI’s Image Generation API

Have you ever dreamed of creating stunning images for your video projects with just a few keystrokes? Whether you’re a seasoned content creator or just getting started, OpenAI’s image generation models, DALL-E 2 and DALL-E 3, can help you bring your visions to life through the API. You can find detailed information about this feature in the Docs section of OpenAI’s platform, under the “Image Generation” option. Imagine generating high-quality images from simple text prompts, editing existing ones, and even producing variations, all within minutes. Let’s dive into how you can use this tool for your next project!

Why OpenAI’s Image Generation?

OpenAI’s image generation is a game-changer. It offers an intuitive way to create visuals by translating text prompts into compelling images. You can:

Generate images from text prompts: Simply describe what you need, and watch as the AI brings it to life.

Edit existing images: Make adjustments to current visuals to better fit your needs.

Produce variations: Create multiple versions of an image for A/B testing or creative diversity.

Getting Started: From Text Prompt to Stunning Visuals

Let’s walk through the process using a simple example: creating an illustration of two axolotls, one pink and one albino.

Set Up Your Environment:

from dotenv import load_dotenv
from openai import OpenAI
import os
import requests
load_dotenv()
client = OpenAI()

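Ensure you have access to OpenAI’s API and that your API key is available in the OPENAI_API_KEY environment variable, for example through a .env file that load_dotenv() reads. Install the third-party packages openai, python-dotenv, and requests; os is part of Python’s standard library and needs no installation.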
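Before creating the client, it can help to fail early with a readable message when the key is missing. The following is a minimal sketch under that assumption; `require_api_key` is a hypothetical helper name, not part of the OpenAI library, and it only checks that the variable is set, not that the key is valid.

```python
import os


def require_api_key(env=None, name="OPENAI_API_KEY"):
    """Return the API key from the environment, or raise a clear error.

    Hypothetical helper: `env` defaults to os.environ and can be replaced
    with a plain dict for testing.
    """
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(
            f"{name} is not set; add it to your .env file or shell environment."
        )
    return key
```

Calling require_api_key() right after load_dotenv() surfaces a missing key before the first API request is attempted.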

Generate the Image:

Use the provided code snippet to generate your image. Here’s a quick example:

image_type="illustration"

response = client.images.generate(
  model="dall-e-3",
  prompt=f'{image_type} of {"Cut to a shot of a pink and an albino axolotl."}',
  size="1024x1024",
  quality="standard",
  n=1,
)

This code is using the OpenAI API to generate an illustration of two axolotls (a pink one and an albino one) using the DALL-E 3 model. The image will be 1024×1024 pixels in size, of standard quality, and the API will return one image. Let’s break down this code snippet:

1. image_type="illustration"

This line defines a variable `image_type` and sets it to “illustration”. This will be used in the prompt to specify the style of image we want to generate.

2. response = client.images.generate(

This line starts a call to the OpenAI API’s image generation function. The result will be stored in the response variable.

3. model="dall-e-3"

This specifies that we want to use the DALL-E 3 model for image generation.

4. prompt=f'{image_type} of {"Cut to a shot of a pink and an albino axolotl."}',

This is the prompt for image generation. It uses an f-string to combine the `image_type` (“illustration”) with the specific description. The full prompt will be “illustration of Cut to a shot of a pink and an albino axolotl.”

5. size="1024x1024",

This sets the size of the generated image to 1024×1024 pixels.

6. quality="standard",

This specifies the quality of the image. “standard” is used here; DALL-E 3 also accepts “hd”, which produces finer detail at a higher cost per image.

7. n=1,

This parameter tells the API to generate 1 image.
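The parameters walked through above can be collected in one place and checked before any API call is made. This is a minimal sketch, not part of the OpenAI library: `build_generate_kwargs` is a hypothetical name, and the allowed sizes and quality values are assumptions taken from the DALL-E 3 documentation at the time of writing, so they may change.

```python
# Assumed values for dall-e-3 at the time of writing; check the current docs.
DALLE3_SIZES = {"1024x1024", "1792x1024", "1024x1792"}
DALLE3_QUALITIES = {"standard", "hd"}


def build_generate_kwargs(image_type, description,
                          size="1024x1024", quality="standard"):
    """Return keyword arguments for client.images.generate (hypothetical helper)."""
    if size not in DALLE3_SIZES:
        raise ValueError(f"Unsupported size for dall-e-3: {size}")
    if quality not in DALLE3_QUALITIES:
        raise ValueError(f"Unsupported quality for dall-e-3: {quality}")
    return {
        "model": "dall-e-3",
        # Same pattern as the article's f-string: "<image type> of <description>"
        "prompt": f"{image_type} of {description}",
        "size": size,
        "quality": quality,
        "n": 1,  # dall-e-3 generates one image per request
    }


kwargs = build_generate_kwargs("illustration", "a pink and an albino axolotl")
print(kwargs["prompt"])  # illustration of a pink and an albino axolotl
```

With a client set up as shown earlier, the result could be passed along as client.images.generate(**kwargs).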

Save the Image to a Local Directory:

image_url = response.data[0].url
local_dir = "./video_images"
filename = "vid_image_6.png"
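local_path = os.path.join(local_dir, filename)
# Create the target directory if it does not exist yet; open() would fail otherwise
os.makedirs(local_dir, exist_ok=True)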
image_data = requests.get(image_url).content
with open(local_path, "wb") as f:
    f.write(image_data)
print(f"Image saved to {local_path}")

Here’s an explanation of the code to save an image to a local directory:

1. image_url = response.data[0].url: This line assumes you have a response object with image URL data. It extracts the URL of the image you want to save.

2. local_dir = "./video_images": This specifies the local directory where you want to save the image. In this case, it’s a subdirectory called “video_images” in the current working directory.

3. filename = "vid_image_6.png": This sets the filename for the saved image. You can customize this as needed.

4. local_path = os.path.join(local_dir, filename): This combines the directory and filename to create the full path where the image will be saved.

5. image_data = requests.get(image_url).content: This sends a GET request to the image URL and retrieves the image data.

6. The with block opens a file at the specified local_path in binary write mode ("wb"). It then writes the image data to this file, effectively saving the image.

7. Finally, it prints a message confirming where the image was saved.

The objective of this code is to download an image from a URL and save it to a local directory on the user’s computer.
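The download-and-save steps above can also be wrapped in two small functions with a little more error handling. This is a sketch rather than the article’s exact method: `download_image` and `save_image_bytes` are hypothetical names, the 30-second timeout is an arbitrary assumption, and the directory is created on demand so the write does not fail on a fresh checkout.

```python
from pathlib import Path


def download_image(url, timeout=30):
    """Fetch the image bytes from a URL (hypothetical helper)."""
    import requests  # imported here so save_image_bytes works without it

    resp = requests.get(url, timeout=timeout)
    resp.raise_for_status()  # raise on 4xx/5xx instead of saving an error page
    return resp.content


def save_image_bytes(image_data, local_dir, filename):
    """Write image bytes to local_dir/filename, creating the directory if needed."""
    target_dir = Path(local_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    path = target_dir / filename
    path.write_bytes(image_data)
    return path
```

For example, save_image_bytes(download_image(image_url), "./video_images", "vid_image_6.png") would reproduce the steps above and return the path of the saved file.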

Making the Most of Variations

One of the most exciting features is the ability to generate variations of saved images. This is perfect for creating different styles or refining the initial output.

response = client.images.create_variation(
  model="dall-e-2",
  image=open("./video_images/vid_image2.png", "rb"),
  n=1,
  size="1024x1024"
)

Now, let’s break this down line by line:

1. response = client.images.create_variation(...): This line calls the create_variation method from the OpenAI API client, specifically for generating different variations of an existing image. The result is stored in the response variable.

2. model="dall-e-2": This parameter selects the DALL-E 2 model. The variations endpoint supports only dall-e-2; passing dall-e-3 here results in an error.

3. image=open("./video_images/vid_image2.png", "rb"): Here, we’re opening an existing image file (vid_image2.png) in binary read mode ("rb"). This is the image that the model will use as a base to create variations. The file must be a square PNG smaller than 4 MB.

4. n=1: This parameter tells the API to generate one variation of the input image.

5. size="1024x1024": This sets the size of the output image to 1024×1024 pixels.

By running this code, you’re essentially asking DALL-E 2 to take your input image and create a new, similar but distinct image based on it. It’s like asking an AI artist to riff on your original piece!

Remember, when working with the OpenAI API, you’ll need to have your API key set up and the appropriate client library installed. Also, be mindful of usage limits and costs associated with API calls.
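Since every API call counts against usage limits, it can be worth checking the input file locally before requesting a variation. The sketch below assumes the documented input requirements at the time of writing (a square PNG smaller than 4 MB); `check_variation_input` is a hypothetical helper that reads only the PNG header using the standard library.

```python
import os
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
MAX_BYTES = 4 * 1024 * 1024  # assumed 4 MB limit for variation inputs


def check_variation_input(path):
    """Return a list of problems found with the file; empty means it looks fine."""
    problems = []
    if os.path.getsize(path) >= MAX_BYTES:
        problems.append("file is 4 MB or larger")
    with open(path, "rb") as f:
        header = f.read(24)  # signature (8) + IHDR length/type (8) + width/height (8)
    if len(header) < 24 or header[:8] != PNG_SIGNATURE or header[12:16] != b"IHDR":
        problems.append("not a valid PNG file")
        return problems
    # Width and height are big-endian unsigned 32-bit integers in the IHDR chunk
    width, height = struct.unpack(">II", header[16:24])
    if width != height:
        problems.append(f"image is {width}x{height}, not square")
    return problems
```

A call such as check_variation_input("./video_images/vid_image2.png") returns an empty list when the file passes these checks, so the variation request can be skipped whenever the list is non-empty.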

Exploring Different Image Types

Whether you need photos, cartoons, or illustrations, OpenAI’s image generation adapts to your requirements. Define your image type in the prompt, adjust filenames, and run the code to save the new images.

image_type="illustration"
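To try several image types in one run, the prompt and filename can be derived together for each type. This is a small sketch under stated assumptions: `plan_images` is a hypothetical helper, and the vid_image_N.png naming simply follows the filenames used earlier in this post. It only prepares the inputs and makes no API calls.

```python
def plan_images(description, image_types, start_index=1):
    """Pair each image type with a prompt and a numbered filename (hypothetical helper)."""
    plans = []
    for offset, image_type in enumerate(image_types):
        plans.append({
            "prompt": f"{image_type} of {description}",
            "filename": f"vid_image_{start_index + offset}.png",
        })
    return plans


plans = plan_images(
    "a pink and an albino axolotl",
    ["photo", "cartoon", "illustration"],
    start_index=7,
)
for plan in plans:
    print(plan["filename"], "<-", plan["prompt"])
```

Each entry can then be fed into the generate and save steps shown earlier, one request per image type.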

Next Steps: Adding Voiceovers

This phase focused on image generation. In the next phase, we’ll explore adding voiceovers to our video projects. Stay tuned, and happy creating!