AI Meme Engineer (1/4) – Building a Fully Automated Meme Generator

Hi and welcome to this course on building a fully automated Meme generator using Python. Our meme generator will have a frontend interface where the user can type any subject and at the click of a button memes will come rolling out. The output will be fully created memes with the text already embedded inside the image at the appropriate position and font size.

Besides creating something really cool and fun, and becoming a meme-master, we will also learn a lot of stuff about Python, programmatic image editing, calling ChatGPT in JSON mode and writing good system prompts, Streamlit, and more.

The first part may be a bit dry as we have some project setup work to do but stick with me, as the real fun starts immediately after. 🙂 So without further ado, let’s get started!

I’ll try not to insert too many memes into the course, but it’s hard to resist now that I have an automated generator!

Getting Started

The only things you’ll need to follow along are a code editor and Python installed on your machine. We’ll be using a few libraries, but we’ll install and discuss them as we go along. I’ll be using VSCode, but you can use any editor of your choice. As for Python, you’ll want version 3.9 or newer, since the code below uses built-in generic type hints like list[str].

First get started by creating a project folder anywhere you like, and then open the folder in your code editor of choice. I’ll name my project folder Meme_Gen:

📁 Meme_Gen

The first thing we’ll need in order to be able to generate memes is some meme template images. I’ve gone ahead and selected 11 of the most popular and widely used meme templates for this course, but you will easily be able to add more yourself after the course is done.

So download the provided templates folder and make sure to place it inside your project folder:

📁 Meme_Gen
    📁 templates
        🖼 But-Thats-None-Of-My-Business.jpg
        🖼 Disaster-Girl.jpg
        🖼 Distracted-Boyfriend.jpg
        🖼 Drake-Hotline-Bling.jpg
        🖼 etc...

If you look inside you’ll find the 11 template images without any text on them. You’ll probably recognize most of them!

Preparing the meme data

To get this project working we need an interface that lets the user generate memes. That interface needs a function that embeds our meme text into the meme template images, and that function in turn needs actual meme text to embed. This brings us to the first task of this course: preparing the meme data.

We will have to give ChatGPT strict limitations about what it’s supposed to do, which memes it can choose from, and how to use them. We’ll also need to have exact details for our text-into-image embedding function letting it know where the template image is located, what font to use, whether to outline the text, and most importantly exactly where to place which piece of text.

So for each meme, we’ll need an object or some record that holds all this information. I’ve called this the meme data here, and where for a big project we’d use a database for this, I want to keep it simple here, so I have provided a JSON file that contains all the data needed for every single meme template chosen.

As it was quite repetitive and boring to write out all this data and get the appropriate pixel coordinates for each text box on every image, I’ve done it for you. You can find the meme_data.json file in the course materials. Make sure to download it and place it in your project folder (but outside the templates folder):

📁 Meme_Gen
    📁 templates
        🖼 But-Thats-None-Of-My-Business.jpg
        🖼 Disaster-Girl.jpg
        🖼 Distracted-Boyfriend.jpg
        🖼 Drake-Hotline-Bling.jpg
        🖼 etc...
    📄 meme_data.json

I created the data for you, but I will explain exactly how I did it for two reasons:

I hate tutorials that just say “Here, just copy this” without explaining how stuff works. Beginners should also be able to follow along and have a good time.
You may very well want to add your own favorite memes to the generator after the course is done, and this way you’ll know exactly how to do it.

If you open the meme_data.json file you’ll see that it is basically just a list with 11 entries that look like this:

  {
    "id": 0,
    "name": "Drake Hotline Bling Meme",
    "alternative_names": ["drakeposting", "drakepost", "drake like dislike"],
    "file_path": "Drake-Hotline-Bling.jpg",
    "font_path": "impact.ttf",
    "text_color": "white",
    "text_stroke": true,
    "usage_instructions": "The Drake Hotline Bling Meme is used to humorously express preference or approval for one thing over another. It features two panels with rapper Drake. In the first panel, Drake is rejecting something, while in the second panel, he is showing approval for something else. Users overlay text or images to depict contrasting options or decisions. Make sure the first sentence is the thing to be rejected and the second sentence is the thing to be approved.",
    "number_of_text_fields": 2,
    "text_coordinates_xy_wh": [
      [620, 20, 560, 560],
      [620, 620, 560, 560]
    ],
    "example_output": ["Monday mornings", "Friday afternoons"]
  },

So let’s go over all these key-value pairs:

id: A unique identifier for the meme. We’ll need this for ChatGPT to tell us which meme it wants to use with the text it provides. I simply used increasing numbers from 0 to 10.
name: The most well-known name of the meme, for clarity.
alternative_names: Other names the meme might be known by, so ChatGPT can recognize it, just to be extra safe.
file_path: The exact name of the image file located inside the templates folder. We’ll need this for the image editing function which overlays the text.
font_path: The font file to use for the text to be overlaid. I have chosen three commonly used and easy-to-find meme fonts: “Impact”, “Arial”, and “Comic Sans”. We will download and prepare these fonts in a moment.
text_color: The color of the text to be overlaid, as some memes don’t work well with a white font, like the popular UNO Draw 25 meme.
text_stroke: Whether to outline the text or not. We’ll use an outline for most but not all memes.
usage_instructions: A description of the meme and what kind of feeling is generally expressed using the meme. It ends with quite specific instructions on what each sentence should contain and what order the sentences should be in. This is all for ChatGPT, as we need to be very, very clear about everything.
For example, if we generate the Distracted Boyfriend meme but ChatGPT changes the order of the three pieces of text from "New shiny idea", "Me", "Current project deadline" to "Me", "New shiny idea", "Current project deadline", the meme will not make sense, as we will be placing the wrong text on the wrong part of the image.
number_of_text_fields: The number of text fields that need to be overlaid on the image, both for the image editing function and for ChatGPT to know how many text fields to provide to us. We have memes with 2, 3, and 4 text fields.
text_coordinates_xy_wh: The pixel coordinates and width and height of the text boxes on the image. This tells the image editing function exactly where to place the text. The coordinates are in the format [x, y, width, height] and are relative to the top left corner of the image. Each list entry represents one of the text boxes.
I extracted these coordinates using image editing software which showed me the box size as I dragged around a selection box. You can also guess these based on the overall image size. This part is a bit of a pain, but with trial and error you can find the perfect numbers.

example_output: This final key is for ChatGPT again, a concrete and exact example of what the output for this particular meme should look like and which order the sentences should have. Make sure that both here and in the earlier usage_instructions the sentences are in the same order as the text fields in the text_coordinates_xy_wh list you created.
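To make the [x, y, width, height] format a bit more tangible, here is a minimal sketch that turns the two Drake text boxes from the entry above into corner coordinates. The box_corners helper is a hypothetical name used only for illustration; it is not part of the course code.

```python
# Hypothetical helper (not part of the course code): converts one
# [x, y, width, height] entry into (left, top, right, bottom) corners,
# all measured from the top-left corner of the image.
def box_corners(xywh: list[int]) -> tuple[int, int, int, int]:
    x, y, width, height = xywh
    return (x, y, x + width, y + height)


# The two text boxes from the Drake entry in meme_data.json.
drake_boxes = [[620, 20, 560, 560], [620, 620, 560, 560]]

for box in drake_boxes:
    print(box, "->", box_corners(box))
```

Both boxes start at x = 620 and are 560 pixels wide and tall. The second one starts 600 pixels further down, so it sits directly below the first with a small gap in between.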

Aren’t you glad I did all this image coordinate trial-and-error stuff for you? 😅 For bonus points, add your own favorite meme templates after we finish the course!

Getting our fonts

One last setup thing to do before we can really start coding. Create a folder named fonts in your project base directory:

📁 Meme_Gen
    📁 fonts
    📁 templates
        🖼 (all the images in here...)
    📄 meme_data.json

Now we need to download the three fonts we’ll be using. Let’s start with the Impact font. You can download it from here or many other websites. It is probably also in your fonts folder on your system already.

If you download the file from the website above, you can just take the top .ttf file and place it in the fonts folder you just created.

You can download the Arial font from here, and the Comic Sans font from here. Place both of these .ttf files in the fonts folder as well. For the Arial font download, I picked the very top file from the .zip archive, named ARIAL.TTF, and for the Comic Sans font download, I picked the third file in the archive named ComicSansMS3.ttf. You can also pick or add other fonts if you like, or download the fonts from other sources if any of those website links is no longer available.

You should end up with the following:

📁 Meme_Gen
    📁 fonts
        📄 ARIAL.TTF
        📄 ComicSansMS3.ttf
        📄 impact.ttf
    📁 templates
        🖼 (all the images in here...)
    📄 meme_data.json

Note: If you added more fonts or are using different names for your fonts, either rename your font files to the exact same as mine (case-sensitive) or change the font_path key in the meme_data.json file to match the name of your font files.
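If you want to double-check the font names rather than eyeball them, something along these lines could help. This is only a sketch: find_missing_fonts is a made-up helper name, and it assumes the folder layout shown above with meme_data.json next to the fonts folder.

```python
import json
from pathlib import Path


def find_missing_fonts(meme_entries: list[dict], fonts_dir: Path) -> list[str]:
    """Return the font_path values that have no matching file in fonts_dir."""
    # Compare against the directory listing so the check stays
    # case-sensitive even on file systems that ignore case.
    available = {p.name for p in fonts_dir.iterdir()} if fonts_dir.is_dir() else set()
    wanted = {entry["font_path"] for entry in meme_entries}
    return sorted(wanted - available)


# Only runs the check when meme_data.json is present in the current folder.
data_file = Path("meme_data.json")
if data_file.exists():
    entries = json.loads(data_file.read_text(encoding="utf-8"))
    print("Missing fonts:", find_missing_fonts(entries, Path("fonts")))
```

An empty list means every font_path in the JSON file has a matching file in the fonts folder.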

While we’re at it, let’s create one more folder for the image output of our memes. Create a folder named output in your project base directory:

📁 Meme_Gen
    📁 fonts
        📄 ARIAL.TTF
        📄 ComicSansMS3.ttf
        📄 impact.ttf
    📁 output
    📁 templates
        🖼 (all the images in here...)
    📄 meme_data.json

We don’t need to do anything with this folder or put anything inside it. It will soon be filled to the brim with memes once our generator is up and running!

Loading up our meme data

Now it’s time to get into some coding! The first thing we’ll need is some way to interact with our meme data in Python. Go ahead and create a new Python file in your project folder named load_meme_data.py:

📁 Meme_Gen
    📁 fonts
        📄 ARIAL.TTF
        📄 ComicSansMS3.ttf
        📄 impact.ttf
    📁 output
    📁 templates
        🖼 (all the images in here...)
    📄 load_meme_data.py  --> create this file
    📄 meme_data.json

In this file, we’re going to be creating both a datatype for our meme data and a function to load the meme data from the JSON file. So open up load_meme_data.py in your code editor and let’s get started by adding the following imports:

from typing import TypedDict
import json

We’re using TypedDict from the typing module to create a type for our meme data. If you’re not sure how this works, you’ll see in a second, but the basic idea is we’re going to be setting up a structure for our meme data that will help our code editor and type checker understand exactly what kind of data we’re working with.

The first thing we’ll do is define the overall structure of our meme data. We’re going to create a class that will act as a ‘type hint’ to declare our data structure. Add the following:

class MemeData(TypedDict):
    id: int
    name: str
    alternative_names: list[str]
    file_path: str
    font_path: str
    text_color: str
    text_stroke: bool
    usage_instructions: str
    number_of_text_fields: int
    text_coordinates_xy_wh: list[list[int]]    
    example_output: list[str]

We define a new class MemeData that inherits from TypedDict, which means that we can basically create a dictionary-style object, but we get to specify the exact types for each key in the dictionary.

We specify that the id key should be an integer and the name key should be a string (str). The next entry is a bit special, alternative_names is a list of strings, so we use list[str] to specify that. Now we have file_path, font_path, and text_color, which are all strings.

text_stroke is a boolean, so either True or False, usage_instructions is a string, number_of_text_fields is an integer, and text_coordinates_xy_wh is a list that in turn contains lists of integers.

Finally, we have example_output, which is a list of strings. This structure will help us keep our meme data organized and make it easier to work within our code.

Compare this MemeData type to the JSON data we have in meme_data.json and you’ll see that they match up perfectly. Now the data will be much easier to work with, as our code editor will know exactly what to expect and can even give us completion suggestions.

What’s more, if we ever use a wrong key or a wrong value type, the editor’s type checker will flag it right away, which makes it much harder to introduce bugs into our code. Keep in mind that these are static hints: Python itself does not enforce them while the program runs.
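One detail worth seeing in action: the warnings come from the editor or a static type checker, not from the Python interpreter. The sketch below uses MiniMeme, a hypothetical trimmed-down stand-in for MemeData, to show that a mismatched value still runs.

```python
from typing import TypedDict


class MiniMeme(TypedDict):  # hypothetical, shortened stand-in for MemeData
    id: int
    name: str


good: MiniMeme = {"id": 0, "name": "Drake Hotline Bling Meme"}

# A static type checker would flag the next line, because "id" is declared
# as int. Python itself still runs it without raising an error.
bad: MiniMeme = {"id": "zero", "name": "Drake Hotline Bling Meme"}

print(type(good))  # a TypedDict value is a plain dict at runtime
print(bad["id"])
```

So the type hints are a safety net while writing code, not a validation step for the JSON file itself.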

The next thing we’ll need is a function to load up our meme data from the JSON file and convert it into a list of MemeData objects, so we can feed it into the rest of our Python code to work with. Add the following function to your load_meme_data.py file below the MemeData class:

def load_meme_data() -> list[MemeData]:
    with open("meme_data.json", "r") as file:
        meme_data: list[dict] = json.load(file)
    return [MemeData(**meme) for meme in meme_data]

We define a function named load_meme_data that takes no input arguments () and will return -> a list of MemeData objects. In the next line, we open the meme_data.json file in read mode "r" and assign it to the variable file.

with open is a context manager which means that any indented code block that follows will be executed with the file open (and the file will be closed automatically when the indent is over). We then use json.load to load the JSON data from the file and assign it to the variable meme_data. At this point meme_data is a list of just standard Python dictionaries as indicated by the type-hint : list[dict].
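If you’d like to see the automatic closing for yourself, here is a small sketch. It writes a throwaway sample.json into a temporary folder (both are assumptions made for this example) so it does not depend on meme_data.json being present.

```python
import json
import tempfile
from pathlib import Path

# Create a small throwaway JSON file inside a temporary folder.
with tempfile.TemporaryDirectory() as tmp_dir:
    sample_path = Path(tmp_dir) / "sample.json"
    sample_path.write_text('[{"id": 0}]', encoding="utf-8")

    with open(sample_path, "r", encoding="utf-8") as file:
        data = json.load(file)
        print(file.closed)  # still inside the block, so the file is open

    print(file.closed)  # the indented block has ended, so the file is closed
    print(data)
```

The first print shows False and the second shows True, without us ever calling file.close() ourselves.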

For the return statement, we create a new list in place using a list comprehension, which iterates over every single meme in the meme_data list. For each one it creates a new MemeData object by passing in (**meme). The ** operator basically just says “Take all the key-value pairs in this dictionary and pass them into the MemeData class as keyword arguments” all at once. Now we truly have a list of MemeData objects, which is returned from the function.
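If the ** unpacking still feels a bit magical, this small sketch may help. Point is a hypothetical two-key type used only for illustration, with values borrowed from the Drake coordinates.

```python
from typing import TypedDict


class Point(TypedDict):  # hypothetical two-key type, only for illustration
    x: int
    y: int


raw = {"x": 620, "y": 20}

# These two calls are equivalent: ** spreads the dictionary's
# key-value pairs out as keyword arguments.
unpacked = Point(**raw)
spelled_out = Point(x=620, y=20)

print(unpacked == spelled_out)

# The same idea inside a list comprehension, one object per dictionary.
print([Point(**item) for item in [raw, {"x": 620, "y": 620}]])
```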

This is great for our own code and especially the image editor that will overlay the text onto the images to work with, but we will also have to feed this meme_data to ChatGPT when we ask it to generate memes for us. AI Chat models only work with textual data, so let’s create a separate load function that will load the meme data but return it as a simple single string (not a valid object or dictionary) so we can feed this ‘flat string’ meme data to ChatGPT.

Below the load_meme_data function, add the load_meme_data_flat_string function:

def load_meme_data_flat_string() -> str:
    with open("meme_data.json", "r") as file:
        meme_data: str = file.read()
    return meme_data

This function is very similar to the load_meme_data function, but we promise to return one simple string value -> str instead of a list of MemeData objects. We open the meme_data.json file in read mode "r" and assign it to the variable file. We then use file.read() to read the entire contents of the file into the variable meme_data, literally just reading the text value of the file without parsing it into a valid object.
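To see the difference between the two loaders side by side, here is a rough sketch using a shortened, made-up stand-in for the file contents. The prompt wording at the end is purely an assumption for illustration, not the prompt used later in the course.

```python
import json

# A tiny stand-in for the contents of meme_data.json (assumed, shortened).
flat_string = '[{"id": 0, "name": "Drake Hotline Bling Meme"}]'

# Parsing gives us real Python objects we can index into.
parsed = json.loads(flat_string)

print(type(flat_string), type(parsed))
print(parsed[0]["name"])  # parsed data supports key lookups

# The flat string can be dropped straight into a text prompt.
# This prompt wording is hypothetical.
prompt = f"Here are the available memes:\n{flat_string}"
print(prompt)
```

The parsed version is what our own Python code wants, while the flat string is convenient whenever we just need to paste the data into a message.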

Testing our meme data loaders

Ok, before we continue to the next part, let’s double-check if we did a good job with our load_meme_data.py file. At the bottom of the file, add the following code:

if __name__ == "__main__":
    meme_data_loaded = load_meme_data()
    first_item = meme_data_loaded[0]
    print(type(first_item))
    print(first_item)

    string_data = load_meme_data_flat_string()
    print(type(string_data))

The line if __name__ == "__main__": is a common Python idiom used to determine whether a script is being run directly or being imported into another script. When a Python file is run directly, the special variable __name__ is set to "__main__". This means that the code block under this if statement will only execute if the script is run directly, not if it is imported elsewhere.

This allows you to include test code that should only run when the file is executed on its own. Here we use this to print some quick output for testing purposes. If the file is imported as a module in another script, the code inside this block will be skipped, preventing it from running unintentionally.

Inside this ‘test block’ we simply try out our meme data loading function, selecting the first meme from the loaded data and printing the type of the data to the console, as well as the actual data itself. We then load the meme data as a flat string and print the type of that data as well to make sure it is a string.

Now go ahead and run this Python file and you should see the following output in your terminal window:

<class 'dict'>
{'id': 0, 'name': 'Drake Hotline Bling Meme', 'alternative_names': ['drakeposting', 'drakepost', 'drake like dislike'], 'file_path': 'Drake-Hotline-Bling.jpg', 'font_path': 'impact.ttf', 'text_color': 'white', 'text_stroke': True, 'usage_instructions': 'The Drake Hotline Bling Meme is used to...', 'number_of_text_fields': 2, 'text_coordinates_xy_wh': [[620, 20, 560, 560], [620, 620, 560, 560]], 'example_output': ['Monday mornings', 'Friday afternoons']}
<class 'str'>

We first see the class of 'dict' which is correct for a TypedDict based object, and then the actual data of the first meme. We then see the class of 'str' so our second function is correct in providing a flat string of the meme data.
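If you add your own memes later, an extra sanity check might be worth having. The sketch below assumes that number_of_text_fields should equal both the number of coordinate boxes and the number of example sentences, which follows from the key descriptions earlier. find_inconsistent_memes is a hypothetical helper, and the second sample entry is made up and deliberately broken.

```python
def find_inconsistent_memes(memes: list[dict]) -> list[int]:
    """Return the ids of entries whose text-field counts disagree."""
    bad_ids = []
    for meme in memes:
        expected = meme["number_of_text_fields"]
        boxes_ok = len(meme["text_coordinates_xy_wh"]) == expected
        example_ok = len(meme["example_output"]) == expected
        if not (boxes_ok and example_ok):
            bad_ids.append(meme["id"])
    return bad_ids


sample = [
    {  # values copied from the Drake entry
        "id": 0,
        "number_of_text_fields": 2,
        "text_coordinates_xy_wh": [[620, 20, 560, 560], [620, 620, 560, 560]],
        "example_output": ["Monday mornings", "Friday afternoons"],
    },
    {  # made-up broken entry: three fields declared, only two boxes
        "id": 99,
        "number_of_text_fields": 3,
        "text_coordinates_xy_wh": [[0, 0, 10, 10], [0, 20, 10, 10]],
        "example_output": ["a", "b", "c"],
    },
]

print(find_inconsistent_memes(sample))
```

To run it against the real data you could pass in the list returned by load_meme_data() instead of the sample list.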

So now that we have our data and two separate ways to load up the data, we’re ready to move on to the next part of the tutorial series, where we’ll be getting started with ChatGPT to generate the actual meme texts for us. I’ll see you in the next part!
