This children’s story, Tim’s First Day of School, was generated in an afternoon with the help of AI. My intention is to show what AI can accomplish today and give a peek at what might be possible tomorrow.
OpenAI’s ChatGPT brainstormed story themes, generated character descriptions, suggested the title and wrote most of the story—ChatGPT and I edited it together. The illustrations were created by taking those descriptions of characters and iterating on them in Midjourney.
I used to think these AI tools were more of a clever trick. I don’t see it that way anymore. I think this is the beginning of a paradigm shift. This is the next internet. It has fundamentally changed my assumptions about how software works and what is possible, and it is going to have a major impact on society in the coming years.
I have more commentary and details about my process below. But first, the story.
Tim’s First Day of School
Tim was nervous on his first day of school.
He didn’t know anyone, and he felt like a fool.
He was scared that he wouldn’t fit in.
And that the other kids would make fun of him.
But Tim’s mom helped him pack his things.
And gave him a kiss, which made his heart sing.
“You’re going to do great, Tim,” she said.
“Just be yourself and make some new friends, please don’t dread.”
Tim took a deep breath and stepped out the door.
He walked down the street to his new school, feeling unsure.
But when he got there, he saw a friendly face.
His teacher, Mrs. Jones, with a smile and a warm embrace.
“Hello, Tim,” she said. “I’m so glad you’re here.”
“You’re going to have a great time in first grade, my dear.”
Tim smiled back, feeling a little bit better.
And he was happy to meet his new classmates, all together.
Throughout the day, Tim tried to be brave.
He answered questions, helped clean up, and even made a new friend named Dave.
At the end of the day, Tim was happy and proud.
He had survived his first day of school, and he was on a cloud.
Tim was happy. He had faced his fears.
And made some new friends, without any tears.
He was proud of himself for being brave.
And ready to learn and grow, come what may.
After all Tim was a big, brave boy.
He faced his fears at school with poise and joy.
He made new friends and learned so much.
And he couldn’t wait to go back to school and stay in touch.
This weekend I started playing with the demo for ChatGPT. First, I had it answer some basic questions—it’s pretty cool that it can follow the thread. That allows you to work off of previous answers.
Then I had it compose an email from a few bullet points—a trick I’ve seen some software do. ChatGPT did it with no problem.
I had it generate follow-up emails to this imaginary conversation. I had it rehash the email in a stern or playful tone. In one case I had it write a follow-up email as a poem.
What else can this thing do?
Now that I had a decent grasp of generating images with AI tools like Stable Diffusion, it felt like a chance to put the two together.
Creating a children’s story with ChatGPT
As I mentioned earlier, this story was crafted in an afternoon, largely by ChatGPT. I figured a children’s book might be a good starting point because the concepts are universally familiar, the stories are relatively short, and visuals are key.
So what does it look like to author a children’s book with AI?
I had some ideas for a children’s story. However, this post is supposed to be about the capabilities of AI. Why not let the AI come up with some potential ideas?
That’s a solid answer! I decided to go with bravery. Would ChatGPT know what makes a good story about bravery?
There’s a ton of stuff that is impressive about this answer. It does a great job understanding what I’m asking.
I thought it might return either a definition of bravery or a blurb about what makes a good story. Here it gives me a detailed answer about what makes a good story for children.
It gets repetitive at the end, but this is much more than I expected. I made the following notes from this answer. I figured the AI could help me brainstorm some concepts.
- Relatable main character
- Challenge or obstacle requiring them to be brave
- Engaging plot
- Has a supportive cast of characters
First up—tell me more about this relatable main character.
This gave me way more than I asked for. We have a young boy named Tim, who is shy but gains confidence in the future.
I ran it one more time to see what other ideas it came up with—a girl named Lily who gets lost in the woods with her friend.
Can ChatGPT build off the previous response without having me enter as many details? Here I ask it to describe what Tim looks like. These character descriptions will be important because they will be what we iterate on in Midjourney when it’s time to illustrate the story.
So it does understand the previous items in the thread. That’s pretty wild!
This has a ton of use cases, right? Imagine an AI support agent trained on product briefs and previous support tickets. Instead of sifting through knowledge base articles or talking with those case statement bots, it could surface relevant details to help customers solve issues. That could also free up time for support agents to work on actual edge cases and items not covered in the knowledge base.
What else might help us in this story? Let’s see if we can generate some details about where Tim lives.
Interesting. Not entirely sure what I was expecting here. If we need more details for our illustrations, we can always come back to this prompt.
OK, the next thing we need here is a challenge for Tim. He needs adversity. What might be some examples of challenges we can give him?
These are exceptional ideas! All of these—well, probably other than number 3 about financial ruin—would make excellent children’s stories. Although, maybe there’s some sort of genre of dark children’s stories the AI is aware of that I’m not.
Anyway, I chose number 1: Tim is nervous about starting a new school year. In retrospect, I wish I had chosen number 4. Midjourney is so good at designing monsters and ghouls. That would have been a lot of fun.
Just for the heck of it, I ran a couple of prompts in Midjourney with monsters. It’s easy and there’s a lot of text in this part of the post. So a little visual interest couldn’t hurt.
Back to generating our children’s story. We are going to tell a story about Tim’s first day at school. He’s nervous. Can the AI give us some backstory about why?
This is a boring answer compared to how impressive the others have been. I guess that’s fine.
Let’s move on. Let’s get more details about Tim’s school.
OK, that’s something we can use. And our story has a new character. Let’s learn a bit more about Mrs. Jones!
That’s interesting. It must think that I am referencing a Mrs. Jones that I know and not the one from the story.
Let’s ask about the teacher in Tim’s story.
That’s more in line with what I was expecting.
What about Tim’s other classmates? Who are they?
Wow, a whole cast of characters!
Presumably, we could dig in and get backstory ideas for any of these characters.
Imagine you’re writing a novel or a screenplay, or just need to generate some D&D backstory—AI can help generate endless concepts. Brainstorming along with the AI and having it drill down on a specific idea is a pretty surreal experience.
So, we have a ton of stuff about the characters. But it is getting late and I want to get to the illustration part.
Can this AI write me a story based on the discussion thus far?
That was way more than I expected.
The AI has taken our theme of being nervous about the first day at school, included the characters we discussed (Mrs. Jones, Sarah, Alex, Jake and Emma) and woven them into a story. While the story is pretty mediocre, this is a pretty spectacular achievement.
At this point, it is clear to me that AI is about to change things—a lot.
As impressive as this is, it feels like it might be a bit much for a children’s story. I asked it, in several different ways, to turn it into a poem. Most of those attempts just broke the sentences in odd places. So it looked like a poem but didn’t flow like one.
After about 10 tries, I asked it to write me a new poem about Tim and his first day of school.
OK, this is something that seems more like a children’s book. The sentences are shorter and easier. We have a bit of rhyming going on.
I think this is our story.
But what if I want to edit it? How can I work with the AI to make changes? Initially, I asked it to adjust one thing but found that it was also making changes elsewhere in the story—not ideal.
I figured out that I could tell it to pull out individual paragraphs and focus efforts on those, one at a time.
Oh, wow. This is just like building software. Can it merge these changes back into the original?
Amazing. We targeted a change we wanted to make, worked through those changes with the AI, and then merged them back into the original.
I did this again and again for the remaining paragraphs until we had something decent. It was wildly easy. If I didn’t get something that made sense, I ran the command again or rephrased what I was asking.
This has a lot of implications for various workflows. The clear one to me is writing software. There are already examples of using ChatGPT to solve Advent of Code. This is important because the software doesn’t know about the 2022 Advent of Code. It was trained on information from 2021 and earlier. So it couldn’t copy answers that already were out there.
The workflow here feels like something from the future. Pull a small piece out, iterate on that until it feels right, merge it back in and assess how everything works together. Software, graphics, music, videos, architecture, books, legal documents—all of these things could benefit from this workflow.
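Here is a rough sketch of that loop in Python, just to make the shape of it concrete. This is my own illustration, not something ChatGPT produced or how it works internally. The function names are made up, and `placeholder_revise` is a hypothetical stand-in for the step where I asked the AI to rework a single paragraph.

```python
from typing import Callable, List


def split_paragraphs(text: str) -> List[str]:
    """Split a draft into paragraphs separated by blank lines."""
    return [p.strip() for p in text.split("\n\n") if p.strip()]


def edit_paragraph(text: str, index: int, revise: Callable[[str], str]) -> str:
    """Pull out one paragraph, revise it, and merge it back into the draft."""
    paragraphs = split_paragraphs(text)
    paragraphs[index] = revise(paragraphs[index])
    return "\n\n".join(paragraphs)


def placeholder_revise(paragraph: str) -> str:
    # Hypothetical stand-in for asking the AI to rework one paragraph.
    return paragraph.replace("nervous", "excited")


draft = (
    "Tim was nervous on his first day of school.\n\n"
    "Tim took a deep breath and stepped out the door."
)

# Only paragraph 0 is touched; everything else is merged back unchanged.
revised = edit_paragraph(draft, 0, placeholder_revise)
print(revised)
```

The point of working on one paragraph at a time is that the rest of the draft cannot change by accident, which was the problem I hit when I asked for edits against the whole story.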
OK, let’s see if the AI can generate a few decent titles for our story.
This whole process has me blown away. In roughly two hours we have our children’s story—title and all.
Illustrating the story with Midjourney
Since I had previously explored image generation with Stable Diffusion, I figured I might try something new for this project. I decided to use Midjourney since the generations seemed the most impressive and it had much better upscaling capabilities than either Stable Diffusion or Dall-E.
There’s this kind of odd, kind of genius approach that Midjourney has taken. The entire tool runs on a Discord server.
What’s genius about it is that you can easily see what types of images others are generating and then iterate on those. The Pixar-esque prompts used to create the images in this story were taken from someone in the same channel generating cute, 3D characters.
The quality of those images—and how impressive they were—convinced me to move on from this more hand-drawn style seen in most children’s books. Here was my initial cover art for the story.
Here is the cover that I ended up with.
Let’s walk through an example of how I generated these images with Midjourney.
You might recall I asked ChatGPT for various descriptions. One I didn’t have was of Tim’s mom, who is the main character in paragraph 2. So I had a description generated.
OK, we’ve got a good description—hazel eyes and curly brown hair. Tim’s mom dresses casually.
Let’s see what Midjourney can do.
Well, that didn’t work. All these images are of kids. They are either inside, facing away from the camera or extremely angry.
I feel like the prompt is good though. So let’s re-roll the seed and see what we end up with.
It took a few more rolls to get here. Many had the same issue as the previous images. It was difficult to get something that looked like a house in the background—or at least the outside of one. Many of them didn’t match the style of the story’s illustrations either.
Here, image 1 might be a good option. The subject fits the description. She’s wearing casual clothes despite that not being part of the prompt. At a glance, there don’t appear to be any visual defects in the image—which is pretty rare. See the hand in image 3? The software has an extremely difficult time with hands.
Now let’s have Midjourney run a few different versions of image one. I’m going to select V1 to have it come up with 4 variations similar to image 1.
I was not expecting hands to be added to the image.
I don’t think any of these are as good as the original. I’m going to go back to image 1 in the previous batch and select U1 to upscale that image to a higher resolution.
And there we have it. This is the image that was used to represent Tim’s mom in the story.
I think the Midjourney image generation process makes a lot of sense. With Stable Diffusion, finding the right seed and iterating on that made all the difference. With Midjourney, you get a handful of images and can either re-roll the seed or create a set of variants from one of the images. Or, if there is a good one, you can upscale it.
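To spell out the three choices you have at each step, here is a toy model in Python. It does not talk to Midjourney and is not its API. Every name here (`generate_grid`, `variations`, `upscale`, the `Image` class) is hypothetical, and the "images" are just a prompt plus a random seed.

```python
import random
from dataclasses import dataclass
from typing import List


@dataclass
class Image:
    prompt: str
    seed: int
    upscaled: bool = False


def generate_grid(prompt: str, rng: random.Random) -> List[Image]:
    """Hypothetical stand-in for one generation: four candidates per prompt."""
    return [Image(prompt, rng.randrange(10**6)) for _ in range(4)]


def variations(image: Image, rng: random.Random) -> List[Image]:
    """Hypothetical stand-in for a V button: four new candidates based on one image."""
    return [Image(image.prompt, image.seed + rng.randrange(1, 100)) for _ in range(4)]


def upscale(image: Image) -> Image:
    """Hypothetical stand-in for a U button: keep the same image, mark it final."""
    return Image(image.prompt, image.seed, upscaled=True)


rng = random.Random(0)
prompt = "Tim's mom, hazel eyes, curly brown hair"

grid = generate_grid(prompt, rng)   # first batch of four
grid = generate_grid(prompt, rng)   # not happy: re-roll for a new batch
variants = variations(grid[0], rng) # V1: four variations of image 1
final = upscale(grid[0])            # U1: variations were worse, upscale image 1
```

That last line mirrors what happened with Tim’s mom: the variations did not beat the original, so I went back and upscaled image 1.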
The process is not perfect, but I felt it was easier and produced better images than Stable Diffusion. Granted, this is using cloud GPUs, whereas I was running Stable Diffusion on my M1 Mac.
One thing that became a bit of a roadblock was trying to get a consistent character. Using the exact same seed and prompt details produced noticeably different characters. That is why I ended up using a superhero in one of the images: none of the other prompts gave me a consistent character.
These were some of the dozen or so variants I tried. None were close to the character on the cover.
I even tried uploading and using the image from the cover in the prompt. I had hoped that might produce a decent result, but the characters were still inconsistent.
It doesn’t seem like Midjourney is capable of iterating on a single character—yet. Once it can do that, anyone with access to Midjourney could potentially create their own Pixar shorts.
There’s more to write
There’s a lot more to write about generative AI, but I felt like I needed to get this out as quickly as possible. It’s getting late, so I’m going to stop for now.
Short-term this is going to be a game changer for creatives—longer-term for everybody.
In the meantime, here is a bunch of monsters working in a rainbow factory.