The making of “Harmony of Hearts” AI storytelling.

ChatGPT wrote the story from a premise I supplied. I then had it convert the story to screenplay format, rewrite it with a different character (a robot instead of a man), and give it a different ending (he dies). After that, I had it add more narrative description of the planet and the alien woman. So the writing was iterative, and my part was mostly editing (art direction). The writing was done in under 1 hour.

I then had the AI create prompts for the Midjourney AI images, which I massaged into the right format and into the fantasy art style of Frank Frazetta (he died in 2010). About half of the 200 or so images were junk (3 arms, weird expressions, or just freakish). Each image takes about 3 minutes to create, so 200 images is about 600 minutes, or 10 hours, of button pushing. But because I was writing and building soundtrack material at the same time, I had parallel production (3 screens going).

Midjourney sounds an audible alarm (beep) when an image is complete, so I knew when I could render it. It only allows three processes to run in parallel. This will change eventually to allow more images to be produced simultaneously.
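
As a rough check on the arithmetic above, here is a minimal Python sketch. The function name render_time_minutes is made up for illustration, and the parallel figure assumes that all three slots stay busy and that each image still takes about 3 minutes when run alongside others, which I did not measure.

```python
import math

def render_time_minutes(num_images, minutes_per_image, parallel_slots=1):
    """Estimate wall-clock minutes, assuming images run in full batches.

    Assumption: every image takes the same time regardless of how many
    jobs are running at once.
    """
    batches = math.ceil(num_images / parallel_slots)
    return batches * minutes_per_image

# One image at a time: 200 images * 3 minutes each.
serial = render_time_minutes(200, 3)
# Three images at a time: 67 batches of up to 3 images.
parallel = render_time_minutes(200, 3, parallel_slots=3)

print(f"serial:   {serial} min ({serial / 60:.1f} h)")
print(f"parallel: {parallel} min ({parallel / 60:.1f} h)")
```

Under those assumptions the 10 hours of serial rendering shrinks to a little over 3 hours of wall-clock time, not counting the time spent choosing and discarding images.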

The music was split into “stems” of voice and instruments using the LALAL.AI online tool. Then I could manipulate the song components.

For the dialog and narration, I also used ElevenLabs AI human voices (text to speech). The robot voice came from another weird site that specialized in AI robot voices (too weird).

I used the Kdenlive open-source video editor and the Audacity open-source audio editor, both offline. For “Ken Burns” motion effects, I used a utility program called PhotoFilmStrip, also open-source. It batch processes still images and converts them to video clips. The Ken Burns effect is a panning and zooming effect used in film and video production to add motion to still imagery.
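
To show the idea behind the Ken Burns effect, here is a minimal Python sketch that computes a crop box for each frame by sliding and shrinking a window across a still image. This is not how PhotoFilmStrip works internally (I have not looked); the function name ken_burns_crops, the (center_x, center_y, zoom) keyframe format, and the linear interpolation are all assumptions made for illustration.

```python
def ken_burns_crops(img_w, img_h, start, end, num_frames):
    """Return one (left, top, width, height) crop box per frame.

    start and end are (center_x, center_y, zoom) tuples. A zoom of 1.0
    shows the whole image; 2.0 shows half its width and height.
    """
    crops = []
    for i in range(num_frames):
        # t runs from 0.0 on the first frame to 1.0 on the last.
        t = i / (num_frames - 1) if num_frames > 1 else 0.0
        cx = start[0] + (end[0] - start[0]) * t
        cy = start[1] + (end[1] - start[1]) * t
        zoom = start[2] + (end[2] - start[2]) * t
        w = img_w / zoom
        h = img_h / zoom
        # Clamp the window so it never runs off the edge of the image.
        left = min(max(cx - w / 2, 0), img_w - w)
        top = min(max(cy - h / 2, 0), img_h - h)
        crops.append((left, top, w, h))
    return crops

# Start on the full 1920x1080 frame, end zoomed 2x toward the upper right.
for box in ken_burns_crops(1920, 1080, (960, 540, 1.0), (1440, 270, 2.0), 3):
    print(box)
```

Each crop box would then be cut out of the still and scaled to the output resolution to make one video frame. That step is left out here because it depends on the imaging library used.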

Sound effects were from free SFX libraries online.

That’s it in a nutshell.

A human brain still has to do the thinking and the driving. “Orchestration” is still required for the AI-generated components. Humans are still needed. But I could not easily have produced this before AI. The images alone would have cost about $6,000 minimum. More likely double to triple that.

The whole creative process would have taken years instead of hours or days.
