KevinCarlson.codes


I Tried Running an AI Content Farm, and It Was… a Lot

Generated with Google Gemini and Imagen 3. Cropped.

High on my own success with the AI-generated Liminal Space Academy novel, I decided to see what else I could prompt into existence.

AI Disclosure ✨ While this post discusses the use of AI for content generation, I’m not using AI to write my analysis in this article. Any responses from the AI assistant will be marked as such.

Conceptualizing a Surrealist YouTube Channel

Again, I didn’t want this stuff diluting my Personal Brand™. So I asked Gemini to come up with a name for a YouTube channel, based on a general idea of the content I wanted to post. Among a list of other suggestions, it came up with Ctrl + Alt + Existential Crisis, which felt… chef’s kiss.

Can you come up with something as iconic and catchy as Skibidi Toilet? I’m looking for peak brain rot

This time, instead of a Dreamcore Princess, Gemini interpreted brainrot to mean Spoon-Head Sproutlings.

Concept:

  • Imagine tiny, humanoid sprouts, but instead of heads, they have large, wobbly spoons.
  • These Spoon-Head Sproutlings emerge from cracked eggs that appear randomly in mundane locations.
  • They hop around erratically, making a repetitive, high-pitched “Spoon-Sproing!” sound.
  • Their spoons occasionally scoop up random objects (buttons, paperclips, tiny rubber ducks) and then fling them into the air.
  • The spoons will also occasionally knock into each other, creating a distinct “Clang-Sproing” sound.

Google Gemini (Flash 2.0 model)

Truth be told, the idea was giving Billion Surprise Toys rather than Skibidi Toilet, but I decided to roll with it. Naturally, I made the mistake of asking what these creatures would look like. Even though Gemini clearly intended for them to be cute-coded, the result was pure, demonic hellspawn:

A Spoon Head Sproutling stands in a grassy field. It's a creepy little humanoid creature with a round tummy, frog-like skin, Baby Yoda's eyes, and two leaves sticking out of the sides of its head like antennae. The business end of a metal spoon is sticking out of the top of its head.

Insert Michael Scott No GIF here. (AI generated image by Google Gemini with Imagen 3)

The story needed an antagonist, so Gemini gave me the concept of the Gastronomic Glitch – a sort of eldritch abomination, constantly consuming itself as both the eater, and the meal. I wanted absurd, and Gemini delivered.

Of course, generating video content takes a lot more resources than text or images, so the free samples are a little harder to come by. While OpenAI’s Sora would be a logical choice for generating Spoon-Head Sproutlings content, they only offer it as part of a $20 (or more!) per month subscription, which includes a lot of other generative AI services that overlap with stuff I get for free from Google. Unless I can tap into a lot of unexpected demand for wobbly spoons and their Clang-Sproing noises, that doesn’t seem like a sound investment.

And during the writing process for this, OpenAI temporarily disabled access to Sora for new accounts, so I wouldn’t be able to use it even if I gave them the $20.

Instead, I found two services that would let me explore the world of Spoon-Head Sproutlings with a free tier. InVideo AI could generate vertical videos complete with an original script, AI narrator, watermarked stock footage, weird transitions, and flashing subtitles. I dumped Gemini’s concept into InVideo AI’s prompt box, and got a weird story time video full of nonsense about these spoon creatures. Not exactly my artistic vision, but I uploaded a few shorts to see if they’d gain traction.

I also found PixVerse, which generates brief AI video clips from a prompt, instead of stitching together stock footage. PixVerse gives out 60 free credits per day, and a basic ~5 second video (no sound effects or narration) takes 30 credits. After a few days, I managed to spit out enough clips to stitch together this monstrosity, which I set to one of the songs from the original set of Skibidi Toilet videos:

It’s not clashing, it’s a juxtaposition! Fire! A true representation of the ethereal plane! (Footage generated with PixVerse AI)

Yeah, the only way to get Nelly Furtado on the video without a copyright strike was to use the YouTube Shorts editor in the mobile app. It would also only give me either a 15-second or a 60-second clip of the song, which meant the song cut out after 15 seconds because I didn’t have a full minute of footage. So I had to use the overly simple editing interface on my phone to trim the clips down to fit everything in just under 15 seconds.

Technically, I could have sent the raw clips to my computer, and then used normal video editing software to get that precise 15 seconds. I also wanted to slice up the first clip with the monster and add jump cuts to make it feel a bit more menacing. But then I would have had to send the final version with no sound back to my phone to upload it and add the music, which feels like way too much work for a video that still wouldn’t look the way I wanted it to.
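For a rough sense of how the PixVerse free tier paces a project like this, here is a minimal Python sketch of the credit math, using the numbers mentioned above (60 free credits per day, 30 credits per clip, roughly 5 seconds per clip). The function name is made up for illustration, and it optimistically assumes that every generated clip is usable and exactly 5 seconds long.

```python
import math


def days_of_free_credits(target_seconds, clip_seconds=5,
                         credits_per_clip=30, credits_per_day=60):
    """Return (clips_needed, days_needed) to collect enough free-tier footage.

    Assumption: every clip is a keeper and none of the footage gets trimmed.
    """
    clips_per_day = credits_per_day // credits_per_clip
    if clips_per_day == 0:
        raise ValueError("one day of credits does not cover a single clip")
    clips_needed = math.ceil(target_seconds / clip_seconds)
    days_needed = math.ceil(clips_needed / clips_per_day)
    return clips_needed, days_needed


# A 15-second Short versus a full 60-second one
print(days_of_free_credits(15))  # (3, 2)
print(days_of_free_credits(60))  # (12, 6)
```

Under those assumptions, a 15-second Short is a two-day job, while filling a full minute would take about six days of free credits, before counting any clips that get thrown away.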

Trying AI Songwriting with Suno

But if you think about it, the secret to getting views on YouTube is to make a music video. After all, the first video to hit a billion views on the platform was Psy’s “Gangnam Style”, back in 2012. The only other videos I can think of with such a high view count are other songs, like Luis Fonsi and Daddy Yankee’s “Despacito”. Now this might be anecdotal, but I’ve definitely played “Your New Favorite Song” by Wallows way more times than any Skibidi Toilet episode – much less, say, Grind Hard Plumbing Co’s Chang Li 100-mile challenge, even though I would gladly eat skinny mac and cheese for Edwin and his Premium cinematic storytelling.

A screenshot of a YouTube Music notification, which says You are officially a top listener of Wallows.

I might also be a top viewer of Grind Hard Plumbing Co Chang Li content, but I don’t think YouTube has a badge for that

I already knew about Suno because of an old coworker who loves to hit up the group chat with hyper-specific songs about our team’s inside jokes. (For legal reasons, I can’t explain why my coworkers always insisted on parking the rental car next to the dumpster.) But the reaction that the AI-generated Dr Pepper country song got on the WVFRM Podcast really opened my eyes to the possibilities:

Ellis’s beautiful and unconditional love for the AI Dr Pepper country song is my Roman Empire. Also, Marques is upside down because he was in Australia for this episode.

Naturally, I had to see if this AI model could do anything fun with Gemini’s Spoon-Head Sproutlings concept. It gave me this song, which Suno describes as whimsical, funky, and low.

Spoon-Head Sproutlings splashing all around

Tiny rubber ducks make a whistling sound

Kitchen sink’s a party

Soap suds on the floor

Every time I blink

They’re flingin’ ducks some more

Based on Ludwig’s YouTuber music tier list criteria, I’d have to put it in A Tier – pretty good, but not my cup of tea, or something I’d add to the playlist (unlike Ludwig’s cover of “Con te partirò”, even though he’d say I need help for letting that sneak into my 2024 Recap). But I had a feeling that, with some work, Suno could make something I’d listen to unironically.

Generating Fire

As part of a different project, I came up with an idea for a song – just not enough of an idea to be able to flesh out lyrics or a melody. I wanted a radio-friendly pop-punk song that uses metaphors and double entendres to tell the story of a guy who got roped into helping out a friend who bit off more than he could chew with a romantic encounter. For the record, I wanted to set up an establishing character moment between two friends where one of them played a song that he knew the other hates, only to find out that his friend has mellowed out a lot since the last time they got to hang out.

With that concept in mind, I was able to prompt Suno with the general idea of what the song was about, and a few hints as to the style I wanted. While it didn’t wow me the same way Gemini’s TikTok brainrot interpretation of Tara Gilesbie’s “My Immortal” did, I liked it enough to put it in S Tier below KSI’s “Thick of It”:

“Appetite” by Third Wheel (Song generated with Suno, edited; Image generated with Google Gemini and Imagen 3, edited)

The (non-Gastronomic) Glitch

For reasons I haven’t been able to figure out, Suno seemed to have trouble with the vocals on the first line of “Appetite”. The lyric text says, “You called me up, it’s 2 AM.” But when I played the first V4 remastered track, it sounded more like:

Yuh livvy NAMI up, it’s 2 AM

Maybe this was like the Yanny/Laurel thing from a few years ago, where people perceived the same recording one of two ways due to some weirdness in the audio frequencies (some Internet sleuthing revealed that the recording was supposed to say “Laurel”, which is what I hear even though I see the dress as white and gold). I tried redoing the V4 Remaster to see if that would help, but the vocals still sounded glitched.

The Edit Button

Suno also offers some editing tools, but they’re not as fine-grained as I’d like, and they produced some odd results for this particular song. I could get it to say the correct opening lyrics, but then it would inevitably glitch out somewhere else. Most of my attempts to replace the first verse got confused at the guitar break, and either tried to put the words from the second verse over it, or just skipped it entirely. One version sounded OK until about 44 seconds in, when I swear it says, “Double sir, mew Annette’s in the store, There we’s get to comin’ out yo’ sore.” Another one got confused at the same part, but this one sounded like, “Double Serb, center view bad Hal, And she was trotting in the sand, come Sal!” Either way, it felt like a bad day on a Cardassian space station, as if I hadn’t had enough Spoon Head content for today.

For reference, the actual lyric is, “Double servings, no leftover blues, Guess I’m dessert, hope I don’t refuse,” which is only slightly more coherent than the line about a lady looksmaxxing in public.

I’m slightly impressed that both of the glitched versions of that verse still rhymed in the right place, even though the sound was nonsense. If anyone reading this happens to be an expert on speech synthesis, or audio frequency analysis specializing in speech, I’d be very curious to know what’s going on here.

Part of the problem may have been that I originally generated “Appetite” with Suno’s V3 model, and that version of the song already sounded glitched before I remastered it to use V4. I was able to get better results editing a non-remastered V4 song, “Ohio Rizzler”. Suno originally generated the line “Rizz level Elon, you can’t measure my depth” which, um, hits different in 2025. I was able to select and replace both(!) verses that had the line, and change it to “Rizz like a demon”, which was good enough to placate my woke mind virus. The Rivian-friendly version of “Ohio Rizzler” sounds as natural as if the cringe had never happened.

Did I mention that each one of these edits costs you credits? (I didn’t mean to rhyme that time.) I get that it’s expensive to run generative models, but it’s a little off-putting to charge extra to fix something that the AI messed up in a song that was pretty great otherwise. It’s even worse when the edit didn’t fix the original problem, or just glitched even harder.

In the end, I fixed “Appetite” by generating stems, which separate the instruments from the vocals, for the version of the song I liked best. I did the same for one of the clips that pronounced the first line right (but messed up later), and hacked them together in Audacity.
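I did the splice by hand in Audacity, but the underlying idea can be sketched in a few lines of Python. This is only an illustration, not what Audacity does internally: the function names, the plain lists of samples, and the linear crossfade are all assumptions, and it presumes both vocal takes are time-aligned and share a sample rate.

```python
def splice_vocals(intro_take, main_take, cut, fade):
    """Use intro_take before `cut`, main_take afterwards, with a linear crossfade.

    Both takes are lists of float samples (hypothetical, time-aligned stems).
    `cut` is the sample index where the fade starts; `fade` is its length.
    """
    if cut < 0 or fade < 0 or cut + fade > min(len(intro_take), len(main_take)):
        raise ValueError("cut and fade must fit inside both takes")
    spliced = []
    for i, main_sample in enumerate(main_take):
        if i < cut:
            # Keep the take that pronounced the opening line correctly
            spliced.append(intro_take[i])
        elif i < cut + fade:
            # Blend gradually from the intro take into the main take
            t = (i - cut + 1) / (fade + 1)
            spliced.append((1 - t) * intro_take[i] + t * main_sample)
        else:
            # Past the fade: the take that sounded best for the rest of the song
            spliced.append(main_sample)
    return spliced


def mix(vocals, instrumental):
    """Lay the spliced vocal stem back over the instrumental stem."""
    return [v + s for v, s in zip(vocals, instrumental)]


# Toy data: the "good intro" take is all 1.0, the "good rest" take is all 0.0
print(splice_vocals([1.0] * 8, [0.0] * 8, cut=2, fade=3))
# [1.0, 1.0, 0.75, 0.5, 0.25, 0.0, 0.0, 0.0]
```

The crossfade is there so the swap between takes does not land as an audible click; in practice the cut point would go in a gap between sung phrases.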

Introducing Double Feature

While Suno typically gives very general options like genres and a choice between male and female vocals, it also has the ability to create Personas from songs you’ve already generated. This means I could take the settings from “Appetite”, and generate new songs that (allegedly) have a similar sound. Naturally, I created the band Double Feature, and gave them an entire story arc with the songs “Boys Don’t Share” and “Cool Again”, which are implied to be the fallout after the events of “Appetite” and the duo’s eventual reconciliation, respectively.

The Sunk Cost Fallacy

To give an idea of how fast things are moving in the AI space these days, during the course of writing this article, I saw announcements for two AI video generation models: Runway AI’s Gen 4 and Google’s Veo 2. Just for funsies, I snagged a month of the basic Runway AI subscription, and assumed that my Gemini Advanced trial would give me Veo 2.

Let’s just say, I did have access to Google’s AI video generator, but I shouldn’t have. I didn’t see pricing information anywhere (possibly because I’m oblivious, but not necessarily for that reason), and ended up racking up a bill to rival Lobster Sausage, trying to visualize the world of Spoon-Head Sproutlings. By the time I realized how much this was starting to cost, I was stuck with a bunch of random clips of kitchen madness, but not quite enough to do anything interesting with them.

In the end, I managed to cobble together enough footage between Veo 2 and Runway AI’s Gen 4 model to assemble a music video for the Spoon-Head Sproutlings song I generated with Suno:

I don’t want to talk about the budget for this video

Overall, I think Veo 2 was easier to use and had better results, though I would have rather spent the money on Chef Michael Symon’s Pierogis. Runway AI requires the user to provide an image with their prompt. In theory, that should help it stay closer to your artistic vision, assuming the initial image is consistent with that vision. In practice, it’s a bit hit or miss. It did a decent job with the saxophone spoon guy (based on an Imagen 3 generated image from Gemini), but it barely added any movement to the image of rubber ducks in the blender, and was only marginally better with the ones on the stove. I don’t even know what was going on in the scene with the toaster (though part of this could be that Gemini only generates images in a 1:1 aspect ratio, and I had to crop the images to be 16:9 like the rest of the video).
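To put a number on how much that crop throws away, here is a small Python sketch that computes the largest centered 16:9 box inside an image. The function name and the 1024×1024 size are assumptions for illustration (I haven’t confirmed the exact resolution Gemini hands back); the tuple is in the left, upper, right, lower order that Pillow’s Image.crop accepts.

```python
def center_crop_box(width, height, target_w=16, target_h=9):
    """Return (left, top, right, bottom) of the largest centered target-ratio box."""
    if width * target_h > height * target_w:
        # Source is wider than the target ratio: keep full height, trim the sides
        crop_h = height
        crop_w = height * target_w // target_h
    else:
        # Source is taller (or already matches): keep full width, trim top and bottom
        crop_w = width
        crop_h = width * target_h // target_w
    left = (width - crop_w) // 2
    top = (height - crop_h) // 2
    return (left, top, left + crop_w, top + crop_h)


# Hypothetical 1024x1024 square image cropped to 16:9
box = center_crop_box(1024, 1024)
print(box)  # (0, 224, 1024, 800)

kept = (box[2] - box[0]) * (box[3] - box[1]) / (1024 * 1024)
print(kept)  # 0.5625
```

In other words, a square image loses a little under half of its pixels on the way to 16:9, so anything near the top or bottom edge never reaches the video model.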

Takeaways

Needless to say, I did not create the next Skibidi Toilet, and Double Feature still hasn’t had its Gangnam Style moment. (In this case, that would be David from the WVFRM Podcast saying he missed Ellis’s laugh and his stupid hair after one of them got back from vacation.) As Dreamcore Princess™ would say, “My aesthetic [is] being oppressed!”

At the time of writing, Ctrl + Alt + Existential Crisis had two subscribers, barely over 2,000 video views across the channel, and the most popular video had 517 views, with 11 likes, and 1 comment. That’s almost respectable for a brand new channel with zero attempt at marketing, but it’s not the kind of clout I’d want to bring to a haunted hotel when there’s a non-zero chance of Brent Rivera showing up.

Overall, I think Suno has the better product out of everything I’ve tried. With nothing more than some simple proompting, it can spit out some surprisingly catchy songs. I will say that it leans a bit heavily on certain words (seriously, how often do real humans sing about neon?), though I did also notice the word “madness” coming up a lot across everything I generated. The songs I generated also started to have a similar sort of sound to them, even though I tried my best to vary the styles I mentioned in my prompts. I suspect that it has a lot to do with the songs I was liking. Derek Shaw actually gained a follower, and I was surprised to find out that their music style was vastly different to his.

Beyond that, I have to say that video is still a bit of a hot mess. The models I tried would only let me generate up to 8 or 10 seconds of footage per prompt, and the output had visual glitches and poor consistency across different sessions. It’s also resource-intensive to run these video generation models, and that’s reflected in the pricing. From what I’ve seen, I’d say AI video is about as close to shaking up Hollywood as the Spoon-Head Sproutlings are to threatening Skibidi Toilet – there could be something interesting there, but it needs a lot more work.