can you create a content marketing programme using ChatGPT or another LLM to produce content?
our verdict at the time of publication:
no
Since the explosion of large language models, two standout use cases have emerged:
- Coding assistants
- Writing assistants
While I am not qualified to comment on the former (although I did use ChatGPT for some of the coding of this site), I am extremely qualified to comment on the latter, given my background in journalism, technology and content marketing. The ability to produce a seemingly infinite quantity of written content of passable quality is clearly an exciting prospect for companies and agencies undertaking content marketing efforts. However, there are huge issues with an LLM-based approach to content creation, as I will explain here.
Our founding thesis is that ai is an enabler, not a replacement, for human creativity. ai can be a useful tool at many stages during the content creation process, but using ai start-to-finish is short-sighted and potentially harmful. But first, let’s explain how we got here.
november 2022: ChatGPT and the platformification of content creation
At the end of 2022 OpenAI launched ChatGPT to an astonished public. Here was a tool you could converse with like a person: ask a question, get an answer. It became the fastest product in history to reach 100 million users, doing so in just two months; Facebook took four and a half years to reach the same milestone.
People pretty quickly figured out that the quality of the response depended on the prompt. You could set parameters, give suggestions and provide inspiration, all of which would alter the response. Prompt engineering was coined as the term for this art form.
Marketers, agencies and content creators of different stripes soon wondered: what if, instead of asking ChatGPT “what is the height of Mount Everest?”, they asked it to write an entire blog post about the height of Mount Everest? The written word has historically been quite expensive, given the pesky humans needed to write it, and here we have a tool that can do that job convincingly, instantly, and for a trivial cost.
november 2023: the tinkering commences
Perhaps the first widespread example of an LLM-driven content marketing campaign was revealed in November 2023 by Jake Ward, the owner of an SEO agency.
We pulled off an SEO heist that stole 3.6M total traffic from a competitor.
We got 489,509 traffic in October alone.
Here's how we did it: pic.twitter.com/sTJ7xbRjrT
— Jake Ward (@jakezward) November 24, 2023
The tl;dr version is that Jake’s company:
- Peeked at a client’s competitor’s sitemap to get a full list of the articles they had published
- Asked ChatGPT to rewrite every single article
- Posted the rewritten articles on their client’s site (in Jake’s own words, posting 1,800 articles in a few hours)
Since then, more and more agencies have jumped on the generative ai content bandwagon, and there are now hundreds of platforms that promise to write blog posts for you, as well as LinkedIn comments, cold outreach emails, landing page copy, and just about anything else you can imagine. The overwhelming majority of these tools are simply wrappers for ChatGPT or another large language model.
march 2024: Google notices
Core algorithm updates – the chunky, earth-shifting changes that fundamentally alter how Google search works – are few and far between. Most take a long time to arrive (as Google understandably wants to make sure they do not break their core money-making experience), and rarer still are algorithm updates that address specific issues degrading the search experience. I believe I am correct in saying that all of these targeted updates have historically been introduced to clamp down on the latest wheeze from the SEO industry. And in March 2024, it was time to address ai generated content.
Not only did Google issue new guidelines around the use of generative ai in content marketing, it slapped many offending sites with an embarrassing-but-accurate “pure spam” penalty.
Tackling ai spam has clearly become an existential issue for Google, but the effort is hampered by the fact that the whole point of large language models is that they sound a lot like humans, so distinguishing between human-written articles and gen ai filler can be tricky. More on this later, but here are a few telltale signs:
- Generative ai generally won’t issue opinions or render verdicts. It will give both-sides arguments with pros and cons lists.
- There is no narrative structure unless specifically requested, and none of the writing flourishes you would associate with normal written material – no metaphors, no callbacks, no alliteration or plays on words.
- It is unfailingly polite and won’t swear. In fact, there are entire jailbreak communities dedicated to making ChatGPT say things it isn’t supposed to.
Which brings us to today. LLM quality is improving, but we are still a long way away from having high-quality copy available to us with a simple text prompt.
So let’s break down the arguments for and against using ChatGPT or another LLM to generate articles for your website.
the arguments in favour of ai generated website content
low cost
There’s no denying content generation with ai is dirt cheap. To draw some comparisons, when I was writing full-time I maxed out at $0.40 per word (I always benchmarked myself against Carrie Bradshaw’s $4 per word, so in that sense I was a miserable failure). An average article would be 1,400 words, netting me $560.
GPT4, OpenAI’s newest model, is capable of generating an infinite amount of content. I specifically asked it what the limits on daily content generation are, and it told me:
I don’t have a fixed limit on the number of words I can generate for you per day. You can ask as many questions as you like and request as much content as you need. I’m here to help with any topic, big or small, whenever you need assistance. Just keep sending your questions or requests, and I’ll be ready to respond!
So for the sake of comparison, let’s say you generate three 1,400 word blog posts per day. Over the course of a month you would generate 117,000 words (roughly, I’ll let you have a few days off). Taking into account OpenAI’s $20 per month subscription fee, you are paying your “writer” $0.00017 per word.
If Jake generated all the articles for his SEO heist within one month’s billing cycle, assuming a 1,000 word length, he paid a rate of $0.000011 per word, or 0.0028% of the rate I was being paid.
You can’t buy many Manolo Blahniks on a rate like that.
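For the curious, the back-of-the-envelope arithmetic behind those rates looks something like this. It's a quick sketch: the only inputs are the $20 subscription and the word counts mentioned above.

```python
# Back-of-the-envelope cost-per-word comparison (figures taken from the text above).
SUBSCRIPTION_PER_MONTH = 20.00      # ChatGPT Plus, USD
HUMAN_RATE_PER_WORD = 0.40          # my old freelance rate, USD

# Scenario 1: three 1,400-word posts a day, with a few days off (~117,000 words/month)
words_per_month = 117_000
print(SUBSCRIPTION_PER_MONTH / words_per_month)     # ~ $0.00017 per word

# Scenario 2: the SEO heist, 1,800 articles at ~1,000 words each in one billing cycle
heist_words = 1_800 * 1_000
heist_rate = SUBSCRIPTION_PER_MONTH / heist_words
print(heist_rate)                                   # ~ $0.000011 per word
print(f"{heist_rate / HUMAN_RATE_PER_WORD:.4%}")    # ~ 0.0028% of the human rate
```

Even if you quadruple the subscription cost or halve the output, the gap to a human rate stays at several orders of magnitude.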
speed
This one hardly needs any explaining. Between the initial idea, doing my research, writing an outline and messing around with drafts for a while, this article probably took me 6 hours over about 2 weeks to write.
Anything generated by an LLM, whether through its chat interface or API, is delivered near-instantly. LLMs can also generate as many items as you like simultaneously, so you can't even compare them to humans using words per minute or any similar metrics. It's just “as much as you want, instantly.”
the arguments against ai generated website content
quality
Clearly, this is the primary criticism of any copy created by LLMs, and it has been well documented. But let's briefly cover the highlights.
LLMs lie all the time
If I write something inaccurate on this website, it will be called a lie. If it's about a person, it could be called libel or defamation, which can land you in court. But if an ai does it, we call it a “hallucination”, which is a fantastically conceited way to describe a lie.
This is a fundamental issue with how ChatGPT, Gemini, Llama and others are built. They are not databases of information which can be recalled reliably. They are probability engines which return different answers to the same question, and unless you understand how to constrain their functionality they are easy to steer in unexpected directions.
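To make that concrete, here's a minimal sketch using OpenAI's Python SDK (the model name and the question are just placeholders): ask the same question twice with a non-zero temperature and you may well get two different answers, because the model is sampling words, not looking facts up in a database.

```python
# Minimal sketch: the same question, asked twice, can come back with different answers,
# because the model samples its reply token by token rather than retrieving a stored fact.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

for attempt in range(2):
    response = client.chat.completions.create(
        model="gpt-4",  # placeholder model name
        messages=[{"role": "user", "content": "What year was Barack Obama born?"}],
        temperature=1.0,  # higher temperature means more variation between runs
    )
    print(attempt, response.choices[0].message.content)
```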
There have already been countless examples of high-stakes errors, including:
- A chatbot for New Yorkers recommended breaking the law
- Air Canada’s ai offered a customer a discount which didn’t exist
- A Chevrolet dealership’s chatbot sold a car for $1
- Somewhat contradicting what I said above, DPD’s ai chatbot started swearing at customers
Their copy is C+ at best
Inaccuracies can be overlooked in low-stakes scenarios. If I say Barack Obama was born in 1963 you’ll probably think, yeah, that seems right (it was actually 1961). And let’s be honest, most content produced for marketing is pretty low-stakes stuff.
But what cannot be forgiven in those scenarios is bad, boring and unhelpful copy. If your marketing content is there to entertain, it needs to be entertaining. If it's there to help people solve a problem, it needs to be succinct and in the best format.
Those of us in the industry can already recognise ai copy even though it’s only been out in the wild for a little while. It has a vibe. A smell. Always average-length sentences. The sort of friendly-neutral tone that someone uses for the whole evening before they murder you. No narrative progression or flow, just a series of grammatically-correct sentences about a given topic.
Once you understand how these systems were developed it becomes clear why this is the case. OpenAI probably thought a system trained on the entire corpus of human writing would equal the sum total, but in reality it equals the average. Large language models produce new content in the same way that I produce something new a few hours after I’ve eaten a big meal. I’m not the only one using toilet metaphors; Ed Zitron describes LLMs as “a new kind of asshole that can shit more than you would ever believe”.
Enough complaining, how about an example? Years ago I worked with eve Sleep, who ran some wonderful long-copy ads on the London tube. This was my favourite:
So let's see what the leading LLMs can do. This is the prompt I used:
You are writing an advertisement for a poster on the tube in London for a kind of mattress called eve. The features to highlight are:
- They’re made from a new kind of memory foam
- They’re cheaper than other mattresses
- You can only buy them online
- You can save £50 if you use the voucher code NOBRAINER
- It’s delivered in 3 days
I want the overall feeling to be that having this mattress will make you a better person because you can get a good night’s sleep. Use a tone of voice that fits this message. Maximum 125 words.
And here's how each model performed. Note I haven't changed any of the formatting.
GPT4
🌟 Upgrade Your Sleep, Upgrade Your Life! 🌟
Discover the eve mattress – crafted with cutting-edge memory foam for unparalleled comfort. Revolutionize your nights without breaking the bank – eve mattresses are budget-friendly and superior in quality. Exclusively available online, making luxury accessible at your fingertips.
Ready for the best part? Use voucher code NOBRAINER to unlock £50 savings on your order! Plus, with lightning-fast delivery in just 3 days, your dreamy slumber is closer than ever.
Invest in yourself with eve – because when you sleep better, you live better. Say goodbye to restless nights and hello to a brighter, more energized you. Elevate your rest, elevate your life. Try eve today and awaken refreshed, revitalized, and ready to conquer the world.
Side-note: This ad seemed a little long to me so I asked ChatGPT how many words it was, to which it confidently replied 98. Reader, the ad is 126 words.
Gemini 1.5 Pro
Sleep like you give a damn.
Introducing eve, the memory foam mattress that doesn’t cost the earth – or a fortune. Order online, save £50 with code NOBRAINER, and sleep soundly on your new eve mattress in just 3 days.
Because a good night’s sleep makes for a better you.
eve Sleep better. Live better.
Claude 3 Sonnet
Sleep Better, Be Better with eve
Introducing the revolutionary eve mattress, crafted from cutting-edge memory foam that conforms to your body’s curves, cradling you in blissful comfort. Say goodbye to restless nights and wake up refreshed, ready to conquer the day with renewed vigor.
But the best part? eve’s premium quality comes at an unbeatable price, making luxurious sleep accessible to all. Order yours online today and enjoy the convenience of swift delivery within three days.
Unlock the ultimate slumber experience with the NOBRAINER voucher code and save £50. Embrace the transformative power of rejuvenating sleep and unlock your full potential. eve – where dreams are made, and better living begins.
Llama 2
You deserve to feel refreshed and revitalized every morning. With the Eve mattress, you can achieve just that. Made from a new kind of memory foam, this mattress will ensure you have the best possible sleep. Not only is it made with the highest quality materials, but it’s also cheaper than other mattresses on the market. And, for a limited time, use the voucher code NOBRAINER to save £50! But don’t worry, buying online has never been easier. Your mattress will be delivered in just 3 days, giving you plenty of time to get excited about your upcoming slumber. Treat yourself to a good night’s sleep, and become a better version of yourself with the Eve mattress. Order yours today.
Unsurprisingly GPT4 did the best job, but I would say these range from average to horrible, and certainly nowhere near something I would be happy putting out into the world if I had a brand to protect.
There are already hundreds of plugins and GPT layers which promise to “humanise” copy from popular models, but in my testing all they do is take out the thesaurus and replace words with less frequently used words. Which, as any writer will tell you, is not the way to make copy better.
You can also experiment with more and more elaborate prompts to try to squeeze some elegance from these models but again, in my experience so far, this is a dead end. Blogging platform Medium, which has always had more of a focus on quality than other destinations, recently threw in the towel and blanket banned ai content because, as things stand today, there is simply no way to make these systems generate good writing.
ethical and legal problems
The key to a good LLM is training data, and in their quest to create the biggest, most powerful models, companies have found themselves making some questionable ethical decisions. In some cases they scraped the copy from entire websites, which is a legal grey area, and in other cases they committed direct violations of usage terms in their quest for ever more training data.
While the legal dust settles, users are left in limbo: companies with strict compliance teams might be prevented from using generative ai tools or, even worse, be asked to remove content created with generative systems that can't prove a clean bill of health for their training data.
environmental issues
As you might expect from systems that consume monumental amounts of training data and compute, generative ai is power hungry. The best current estimate has OpenAI using half a million kilowatt-hours of electricity daily, which is equivalent to 68,000 extra people taking transatlantic flights every year.
Visual generation is even more costly, with some experiments showing a single ai generated image consumes the same amount of juice as fully charging your phone.
model collapse
One emerging issue with the proliferation of gen ai content is model collapse, a scenario where a model's output is used as its input, creating a snake-eating-its-own-tail feedback loop.
We already know from reporting by the New York Times that OpenAI blew through practically all the written material it could lay its hands on while training GPT4, which includes (because this was before organisations could opt out of scraping) the entire open web.
Increasing the sophistication of these models requires more and more training data, so naturally companies training LLMs will ingest anything they can get their hands on (including this post, presumably).
A study from Originality.ai, which tries to detect ai-generated content, concluded in March 2024 (after Google's spam-squashing update) that 10% of top-25 Google results were written by ai. So let's assume, conservatively, that 25% of new written content online in the last six months is ai generated. That means GPT5 or GPT6 is being trained on a not-insignificant amount of material that was created by its older siblings. And, as previously mentioned, it's fiendishly difficult to distinguish between human and ai produced writing at scale, so there may not be a reliable way to exclude this material from the training data.
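As a purely illustrative toy, and nothing more, here's a sketch of how the synthetic share of a scraped training corpus compounds over time. The 25% share of new content is borrowed from the assumption above; the corpus sizes and growth rate are invented for the example.

```python
# Toy illustration of the model-collapse feedback loop: each year the web grows,
# a fixed share of the new material is ai generated, and the whole lot is scraped
# back into the next model's training corpus. The numbers are made up for effect.
human_words = 1_000_000_000   # hypothetical starting corpus of human writing
synthetic_words = 0
NEW_WORDS_PER_YEAR = 100_000_000
AI_SHARE_OF_NEW = 0.25        # the conservative 25% assumption from above

for year in range(1, 11):
    synthetic_words += NEW_WORDS_PER_YEAR * AI_SHARE_OF_NEW
    human_words += NEW_WORDS_PER_YEAR * (1 - AI_SHARE_OF_NEW)
    share = synthetic_words / (human_words + synthetic_words)
    print(f"year {year}: {share:.1%} of the corpus is synthetic")
```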
It is an absolute certainty that this process has already begun, and will lead to a decline in quality and originality over time. Without significant changes to how these models are developed, it’s entirely feasible that we see a peak in performance before a gradual decline as the quality of training data degrades. We could have seen it already. By using LLMs to generate content en masse, hungry marketers may inadvertently end up rendering them useless.
An organism cannot continue to grow without fresh nutrients, and generative ai models are the same.
marketing performance
And here we come to the most pertinent reason not to use ai to generate content for your website. Most content marketing programmes – especially those being driven by gen ai content – are focusing on driving traffic through organic search. As mentioned, the sheer volume and velocity of this kind of content is an existential issue for Google, and they are not being shy about tackling it.
As you would expect, Google is staying quiet about exactly how it detects ai content, but aside from language analysis there are a few obvious giveaways:
- A website suddenly publishing hundreds or thousands of new pages in a short span, especially pages that contain cookie-cutter ~800 word blog posts
- GPT tics included in the text, such as the now-infamous “As an AI language model” (a toy scan for these is sketched after this list)
- Formatting clues that the material has been mass-published; usually a lack of mixed media, no links to external sources, etc.
- Little or no author information, or a fake author with a gen ai profile picture (generative ai images are much easier to detect than generative ai text)
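To show just how crude the most obvious of these checks is, here's a toy sketch that scans page text for telltale GPT tics. The phrase list is my own guess; nobody outside Google knows what their spam systems actually look for.

```python
# Toy check for obvious GPT tics in page text. The phrases are illustrative only.
TELLTALE_PHRASES = [
    "as an ai language model",
    "i cannot fulfill that request",
    "regenerate response",
]

def obvious_gpt_tics(page_text: str) -> list[str]:
    """Return any telltale phrases found in the page text."""
    lowered = page_text.lower()
    return [phrase for phrase in TELLTALE_PHRASES if phrase in lowered]

print(obvious_gpt_tics("As an AI language model, I cannot recommend a mattress."))
# ['as an ai language model']
```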
It’s not just an issue for long-form writing either – ai consistently underperforms human copy in ads too. Tom Byrne, the CEO of UK agency RocketMill (where I worked for a number of years), told Justin Campbell on the Future Fuzz podcast that gen ai copy “doesn’t have the same ring, the performance metrics prove this time and time again.”
Earlier in this piece I told you that ai writing will always lack the nice storytelling elements that tickle our squishy human brains. Well, here's where I prove this piece wasn't written by ai, because get ready for an absolutely textbook callback. Remember Jake and his SEO heist? He has presumably lost that client, because after some nice short-term traffic gains their organic search visibility went to near-zero under the deluge of spammy ai writing on their domain, undoing years of good, honest marketing work. And this story has been repeated across hundreds of domains in the last few months.
Large language models have infinite uses but content creation, especially at scale, is not one that makes sense. At best, you’re creating a bunch of low-quality ooze that’s of no use to anyone. At worst, you’re blowing up your entire organic search strategy.
ai tools used to produce this content
- GPT-4 for ad examples
- Gemini 1.5 Pro for ad examples
- Llama 2 for ad examples
- Claude 3 for ad examples