Wordlab AI Surpasses Human Writers Across 10 Different Metrics
Cognitive Content Planning & Generation
Published: 02 April 2024
From day one, our mission has been to leverage generative AI for the benefit of individuals and businesses. We have built tools that do an amazing job of helping users plan and generate content. However, over the past few months, we have observed public opinion shifting on AI-generated content. This is partially due to the latest updates from Google, and also because people have become more aware of its usage.
We took this new development positively. This is exactly the direction we were headed in the first place: using AI to generate content that is helpful, insightful, and factually accurate. But it also made us take a step back to rethink and recalibrate our course toward building even more sophisticated AI tools that could be used either alongside humans or in their place. More on that later.
There is little debate about the prowess of AI in generating high-quality content. We have all used tools like ChatGPT, which can write long-form content on virtually any topic. It is now an established fact that modern large language models have the capability to understand a task and generate sophisticated responses.
However, there is strong scepticism from many, including veteran copywriters and marketers, who consider AI-generated content mediocre due to the following shortcomings.
Planning: Unlike humans, large language models do not ponder or plan before writing. They are autoregressive models trained to predict the next token, producing the best response they can assemble on the fly.
Freshness: Every large language model has a knowledge cut-off date beyond which it is unaware of new updates and events. To compensate for this knowledge gap, models often produce fabricated information, popularly known as “hallucination”.
Factual accuracy: Unlike humans, the majority of these AI systems have no access to the internet. This means that, to produce content, they depend on their embedded knowledge and training data and have no way of cross-referencing the latest information.
The human voice: Out of the box, large language models tend to generate content in a flat, robotic voice that lacks a human touch and fails to resonate with the reader.
Humanizing AI Content Generation Process
Existing methods of generating content with AI are flawed. To keep using AI for content generation, we need to address these shortcomings and replace them with a more robust and reliable process. In short, we need a system that mimics humans at every stage, from understanding the topic to research, writing, and beyond.
At Wordlab AI, we have successfully built this framework and named it Cognitive Content Planning and Generation (CCPG).
Cognitive Content Planning & Generation (CCPG)
CCPG is a framework designed to mimic humans’ creative process of planning and writing great content. It follows the same cognitive process that allows humans to plan, reason, and write rich content.
Alongside our research in this direction, we also asked some veteran copywriters and content marketers to share their own process of creating content. We then broke their process down into six primary steps:
1. Topic understanding: Using existing knowledge to understand and brainstorm new ideas.
2. Preliminary research: Reading what’s already out there to get a better sense of the topic.
3. Planning and outlining: Plan the direction and material to use, and outline the structure of the article.
4. In-depth research: Do more focused research based on the outline and specific sections.
5. Writing: Bring ideas and research together to write relevant content.
6. Editing: Review the draft to assess, improve, and make adjustments where needed.
Overall, the above process is a solid plan for writing great content on any topic. However, to map it onto an AI system, we had to make a few adjustments and add additional layers. At a high level, the current CCPG process looks something like this:
1. Receives the initial input: The input could just be the topic title or additional information like the audience, target keywords, language, and more.
2. Asks relevant questions: Based on the inputs and existing knowledge, the model generates a list of relevant questions as part of its brainstorming process.
3. Brainstorming new ideas: Step 2 helps the model expand its knowledge, enabling it to brainstorm new ideas and add further value to the parent topic.
4. Critique Feedback Engine: While the primary model is busy generating new ideas, it’s essential to audit its approach. This is where the critique feedback loop engine comes in. This iterative engine looks for any gap or room for improvement in the suggested approach and gives feedback to improve it.
5. Research: A crucial step divided into two sub-processes. First, the system does shallow research on everything ranking on the top pages of Google, mimicking humans’ preliminary research. Next, it does more in-depth research on specific topics and ideas that are usually not covered in the first pass. Finally, all the gathered information is stored in a knowledge graph for the upcoming steps.
6. Outline: Generates a comprehensive outline of the article based on what the system has seen during its research.
7. Content generation: Writes content for each section and sub-section while cross-referencing the relevant chunks of information fetched from the knowledge graph. This ensures freshness and factual accuracy throughout the article.
8. Editing: After finishing the first draft, the system consults the Critique Feedback Engine once more to enhance readability and optimize the content for SEO, interlinking, and keyword placement.

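The critique feedback loop in step 4 is the most distinctive part of this flow. As a rough illustration of the pattern only (not Wordlab's actual implementation, which is not public), a critique-and-revise loop can be sketched as follows, with trivial rule-based stand-ins where a real system would make model calls:

```python
# Illustrative sketch of an iterative critique-feedback loop.
# The critic and reviser below are toy stand-ins; in a real system
# both would be LLM calls. All names here are hypothetical.

def critique(draft: str, min_words: int = 5) -> list[str]:
    """Return a list of issues found in the draft (empty means approved)."""
    issues = []
    if len(draft.split()) < min_words:
        issues.append("too short")
    if not draft.endswith("."):
        issues.append("missing final punctuation")
    return issues

def revise(draft: str, issues: list[str]) -> str:
    """Apply a trivial fix for each reported issue."""
    if "too short" in issues:
        draft += " More supporting detail goes here"
    if "missing final punctuation" in issues:
        draft += "."
    return draft

def critique_loop(draft: str, max_rounds: int = 3) -> str:
    """Alternate critique and revision until approved or the budget runs out."""
    for _ in range(max_rounds):
        issues = critique(draft)
        if not issues:
            break
        draft = revise(draft, issues)
    return draft

print(critique_loop("AI reshapes hiring"))
```

The key design point is that the critic only reports gaps and the reviser only applies fixes, so the two roles can be improved independently and the loop terminates either on approval or after a fixed revision budget.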
After developing the CCPG framework, we tested it extensively on a number of topics, and every time the results amazed us. They were so good that we could not wait to share them with the rest of the world.
But we decided to go a step further and test it against an actual human copywriter as well as other AI content-generation platforms. We wanted to verify whether this framework could confidently be adopted by anyone, without any reservations whatsoever.
Testing CCPG Framework
Our team is already working on smart AI agents that can carry out potentially any marketing-related task at a human level, so it only made sense to equip one of these agents with the CCPG framework.
The task of the agent was simply to rely on the Cognitive Content Planning and Generation framework for generating content on any given topic for any given target audience, language, or keywords.
Design of the Test
For this exercise, we picked the topic “Understanding the Role of AI in Job Market Disruption: A Comprehensive Analysis”. We chose it so that we could test the relevance and factual accuracy of all the contestants, given how much the job market has changed due to AI in a very short time.
We picked six contestants for this study to write an article on the same topic. We then anonymised the generated articles and asked an independent group of writers and content marketers to rate them on ten different metrics.
Note: None of the articles were edited or enhanced in any way. They were used as is in the study to keep things consistent and fair.
| Article | Generated by |
| --- | --- |
| Article 1 | ChatGPT Plus |
| Article 2 | Byword.ai |
| Article 3 | Wordlab AI with CCPG |
| Article 4 | ChatGPT |
| Article 5 | A human content writer |
| Article 6 | Jasper.ai |
Evaluating Test Scores
To evaluate the quality of each contestant, we asked the group to read each version and rate them independently on ten different metrics. We wanted to evaluate them on metrics that could help the end user judge the effectiveness of the content in the real world.
The metrics we used for evaluation:
- Richness
- Relevance
- Readability
- Domain expertise
- Naturalistic
- Interesting
- Topic Coverage
- Freshness
- Shareability
- Content Length
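To make the scoring concrete, here is a hypothetical sketch of how blinded ratings like these could be aggregated into a per-article, per-metric average. The rater names and scores below are invented purely for illustration and are not the study's actual data:

```python
# Hypothetical aggregation of blinded ratings: each rater scores each
# anonymised article on each metric (say 1-10), and we average per
# (article, metric) pair. All data below is made up for illustration.
from collections import defaultdict
from statistics import mean

# (rater, article, metric, score) tuples
ratings = [
    ("rater_1", "Article 3", "Richness", 9),
    ("rater_2", "Article 3", "Richness", 8),
    ("rater_1", "Article 5", "Richness", 7),
    ("rater_2", "Article 5", "Richness", 6),
]

# Group scores by (article, metric), then take the mean of each group.
scores = defaultdict(list)
for _, article, metric, score in ratings:
    scores[(article, metric)].append(score)

averages = {key: mean(vals) for key, vals in scores.items()}
print(averages[("Article 3", "Richness")])  # 8.5
```

Because the raters never see which system produced which article, averaging their independent scores per metric keeps the comparison blind and consistent.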
1. Richness
For this metric, we wanted people to rate how thoroughly the article covered the topic and whether it provided valuable insights, detailed explanations, and practical examples.

Content generated by Wordlab AI using the CCPG framework surpassed not only the human writer but also all the other AI content-generation platforms. This was an important metric, as it reflects the system’s prowess at understanding the topic, adding value, and filling the article with rich, up-to-date information.
2. Relevance
For this metric, we asked people to rate the content’s relevance to the topic. Here too, Wordlab AI outshone the human writer and all the other AI tools except Byword.ai.

3. Readability
Another important metric for judging the quality of the content. Here, we asked people how easy it was for them to understand the text and follow the structure. Again, Wordlab AI outperformed the human writer.

4. Domain Expertise
For us, this metric is of the highest value, as it shows how well the system used its resources to fetch the required information and curate rich, insightful content. For this metric, we asked people whether the author demonstrated domain expertise in the content.

The Cognitive Content Planning and Generation framework enabled Wordlab AI’s agent to gather all the relevant information and use it to write insightful sub-sections. This secured the top position for this metric, surpassing both the human writer and all competing AI content-generation tools.
5. Naturalistic
To counter one of the growing criticisms of AI-generated content, namely that it sounds robotic, we added a metric to determine how natural the content felt to actual readers.

People gave the highest ratings to the content generated by Wordlab AI with CCPG followed by Byword.ai and the human writer.
Ironically, the AI-generated content was rated as sounding less robotic than the content written by an actual human.
6. Interesting
It’s important for content not to be boring, so that readers stick around and get the most out of it. We made this a metric for judging quality and asked people about their overall feelings toward the article.

In terms of interest, Wordlab AI and Byword.ai were neck and neck, while the rest, including the human-written article, struggled to wow their readers.
7. Topic Coverage
This is perhaps the most important metric after domain expertise. We believe an article should cover every angle so the reader does not feel unsatisfied after reading it. Therefore, we included this as a metric and asked people the question:
After reading the article, do you feel like you have enough information about the topic?

For this metric, we expected the human-written content to win by miles, as humans are generally better at encapsulating broad ideas into well-crafted, end-to-end content.
However, even here, Wordlab’s CCPG-powered content outperformed the human writer and came out on top.
8. Freshness
To counter another popular criticism, namely that AI-generated content is stale, we simply asked people whether the article contained the latest information, stats, or references.

Again, Wordlab AI-generated content beat the human-written content by miles. However, for this metric alone, ChatGPT Plus came out the winner.
9. Shareability
Content sharing is paramount in inbound marketing. It’s one of the key metrics for determining whether content is producing good results. Here, we simply asked people how likely they would be to share the article with their friends.

This was another blow to human-written content, which came fifth from the top, while the top position was again secured by Wordlab’s CCPG content, followed by Byword.ai.
10. Content Length
Finally, we did a small comparison of content length. Given that the topic’s title says “comprehensive analysis”, lengthy content was expected. However, despite several attempts, we could not get lengthy content out of ChatGPT Plus, ChatGPT, and Jasper.ai.

Wordlab’s CCPG framework produced the lengthiest content, followed by the human writer and Byword.ai.
What’s next?
With this study, we tried to demonstrate the power of the Cognitive Content Planning and Generation framework in producing high-quality, rich, factually accurate, and SEO-optimized content.
It will be made available to all paid plans in the near future and will be embedded in our upcoming Projects and Clusters feature, which will allow users to create an entire content plan with SEO insights and generate content in bulk using the CCPG framework.
For any questions or concerns regarding the study, please email us at info[at]wordlab.ai
Get started for free
Sign up for free to experience next-gen content generation and inbound marketing. It’s also a great way to stay updated on the latest features.